The cluster application availability (CAA) subsystem tracks the state of members and resources (such as networks, applications, tape drives, and media changers) in a cluster. CAA monitors the resources that application resources require, and ensures that applications run on members that meet their needs.
This chapter covers the following topics:
When to use CAA (Section 2.1)
Creating resource profiles (Section 2.2)
Writing action scripts (Section 2.3)
Creating user-defined attributes (Section 2.4)
Registering resources (Section 2.5)
Starting application resources (Section 2.6)
Relocating application resources (Section 2.7)
Balancing application resources (Section 2.8)
Stopping application resources (Section 2.9)
Unregistering application resources (Section 2.10)
Displaying CAA status information (Section 2.11)
Using graphical user interfaces to manage CAA (Section 2.12)
Learning CAA: a tutorial (Section 2.13)
Creating highly available applications: examples (Section 2.14)
CAA is designed to work with applications that run on one cluster member at a time. If the cluster member on which an application is running fails, or if a particular required resource fails, CAA relocates or fails over the application to another member that either has the required resources available or on which the required resource can be started.
Multi-instance applications may find it more useful to use a cluster
alias to provide transparent application failover.
Typically,
multi-instance applications achieve high availability to clients by using
the cluster alias as discussed in
Chapter 3.
However,
CAA is useful for multi-instance applications because it allows for
simplified, central management (start and stop) of the applications and
restarting of an instance on failure.
Using CAA gives you the added value of
automatic application startup and shutdown at boot time or at shutdown time,
without having to add additional
rc3
scripts.
See the TruCluster Server
Cluster Administration
manual for a general discussion
of the differences between the cluster alias subsystem and CAA.
Also,
see
Chapter 3
for examples of how to use the default cluster
alias with multi-instance applications for high availability.
2.2 Resource Profiles
A resource profile is a file containing attributes that describe how a resource is started, managed, and monitored by CAA. Profiles designate resource dependencies and determine what happens to an application when it loses access to a resource on which it depends.
There are four resource profile types:
application,
network,
tape, and
changer.
Each resource type has its own kind of resource profile in which
resource attributes are defined.
The examples and tables in the following
sections show each type of resource profile and the attributes available in that
profile.
For detailed descriptions of what defines each resource type,
including a complete list of profile attributes that can be defined, see
Section 2.2.2,
Section 2.2.3,
Section 2.2.4, and
Section 2.2.5.
Some of the attributes that you can specify in a resource profile are:
Resources that are required by the application
(REQUIRED_RESOURCES).
CAA relocates or stops an
application if a
required resource becomes unavailable.
Rules for choosing the member on which to start or
restart the application (PLACEMENT).
A list of members, in order of preference, to favor when
starting
or failing over an application (HOSTING_MEMBERS).
This list is used if the placement policy (PLACEMENT) is
favored
or
restricted.
All resource profiles are located in the clusterwide directory,
/var/cluster/caa/profile.
The file names of
resource profiles take the form
resource_name.cap.
The CAA commands refer to a resource only by its resource name,
resource_name.
There are required and optional profile attributes for each type
of profile.
The optional profile attributes may be left unspecified in the
profile.
Optional profile attributes that have default values are merged at
registration time with the values stored in the template for that type and the
generic template.
Each resource type has a template file that is stored in
/var/cluster/caa/template, named
TYPE_resource_type.cap, with
default values for attributes.
A generic
template file for values that are used in all types of resources is
stored in
/var/cluster/caa/template/TYPE_generic.cap.
The examples in the following sections show the syntax of a resource
profile.
Lines starting with a pound sign (#) are treated as comment lines
and are not processed as part of the resource profile.
A backslash (\) at
the end of a line indicates that the next line is a continuation of the
previous line.
For a more detailed description of profile syntax, see
caa(4).
2.2.1 Creating a Resource Profile
The first step to making an application highly available is to create a resource profile. You can use any of the following methods to do this:
Use the
caa_profile
command
Access SysMan (/usr/sbin/sysman caa).
This method does not support setting scheduled rebalancing or
failback of applications.
Copy an existing resource profile in
/var/cluster/caa/profile
and edit the copy with
emacs,
vi,
or some other text editor
You can combine any of these methods.
For example, you can use the
caa_profile
command to create a resource profile
and then use a
text editor to manually edit the profile.
You can find several example profiles in the
/var/cluster/caa/examples
directory.
After you create a resource profile, you must register it with CAA
before a resource can be managed or monitored.
See
Section 2.5
for a description of how to register an application.
2.2.2 Application Resource Profiles
Table 2-1
lists the application profile
attributes.
For each attribute, the table indicates whether the attribute is
required, its default value, and a description.
Table 2-1: Application Profile Attributes
| Attribute | Required | Default | Description |
TYPE |
Yes | None | The type of the resource.
The type
application
is for application resources. |
NAME |
Yes | None | The name of the resource. The resource name is a string that contains a combination of letters a-z or A-Z, digits 0-9, or the underscore (_) or period (.). The resource name cannot start with a period. |
DESCRIPTION |
No | Name of the resource | A description of the resource. |
FAILURE_THRESHOLD |
No | 0 | The number of failures detected within
FAILURE_INTERVAL
before CAA marks the resource as
unavailable and
no longer monitors it.
If an application's check script fails this
number of times, the application resource is stopped and set offline.
If the
value is zero (0), tracking of failures is disabled.
The maximum value is
20. |
FAILURE_INTERVAL |
No | 0 | The interval, in seconds, during which CAA applies the failure threshold. If the value is zero (0), tracking of failures is disabled. |
REQUIRED_RESOURCES |
No | None | A white-space separated, ordered list of resource names that this resource depends on. Each resource to be used as a required resource in this profile must be registered with CAA or profile registration will fail. For a more detailed explanation, see Section 2.2.2.1. |
OPTIONAL_RESOURCES |
No | None | A white-space separated, ordered list of optional resources that this resource uses during placement decisions. Up to 58 optional resources can be listed. For a more complete explanation, see Section 2.2.2.3. |
PLACEMENT |
No | balanced |
The placement policy
(balanced,
favored, or
restricted) specifies how CAA chooses
the cluster member on which to start the resource. |
HOSTING_MEMBERS |
Sometimes | None | An ordered, white-space separated list
of cluster members that can host the resource.
This attribute is required
only if
PLACEMENT
equals
favored
or
restricted.
This attribute must be empty if
PLACEMENT
equals
balanced. |
RESTART_ATTEMPTS |
No | 1 | The number of times CAA will attempt to restart the resource on a single cluster member before attempting to relocate the application. A value of 1 means that CAA will only attempt to restart the application once on a member. A second failure will cause an attempt to relocate the application. |
FAILOVER_DELAY |
No | 0 | The amount of time, in seconds, CAA will wait before attempting to restart or fail over the resource. |
AUTO_START |
No | 0 | A flag to indicate whether CAA should automatically start the resource after a cluster reboot, regardless of whether the resource was running prior to the cluster reboot. When set to 0, CAA starts the application resource only if it had been running before the reboot. When set to 1, CAA always starts the application after a reboot. |
ACTION_SCRIPT |
Yes | None | The resource-specific script for
starting, stopping, and checking a resource.
You may specify a full path for
the action script file; otherwise, the path
/var/cluster/caa/script
is assumed.
You may also
specify a relative path with this default path as the starting point. |
ACTIVE_PLACEMENT |
No | 0 | When set to 1, CAA will reevaluate the placement of an application on addition or restart of a cluster member. |
SCRIPT_TIMEOUT |
No | 60 | The maximum time, in seconds, that an action script may take to complete execution before an error is returned. |
CHECK_INTERVAL |
No | 60 | The time interval, in seconds, between repeated executions of the check entry point of the resource's action script. |
REBALANCE |
No | None | A time at which the application will be
automatically
reevaluated for optimal placement.
The field must be specified in the
form
t:day:hour:min, where
day
is the day of the week (0-6),
hour
is the hour of the day (0-23), and
min
is the minute of the hour (0-59) when the reevaluation occurs.
An
asterisk may be used as a wildcard to specify every day or every hour.
|
The following example creates an application resource with CAA
using
caa_profile:
# /usr/sbin/caa_profile -create clock -t application -B /usr/bin/X11/xclock \
-d "Clock Application" -r network1 -l application2 \
-a clock.scr -o ci=5,ft=2,fi=12,ra=2,bt=*:12:00
The contents of the resource profile file that was created by the previous example are as follows:
NAME=clock
TYPE=application
ACTION_SCRIPT=clock.scr
ACTIVE_PLACEMENT=0
AUTO_START=0
CHECK_INTERVAL=5
DESCRIPTION=Clock Application
FAILOVER_DELAY=0
FAILURE_INTERVAL=12
FAILURE_THRESHOLD=2
REBALANCE=t:*:12:00
HOSTING_MEMBERS=
OPTIONAL_RESOURCES=application2
PLACEMENT=balanced
REQUIRED_RESOURCES=network1
RESTART_ATTEMPTS=2
SCRIPT_TIMEOUT=60
For more information on the application resource profile syntax,
see
caa_profile(8) and
caa(4).
2.2.2.1 Required Resources
CAA uses the required resources list, in conjunction with the
placement
policy and hosting members list, to determine which cluster members are
eligible to host the application resource.
Required resources must be
ONLINE
on any member on which the application is
running or started.
Only
application resources can have required resources, but any type of
resource can be defined as a required resource for an application
resource.
A failure of a required resource on the hosting member causes CAA
to initiate failover of the application or to attempt to restart it on the
current member if
RESTART_ATTEMPTS
is not 0.
This can cause
CAA to fail the application resource over to another member which provides
the required resources, or to stop the application if there is no suitable
member.
In the latter case, CAA continues to monitor the required resources and
restarts
the application when the resource is again available on a suitable
cluster member.
Required resources lists can also be useful to start, stop, and
relocate a group of interdependent application resources when the
caa_start,
caa_stop, or
caa_relocate
commands are run with the force (-f) option.
2.2.2.2 Application Resource Placement Policies
The placement policy specifies how CAA selects a cluster member on which to start a resource, and where to relocate the resource after a failure.
Note
Only cluster members that have all the required resources available (as listed in an application resource's profile) are eligible to be considered in any placement decision involving that application.
The following placement policies are supported:
balanced
CAA favors starting
or restarting the application resource on the member currently running the
fewest application resources.
Placement based on optional resources is
considered
first.
See
Section 2.2.2.3.
Next, the host with the
fewest
application resources running is chosen.
If no cluster member is favored
by these criteria, any available member is chosen.
favored
CAA refers to the
list of members in the
HOSTING_MEMBERS
attribute of the
resource
profile.
Only cluster members that are in this list and satisfy the
required
resources are eligible for placement consideration.
Placement due to
optional
resources is considered first.
See
Section 2.2.2.3.
If no member can be chosen based on optional resources, the order of the
hosting
members decides which member will run the application resource.
If none
of the members in the hosting member list are available, CAA favors placing
the application resource on any available member.
This member may or may not
be included in the
HOSTING_MEMBERS
list.
restricted
Like
favored
except that, if none of the members on the
hosting list are available,
CAA will not start or restart the application resource.
A
restricted
placement policy ensures that the
resource will never run on a
member that is not on the list, even if you manually relocate it to that
member.
You must specify hosting members in the
HOSTING_MEMBERS
attribute to use a
favored
or
restricted
placement policy.
You must not specify hosting members in the
HOSTING_MEMBERS
attribute with a
balanced
placement
policy, or else the resource will not validate and cannot be
registered.
If
ACTIVE_PLACEMENT
is set to 1, the placement
of the application resource is reevaluated whenever a cluster member is
either added to the cluster or the cluster member restarts.
This allows
applications
to be relocated to a preferred member of a cluster after the member
recovers from a failure.
To have an application relocate to a preferred member at a time
other than when the cluster member rejoins the cluster, use the
REBALANCE
attribute to specify a time at which
placement can be reevaluated.
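For example, a profile fragment for a hypothetical application that favors two members (the member names are illustrative) and reevaluates placement every day at 3:30 might contain:

```
PLACEMENT=favored
HOSTING_MEMBERS=member1 member2
REBALANCE=t:*:03:30
```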
2.2.2.3 Optional Resources in Placement Decisions
CAA uses optional resources to choose a hosting member based on
the number of optional resources that are in the
ONLINE
state on each hosting member.
If each member has an equal number of optional
resources in the
ONLINE
state, CAA considers the order of
the optional resources as follows:
CAA compares the state of the optional resources on each member
starting
at the first resource and proceeding successively through the list.
For
each consecutive resource in the list, if the resource is
ONLINE
on one member, any member that does not have the resource
ONLINE
is removed from consideration.
Each resource
on the list is evaluated
in this manner until only one member is available to host the resource.
The maximum number of optional resources is 58.
If this algorithm results in multiple favored members, the
application
is placed on one of these members chosen according to its placement
policy.
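The ordered elimination over optional resources can be sketched in shell. The member names and ONLINE states below are hypothetical; each flags_member string holds one character per optional resource, in profile order ("1" means ONLINE on that member):

```shell
# Sketch of the ordered tie-break over optional resources.
members="memberA memberB memberC"
flags_memberA="110"
flags_memberB="101"
flags_memberC="100"
nres=3                                  # number of optional resources

candidates="$members"
i=1
while [ $i -le $nres ] && [ `echo $candidates | wc -w` -gt 1 ]; do
    any=0                               # is resource i ONLINE on any candidate?
    for m in $candidates; do
        eval f=\$flags_$m
        [ "`echo $f | cut -c$i`" = "1" ] && any=1
    done
    if [ $any -eq 1 ]; then
        # remove members that do not have resource i ONLINE
        keep=""
        for m in $candidates; do
            eval f=\$flags_$m
            [ "`echo $f | cut -c$i`" = "1" ] && keep="$keep $m"
        done
        candidates="$keep"
    fi
    i=$(($i + 1))
done
echo "placement candidates:$candidates"
```

In this example, all three members have the first optional resource ONLINE, so none is eliminated; only memberA has the second ONLINE, so it becomes the single remaining candidate.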
2.2.3 Network Resource Profiles
Table 2-2
describes the network profile
attributes.
For each attribute, the table indicates whether the attribute is
required,
its default value, and a description.
Table 2-2: Network Profile Attributes
| Attributes | Required | Default | Description |
TYPE |
Yes | None | The type of the resource.
The type
network
is for network resources. |
NAME |
Yes | None | The name of the resource. The resource name is a string that contains a combination of letters a-z or A-Z, digits 0-9, or the underscore (_) or period (.). The resource name cannot start with a period. |
DESCRIPTION |
No | None | A description of the resource. |
SUBNET |
Yes | None | The subnet address of the network
resource in
nnn.nnn.nnn.nnn
format (for example,
16.140.112.0).
The
SUBNET
value is the bitwise
AND
of the IP address and the netmask.
If you consider an IP address of
16.69.225.12 and a netmask of 255.255.255.0, then the subnet will be
16.69.225.0. |
FAILURE_THRESHOLD |
No | 0 | The number of failures detected within
FAILURE_INTERVAL
before CAA marks the resource as
unavailable and
no longer monitors it.
If an application's check script fails this
number
of times, the application resource is stopped and set offline.
If the
value
is zero (0), tracking of failures is disabled.
The maximum value is
20. |
FAILURE_INTERVAL |
No | 0 | The interval, in seconds, during which CAA applies the failure threshold. If the value is zero (0), tracking of failures is disabled. |
The following example creates a network resource profile:
# /usr/sbin/caa_profile -create network1 -t network -s "16.69.244.0" \
-d "Network1"
The contents of the profile in file
/var/cluster/caa/profile/network1.cap
created by the
preceding command are as follows:
NAME=network1
TYPE=network
DESCRIPTION=Network1
FAILURE_INTERVAL=0
FAILURE_THRESHOLD=0
SUBNET=16.69.244.0
For more information on the network resource profile syntax, see
caa_profile(8) and
caa(4).
Through routing, all members in a cluster can indirectly access any network that is attached to any member. Nevertheless, an application may require the improved performance that comes by running on a member with direct connectivity to a network. For that reason, an application resource may define an optional or required dependency on a network resource. CAA optimizes the placement of that application resource based on the location of the network resource.
When you make a network resource an optional resource
(OPTIONAL_RESOURCES) for an application, the application
may start on a member that
is directly connected to the subnet, depending on the required
resources,
placement policy, and cluster state.
If the network adapter fails, the
application
may still access the subnet remotely through routing.
If you specify a network resource as a required resource
(REQUIRED_RESOURCES) and the network adapter fails, CAA
relocates
or stops the application.
If the network fails on all eligible hosting
members, CAA will stop the application.
2.2.4 Tape Resource Profiles
Table 2-3
describes the tape profile
attributes.
For each attribute, the table indicates whether the attribute is
required, its default value, and a description.
Table 2-3: Tape Profile Attributes
| Attributes | Required | Default | Description |
TYPE |
Yes | None | The type of the resource.
The type
tape
is for tape resources. |
NAME |
Yes | None | The name of the resource. The resource name is a string that contains a combination of letters a-z or A-Z, digits 0-9, or the underscore (_) or period (.). The resource name may not start with a period. |
DESCRIPTION |
No | None | A description of the resource. |
DEVICE_NAME |
Yes | None | The device name of the tape resource.
Use the full path to the device special file (for example,
/dev/tape/tape1). |
FAILURE_THRESHOLD |
No | 0 | The number of failures detected within
FAILURE_INTERVAL
before CAA marks the resource as
unavailable and no longer monitors it.
If an application's check script fails
this number of times, the application resource is stopped and set offline.
If the value
is zero (0), tracking of failures is disabled.
The maximum value is
20. |
FAILURE_INTERVAL |
No | 0 | The interval, in seconds, during which CAA applies the failure threshold. If the value is zero (0), tracking of failures is disabled. |
Through the device request dispatcher, all cluster members can indirectly access any tape device that is attached to any cluster member. Nevertheless, an application may require the improved performance that comes from running on a member with direct connectivity to the tape device. For that reason, an application resource may define an optional or required dependency on a tape resource. CAA optimizes the placement of that application based on the location of the tape resource.
The following example creates a tape resource profile. After a tape resource has been defined in a resource profile, an application resource profile can designate it as a required or optional resource.
# /usr/sbin/caa_profile -create tape1 -t tape -n /dev/tape/tape1 -d "Tape Drive"
The contents of the profile that was created in the file
/var/cluster/caa/profile/tape1.cap
by the preceding
command are as follows:
NAME=tape1
TYPE=tape
DESCRIPTION=Tape Drive
DEVICE_NAME=/dev/tape/tape1
FAILURE_INTERVAL=0
FAILURE_THRESHOLD=0
2.2.5 Media Changer Resource Profiles
Media changer devices are similar to tape devices, but have access to multiple tape cartridges.
Table 2-4
describes the media changer
profile attributes.
For each attribute, the table indicates whether the
attribute is required, its default value, and a description.
Table 2-4: Media Changer Attributes
| Attributes | Required | Default | Description |
TYPE |
Yes | None | The type of the resource.
The type
changer
is for media changer resources. |
NAME |
Yes | None | The name of the resource. The resource name is a string that contains a combination of letters a-z or A-Z, digits 0-9, or the underscore (_) or period (.). The resource name may not start with a period. |
DESCRIPTION |
No | None | A description of the resource. |
DEVICE_NAME |
Yes | None | The device name of the media changer
resource.
Use the full path to the device special file (for example,
/dev/changer/mc1). |
FAILURE_THRESHOLD |
No | 0 | The number of failures detected within
FAILURE_INTERVAL
before CAA marks the resource as
unavailable and no longer monitors it.
If an application's check script
fails this number
of times, the application resource is stopped and set offline.
If the
value is zero (0), tracking of failures is disabled.
The maximum value is
20. |
FAILURE_INTERVAL |
No | 0 | The interval, in seconds, during which CAA applies the failure threshold. If the value is zero (0), tracking of failures is disabled. |
Through the device request dispatcher, all cluster members can indirectly access any media changer that is attached to any member. Nevertheless, an application may require the improved performance that comes from running on a member with direct connectivity to the media changer. For that reason, an application resource may define an optional or required dependency on a media changer resource. CAA optimizes the placement of that application based on the location of the media changer resource.
The following example creates a media changer resource profile. After a media changer resource has been defined in a resource profile, an application resource profile can designate it as a dependency.
# /usr/sbin/caa_profile -create mchanger1 -t changer -n /dev/changer/mc1 \
-d "Media Changer Drive"
The contents of the profile that was created in the file
/var/cluster/caa/profile/mchanger1.cap
by the preceding
command are as follows:
NAME=mchanger1
TYPE=changer
DESCRIPTION=Media Changer Drive
DEVICE_NAME=/dev/changer/mc1
FAILURE_INTERVAL=0
FAILURE_THRESHOLD=0
Resource profiles can be checked for correct syntax before
registration.
A profile that does not pass validation cannot be
registered.
The
caa_profile
command can be used as follows to
check that the profile has been created correctly:
# /usr/sbin/caa_profile -validate resource
If there are any problems with the resource, an appropriate
message telling you which attributes are incorrect will be displayed.
2.3 Writing Action Scripts
Action scripts are necessary for application resources to start, stop, and relocate an application that is managed and monitored by CAA.
You use action scripts to specify the following:
How to start an application.
CAA calls the
start
entry point of the action
script to start or restart the application resource.
The start entry point
executes
all commands that are necessary to start the application and must return
0 (zero) for success and a nonzero value for failure.
How to stop an application and what cleanup occurs before the application is failed over.
CAA calls the
stop
entry point of the action
script
to stop a running application resource.
It is not called when stopping
an application resource in state
UNKNOWN.
(See
caa_stop(8).)
How to determine whether an application is still running.
CAA calls the
check
entry point of the action
script
to verify that an application resource is running.
The check entry point
executes
every
CHECK_INTERVAL
seconds and must return 0 (zero)
for success and a nonzero value for failure.
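The three entry points can be sketched as follows. This is a minimal, hypothetical skeleton, not the generated template; "sleep 300" stands in for a real daemon, and the PID-file path is illustrative:

```shell
#!/bin/sh
# Minimal action-script sketch; CAA invokes it as: script {start|stop|check}.
DAEMON="sleep 300"                 # assumption: replace with the real daemon
PIDFILE=/tmp/myapp.$$.pid

app_start() {
    $DAEMON &                      # background the daemon so start returns
    echo $! > "$PIDFILE"
    return 0                       # 0 tells CAA the start succeeded
}

app_stop() {
    [ -f "$PIDFILE" ] && kill `cat "$PIDFILE"` 2>/dev/null
    rm -f "$PIDFILE"
    return 0                       # 0 even if the daemon was not running
}

app_check() {
    # succeeds only if the recorded process still exists
    [ -f "$PIDFILE" ] && kill -0 `cat "$PIDFILE"` 2>/dev/null
}

case "${1:-}" in
start) app_start ;;
stop)  app_stop ;;
check) app_check ;;
esac
```

Note that the stop entry point returns 0 even when the daemon is already gone, matching the guideline in Section 2.3.1.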
Action scripts are located by default in the clusterwide
/var/cluster/caa/script
directory.
The file names
of action scripts take the form
name.scr.
The easiest way to create an action script is to have the
caa_profile
command automatically create one for you
when you create
the resource profile.
Do this by using the
-B
option.
For example:
# caa_profile -create resource_name -t application -B application_path
Use the
-B
option in the
caa_profile
command to specify the full pathname of an application executable; for
example,
/usr/local/bin/httpd.
When you use the
-B
option,
the
caa_profile
command creates an action script
named
/var/cluster/caa/script/resource_name.scr.
To specify a different action script name, use the
-a
option.
Depending on the application, you might need to edit the action
script
to correctly set up the environment for the application.
For example,
for an X application like
xclock, you need to set the
DISPLAY
environment variable on the command line
in the action script
as appropriate for the current shell.
It might look something like:
DISPLAY=`hostname`:0
export DISPLAY
Because an action script is required for an application resource,
when
you use the
caa_profile -create
command to create an
application
resource profile, one of the following conditions must be true:
You must specify the
caa_profile
option
-B
application_executable_pathname,
so that an action script is automatically created.
You may also specify
the name
of the action script that is created with the
-a
option.
You must have already created an executable action script
in the default directory,
/var/cluster/caa/script/.
The
root of the script name must be the same as the name of the resource
you create.
For example, if the action script is named
/var/cluster/caa/script/up-app-1.scr, then the
resource name must
be
up-app-1.
Therefore, if you use the
caa_profile
command to create the resource profile,
the command line starts as follows:
# caa_profile -create up-app-1 -t application
You must have already created an executable action script,
and you must use the
caa_profile
option
-a
action_script_pathname
to inform CAA where to find the action script.
For example:
-a /usr/users/smith/caa/scripts/app.scr
Caution
For security reasons, make sure that action scripts are writable only by root.
2.3.1 Guidelines for Writing Application Resource Action Scripts
When writing an action script for an application resource, note the following:
CAA relies on the exit code from the action script to
set the application state to
ONLINE
or
OFFLINE.
Each entry point in the action script must return an exit code of 0 to
reflect success or a nonzero exit code to specify failure.
CAA sets the application state to
UNKNOWN
if an action script's
stop
entry point fails to exit
within
the number of seconds in the
SCRIPT_TIMEOUT
value,
or returns with a nonzero value.
This may happen during a start attempt, a
relocation, or a stop attempt.
Be sure that the action script
stop
entry point exits with a 0 value if the application is successfully
stopped or if it is not running.
When a daemon is started, it usually starts as a background process. For an application that does not put itself into the background immediately upon startup, start the application in the background by adding an ampersand (&) to the end of the line that starts the application. An application started in this way will always return success on a start attempt. This means that the default scripts will have no way of detecting failure due to a trivial reason, such as a misspelled command path. When using such commands, we recommend that you execute the commands used in the script interactively to rule out syntax and other trivial errors before using the script with CAA.
For any X-windows applications that you may be running under CAA, you must also consider the following:
For a graphical application that is served by the
cluster
and monitored by CAA, you must set the
DISPLAY
environment
variable of the client system in the action script.
For example:
export DISPLAY=everest:0.0
/usr/bin/my_application &
On the client system, add the default cluster alias to the list of allowed X server connections. For example:
everest#> xhost +my_cluster
CAA scripts generated by
caa_profile
or SysMan do not set the
PATH
environment
variable.
When the scripts are executed, the
PATH
is set to a
default
value of
/sbin:/usr/sbin:/usr/bin.
Therefore, you
must explicitly specify most path names that are used in scripts, or you
must modify
the resulting scripts to explicitly set the
PATH.
Action
scripts that were automatically generated with previous releases may
have a
PATH
that includes the current directory
(.).
Because this situation may be a potential security
issue, modify
these scripts to remove the current directory from the path.
The action script template is located in
/var/cluster/caa/template/template.scr.
It is the
basis for action scripts that are created by the
caa_profile
command, and it is a good example of
the elements of an action script.
The following action scripts for application resources can be used
as examples and are found in the
/var/cluster/caa/script
directory:
cluster_lockd.scr
dhcp.scr
named.scr
autofs.scr
The scripts shown in
Section 2.14
are also
good
examples of action scripts.
These example scripts and others can be
found
in the
/var/cluster/caa/examples
directory.
There
are examples of several applications that are commonly administered using
CAA.
The script
sysres_templ.scr
that is located in
this directory is an example script that contains extra system performance
related
code that can be used to examine the system load, swap space usage, and
disk space available.
If you incorporate these features in your scripts, set
the values for variables that are associated with these features
appropriately for your system.
2.3.3 Accessing Environment Variables
An action script can access a number of environment variables, allowing you to tailor your scripts to be responsive to the variables.
The variables that are accessible to an action script executed in the CAA environment include:
Profile attributes
Reason codes
Locale information
User-defined attributes
The CAA defined resource profile attributes can be accessed as an
environment
variable in any action script by prefixing
_CAA_
to
the attribute name.
For example, the
AUTO_START
value is
obtained using
_CAA_AUTO_START
in the script.
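A script can branch on such a value; in this sketch, the shell function and its default are illustrative, so the snippet also runs outside the CAA environment:

```shell
# Sketch: reading a CAA-supplied profile attribute in an action script.
# CAA exports each profile attribute with a _CAA_ prefix.
describe_auto_start() {
    if [ "${_CAA_AUTO_START:-0}" -eq 1 ]; then
        echo "always started after a cluster reboot"
    else
        echo "started only if running before the reboot"
    fi
}

describe_auto_start
```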
Reason codes
describe the reason that an action script was executed.
The environment
variable
_CAA_REASON
can have one of the following
reason code values:
user: Action script was invoked
due to a user-initiated command, such as
caa_start,
caa_stop, or
caa_relocate.
failure: Action script was executed because of a failure condition. A typical condition that sets this value is a check script failure.
dependency: Action script was invoked as a dependency of another resource that has had a failure condition.
boot: Action script was invoked as a result of an initial cluster boot (resource was running in a prior system invocation).
autostart: Resource is being
autostarted.
If the
AUTO_START
profile attribute is
set to 1, autostart occurs at cluster boot time if the resource was
previously offline on the cluster before the last shutdown.
system: Action script was initiated by the system due to normal maintenance; for example, the check script initiates a relocation.
unknown: Internally unknown state when the script was invoked. If this value occurs, record the state of the cluster and application and contact your support representative.
The locale of the environment where a CAA command invokes an
action script is available to the action script in the
_CAA_CLIENT_LOCALE
environment variable.
This
variable contains the following locale
information in a string value separated by spaces:
LC_ALL,
LC_CTYPE,
LC_MONETARY,
LC_NUMERIC,
LC_TIME,
LC_MESSAGES.
The action script can
use this information, if desired, to set the locale in the action script
environment.
See
setlocale(3) and
locale(1) for more information.
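For example, a script can split the locale string into its six components and export them. This is a sketch only; the sample value below stands in for the string that CAA would place in the variable:

```shell
# Sketch: _CAA_CLIENT_LOCALE holds six space-separated values in the
# order LC_ALL LC_CTYPE LC_MONETARY LC_NUMERIC LC_TIME LC_MESSAGES.
# The sample default below is a placeholder for the CAA-supplied string.
_CAA_CLIENT_LOCALE=${_CAA_CLIENT_LOCALE:-"C C C C C C"}

# Split on whitespace into the positional parameters, then export.
set -- $_CAA_CLIENT_LOCALE
LC_ALL=$1 LC_CTYPE=$2 LC_MONETARY=$3 LC_NUMERIC=$4 LC_TIME=$5 LC_MESSAGES=$6
export LC_ALL LC_CTYPE LC_MONETARY LC_NUMERIC LC_TIME LC_MESSAGES
echo "LC_ALL=$LC_ALL LC_MESSAGES=$LC_MESSAGES"
```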
An action script might use a code snippet similar to the following to make use of reason codes:
if [ "$_CAA_REASON" = "user" ]; then
echo "Action invoked by User"
.
.
.
fi
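A script that needs to distinguish all of the reason codes can extend this to a case statement. This is an illustrative sketch: the handler messages are placeholders, and the default value simulates a user-initiated invocation, which CAA would otherwise supply:

```shell
# Illustrative sketch: dispatch on the CAA-supplied reason code.
# _CAA_REASON is set by CAA; the default simulates a user-initiated run.
_CAA_REASON=${_CAA_REASON:-user}

case "$_CAA_REASON" in
user)           MSG="invoked by caa_start, caa_stop, or caa_relocate" ;;
failure)        MSG="invoked because of a failure condition" ;;
dependency)     MSG="invoked for a required resource's failure" ;;
boot|autostart) MSG="invoked at cluster boot time" ;;
system)         MSG="invoked by normal system maintenance" ;;
*)              MSG="unknown reason; record cluster and application state" ;;
esac
echo "Action script $MSG"
```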
2.3.4 Directing Output from Action Scripts
You can redirect
output from an action script so that it is displayed when
caa_start,
caa_stop, or
caa_relocate
are executed.
Each line of output can optionally have a prefix
consisting of the cluster member and resource name.
Default operation is for output not to be redirected.
To enable action script output redirection in CAA, you must set
the environment variable
_CAA_UI_FMT
for the environment
in which you are executing
caa_start,
caa_stop,
or
caa_relocate
to either
v
or
vs, such as:
# export _CAA_UI_FMT=v
# caa_start db_2
...
nodex:db_2:output text ...
nodex:db_2:output text ...
nodex:db_2:output text ...
nodex:db_2:output text ...
Use of the modifier
s
suppresses the
node:resource prefix to the output.
For example:
# export _CAA_UI_FMT=vs
# caa_start db_2
output text ...
output text ...
2.4 Creating User-Defined Attributes
The format of application resource profiles can be extended with user-defined attributes. These user-defined attributes can be accessed within the resource action script as environment variables and apply to all application resources.
A user-defined attribute first must be defined in the application
resource type definition file located at
/var/cluster/caa/template/application.tdf.
The
values that must be defined are as follows.
attribute: Defines the name of the attribute for which a user can specify a value. This attribute translates to an environment variable accessible in all application resource action scripts.
type: Defines the type of values
that are allowed for this attribute.
Types include:
boolean,
string,
name_list,
name_string,
positive_integer,
internet_address,
file.
switch: Defines the switch used with the
caa_profile
command to
specify a profile value.
default: Defines the default value for this attribute, if it is not specified in the profile.
required: Defines whether the switch must be specified in a profile.
A user-defined attribute can be specified on the command line of
caa_start,
caa_relocate, or
caa_stop
as well as in a profile.
The value
specified on the command line
overrides any value specified in a profile.
For more information, see
caa_start(8),
caa_relocate(8), and
caa_stop(8).
Any line in a type definition file that begins with a
#
is considered a comment.
An example entry in the type definition file is as follows:
#!==========================
attribute: USR_DEBUG
type: boolean
switch: -o d
default: 0
required: no
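An action script could then consult this attribute through the corresponding _CAA_ environment variable. In this sketch the default value simulates what CAA would pass from the profile or command line:

```shell
# Sketch: the user-defined USR_DEBUG attribute from the entry above is
# visible to action scripts as _CAA_USR_DEBUG. The default stands in
# for the value CAA would pass from the profile or the command line.
_CAA_USR_DEBUG=${_CAA_USR_DEBUG:-0}

if [ "$_CAA_USR_DEBUG" = "1" ]; then
    set -x    # trace commands when the debug attribute is enabled
fi
echo "debug=$_CAA_USR_DEBUG"
```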
2.5 Registering Resources
Each resource must have a profile, and each resource must be
registered with CAA before CAA can manage it.
Use the
caa_register
command to register your resources.
For example, to register the
clock
application, enter the following command:
# /usr/sbin/caa_register clock
After a resource is registered, the information in the profile is
stored
in the binary CAA registry.
If the profile is modified, you must update
the database with
caa_register
-u.
See
caa_register(8) for more information.
2.6 Starting Application Resources
To start an application that is registered with CAA, use the
caa_start
command.
The name of the application resource
may or may not be the same as the name of the application.
For example:
# /usr/sbin/caa_start clock
The following text is an example of the command output:
Attempting to start `clock` on member `polishham`
Start of `clock` on member `polishham` succeeded.
The application is now running on the system named
polishham.
The command will wait up to the
SCRIPT_TIMEOUT
value to receive notification of success or failure from the action script
each time that the action script is called.
Application resources can be started and non-application resources
can be restarted if they have stopped due to exceeding their failure
threshold
values.
(See the
Cluster Administration
manual for more information on
restarting non-application resources.) You must register a resource
(caa_register) before you can start it.
Note
Always use
caa_start
and
caa_stop, or the equivalent SysMan feature, to start and stop resources. Do not start or stop the applications manually at the command line or by executing the action scripts. Manual starts or stops outside of CAA will cause resource status to be incorrect.
If you try to start a resource that has required resources that
are
ONLINE
on another cluster member, the start
will fail.
All required
resources must either be
OFFLINE
or
ONLINE
on the member where the resource will be started.
If you use the command
caa_start
-f
resource_name
on a resource that has required
resources
that are
OFFLINE, the resource starts and all
required
resources that are not currently
ONLINE
start as
well.
Executing the
caa_start
command on an
application resource actually only sets the resource target value to
ONLINE.
The target value specifies the state that CAA attempts to set for the
resource.
CAA attempts to change the state to match the target and attempts
to start the application by running the action script start entry point.
When an application is running, both the target state and current state are
ONLINE.
The
Cluster Administration
manual has a more detailed
description of how target and state fields describe resources.
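One way for a script to examine a resource's target and state is to parse the caa_stat output. This sketch embeds sample output so that it can run without a cluster; on a live member you would capture the output of /usr/bin/caa_stat instead:

```shell
# Sketch: recover the TARGET and STATE fields from `caa_stat resource`
# output. OUT holds sample output here; on a cluster you would use:
#   OUT=`/usr/bin/caa_stat clock`
OUT="NAME=clock
TYPE=application
TARGET=ONLINE
STATE=ONLINE on provolone"

TARGET=`echo "$OUT" | sed -n 's/^TARGET=\([A-Z]*\).*/\1/p'`
STATE=`echo "$OUT" | sed -n 's/^STATE=\([A-Z]*\).*/\1/p'`
echo "target=$TARGET state=$STATE"
```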
Note
When attempting to start an application on a cluster member that undergoes a system crash,
caa_start
can give indeterminate results. In this scenario, the start section of the action script is executed but the cluster member crashes before notification of the start is displayed on the command line. The
caa_start
command returns a failure with the error
Remote start for [resource_name] failed on member [member_name].
The application resource is actually
ONLINE
and fails over to another member, making the application appear as though it was started on the wrong member.
If a cluster member fails while you are starting an application resource on that member, check the state of the resource on the cluster with
caa_stat
to determine the state of that resource.
See
caa_start(8) for more information.
2.7 Relocating Application Resources
Use the
caa_relocate
command to relocate
application resources.
You cannot relocate
network,
tape,
or
changer
resources.
To relocate an
application
resource to an
available cluster member, or to a specified cluster member, use the
caa_relocate
command.
For example, to relocate the
clock
application to member
provolone, enter the following
command:
# /usr/sbin/caa_relocate clock -c provolone
The following text is an example of the command output:
Attempting to stop `clock` on member `polishham`
Stop of `clock` on member `polishham` succeeded.
Attempting to start `clock` on member `provolone`
Start of `clock` on member `provolone` succeeded.
To relocate the
clock
application to another
member
using the placement policy that is defined in the application resource's
profile, enter the following command:
# /usr/sbin/caa_relocate clock
The following text is an example of the command output:
Attempting to stop `clock` on member `pepicelli`
Stop of `clock` on member `pepicelli` succeeded.
Attempting to start `clock` on member `polishham`
Start of `clock` on member `polishham` succeeded.
The following text is an example of the command output if the application cannot be relocated successfully due to a script returning a nonzero value or a script timeout:
Attempting to stop `clock` on member `pepicelli`
Stop of `clock` on member `pepicelli` succeeded.
Attempting to start `clock` on member `provolone`
Start of `clock` on member `provolone` failed.
No more members to consider
Attempting to restart `clock` on member `pepicelli`
Could not relocate resource clock.
Each time that the action script is called, the
caa_relocate
command will wait up to the
SCRIPT_TIMEOUT
value to receive notification of success or failure from the action
script.
A relocate attempt will fail if:
The resource has required resources that are
ONLINE
Resources that require the specified resource are
ONLINE
If you use the
caa_relocate
-f
resource_name
command on a resource that has
required
resources that are
ONLINE, or has resources that
require
it that are
ONLINE, the resource is relocated and all
resources
that require it and are
ONLINE
are relocated.
All
resources
that are required by the specified resource are relocated or started
regardless of their state.
See
caa_relocate(8) for more information.
2.8 Balancing Application Resources
Balancing application resources is the reevaluation of application resource placement based on the current state of the resources on the cluster and the rules of placement for the resources. Balancing applications can be done on a clusterwide basis, a member-wide basis, or with specified resources. Balancing decisions are made using the standard placement decision mechanism of CAA and are not based on any load considerations.
Use the
caa_balance
command only with
application resources.
You cannot balance network, tape, or changer
resources.
Balancing on a per cluster basis reevaluates all
ONLINE
application resources on a cluster and relocates each resource that is
not running on the cluster member chosen by the placement decision
mechanism, as discussed in
Section 2.2.2.2.
To balance all applications on a cluster, enter the following command:
# /usr/sbin/caa_balance -all
Assuming that applications
test
and
test2
are the only two applications that are
ONLINE
and are running on member
rye
with balanced placement
policies, the following text is displayed:
Attempting to stop `test` on member `rye`
Stop of `test` on member `rye` succeeded.
Attempting to start `test` on member `swiss`
Start of `test` on member `swiss` succeeded.
Resource test2 is already well placed
test2 is placed optimally. No relocation is needed.
If more applications are
ONLINE
in the cluster,
the output will reflect any actions taken for each application
resource.
To reevaluate placement of the applications running on the cluster
member
rye, enter the following command:
# /usr/sbin/caa_balance -s rye
Assuming that applications
test
and
test2
are the only two applications that are
ONLINE
and are
running
on member
rye
with balanced placement policies, the
following text is displayed:
Attempting to stop `test` on member `rye`
Stop of `test` on member `rye` succeeded.
Attempting to start `test` on member `swiss`
Start of `test` on member `swiss` succeeded.
Resource test2 is already well placed
test2 is placed optimally. No relocation is needed.
If more applications are
ONLINE
in the cluster
member, the output will reflect any actions taken for each application
resource.
To balance specified applications only, enter the following command:
# /usr/sbin/caa_balance test test2
Assuming that applications
test
and
test2
are running on member rye with balanced placement
policies, the following text is displayed:
Attempting to stop `test` on member `rye`
Stop of `test` on member `rye` succeeded.
Attempting to start `test` on member `swiss`
Start of `test` on member `swiss` succeeded.
Resource test2 is already well placed
test2 is placed optimally. No relocation is needed.
The REBALANCE time value in the profile must be specified in the following
format:
t:day:hour:min, where
day
is the day of the week (0-6),
hour
is the hour of the day (0-23), and
min
is the minute of the hour (0-59) when the re-evaluation occurs.
An asterisk (*)
may be used as a wildcard to specify every day or every hour.
An example where the application will be rebalanced every Sunday at 0300 hours is:
REBALANCE=t:0:3:0
An example where the application will be rebalanced every day at 0230 hours is:
REBALANCE=t:*:2:30
An example of how to use the
caa_profile
command to specify this would be:
# /usr/sbin/caa_profile -create testapp -t application -B /usr/bin/true -o bt=*:2:30
The resulting profile will look like the following:
NAME=testapp
TYPE=application
ACTION_SCRIPT=testapp.scr
ACTIVE_PLACEMENT=0
AUTO_START=0
CHECK_INTERVAL=60
DESCRIPTION=testapp
FAILOVER_DELAY=0
FAILURE_INTERVAL=0
FAILURE_THRESHOLD=0
HOSTING_MEMBERS=
OPTIONAL_RESOURCES=
PLACEMENT=balanced
REBALANCE=t:*:2:30
REQUIRED_RESOURCES=
RESTART_ATTEMPTS=1
SCRIPT_TIMEOUT=60
See
caa_balance(8) for more information.
2.9 Stopping Application Resources
To stop applications that are running in a cluster environment,
use the
caa_stop
command.
Immediately after the
caa_stop
command is executed, the target is set to
OFFLINE.
Because CAA always attempts to match a resource's state to its target,
the CAA subsystem stops the application.
Only application resources can be
stopped.
Network, tape, and media changer resources cannot be stopped.
In the following example, the
clock
application
resource is stopped:
# /usr/sbin/caa_stop clock
The following text is an example of the command output:
Attempting to stop `clock` on member `polishham`
Stop of `clock` on member `polishham` succeeded.
You cannot stop an application if it is a required resource for
another
ONLINE
application.
If you use the command
caa_stop
-f
resource_name
on a resource that has resources
that require it and are
ONLINE, the resource is stopped
and all resources that require it and are
ONLINE
are
stopped.
See
caa_stop(8) for more information.
2.10 Unregistering Application Resources
To unregister an application resource, use the
caa_unregister
command.
You cannot unregister an application that is
ONLINE
or required by another resource.
In the following example, the
clock
application is unregistered:
# /usr/sbin/caa_unregister clock
See
caa_unregister(8) for more information.
2.11 Displaying CAA Status Information
To display status information on resources on cluster members, use
the
caa_stat
command.
The following example displays the status information for the
clock
resource:
# /usr/bin/caa_stat clock
NAME=clock
TYPE=application
TARGET=ONLINE
STATE=ONLINE on provolone
To view information on all resources, enter the following command:
# /usr/bin/caa_stat
NAME=clock
TYPE=application
TARGET=ONLINE
STATE=ONLINE on provolone

NAME=dhcp
TYPE=application
TARGET=ONLINE
STATE=ONLINE on polishham

NAME=named
TYPE=application
TARGET=ONLINE
STATE=ONLINE on polishham

NAME=network1
TYPE=network
TARGET=ONLINE on provolone
TARGET=ONLINE on polishham
STATE=ONLINE on provolone
STATE=ONLINE on polishham
To view information on all resources in a tabular form, enter the following command:
# /usr/bin/caa_stat -t
Name           Type          Target    State     Host
----------------------------------------------------------------
cluster_lockd  application   ONLINE    ONLINE    provolone
dhcp           application   OFFLINE   OFFLINE
network1       network       ONLINE    ONLINE    provolone
network1       network       ONLINE    ONLINE    polishham
To find out how many times a resource has been restarted or has failed within the resource failure interval, the maximum number of times that a resource can be restarted or fail, and the target state of the application, as well as normal status information, enter the following command:
# /usr/bin/caa_stat -v
NAME=cluster_lockd
TYPE=application
RESTART_ATTEMPTS=30
RESTART_COUNT=0
FAILURE_THRESHOLD=0
FAILURE_COUNT=0
TARGET=ONLINE
STATE=ONLINE on provolone

NAME=dhcp
TYPE=application
RESTART_ATTEMPTS=1
RESTART_COUNT=0
FAILURE_THRESHOLD=3
FAILURE_COUNT=1
TARGET=ONLINE
STATE=OFFLINE

NAME=network1
TYPE=network
FAILURE_THRESHOLD=0
FAILURE_COUNT=0 on provolone
FAILURE_COUNT=0 on polishham
TARGET=ONLINE on provolone
TARGET=ONLINE on polishham
STATE=ONLINE on provolone
STATE=OFFLINE on polishham
To view verbose content in a tabular form, enter the following command:
# /usr/bin/caa_stat -v -t
Name           Type          R/RA   F/FT   Target    State     Host
----------------------------------------------------------------------
cluster_lockd  application   0/30   0/0    ONLINE    ONLINE    provolone
dhcp           application   0/1    0/0    OFFLINE   OFFLINE
named          application   0/1    0/0    OFFLINE   OFFLINE
network1       network              0/5    ONLINE    ONLINE
network1       network              0/5    ONLINE    ONLINE    polishham
To view the profile information that is stored in the database, enter the following command:
# /usr/bin/caa_stat -p
NAME=cluster_lockd
TYPE=application
ACTION_SCRIPT=cluster_lockd.scr
ACTIVE_PLACEMENT=0
AUTO_START=1
CHECK_INTERVAL=5
DESCRIPTION=Cluster lockd/statd
FAILOVER_DELAY=30
FAILURE_INTERVAL=60
FAILURE_THRESHOLD=1
REBALANCE=
HOSTING_MEMBERS=
OPTIONAL_RESOURCES=
PLACEMENT=balanced
REQUIRED_RESOURCES=
RESTART_ATTEMPTS=2
SCRIPT_TIMEOUT=60
.
.
.
See the
Cluster Administration
manual and
caa_stat(1) for more information.
2.12 Graphical User Interfaces
The following sections discuss how to use the SysMan and SysMan
Station graphical user interfaces (GUIs) to manage CAA.
2.12.1 Using SysMan Menu to Manage CAA
You can start the SysMan Menu from the command line with
/usr/sbin/sysman.
To access the
CAA tools, select the Cluster
Application
Availability (CAA) Management task under the TruCluster Specific
branch.
.
.
.
+ TruCluster Specific |Cluster Application Availability (CAA) Management
To start only the Cluster Application Availability (CAA)
Management
task, use
/usr/sbin/sysman caa.
See the Tru64 UNIX System Administration manual for more information on accessing SysMan Menu.
Using the SysMan Menu you can:
Manage resource profiles
Monitor CAA resources
Register resources
Start resources
Relocate resources
Stop resources
Unregister resources
The CAA GUI provides graphical assistance for cluster
administration
based on event reports from the Event Manager (EVM) and CAA
daemon.
2.12.2 Using SysMan Station to Manage and Monitor CAA
SysMan Station gives users a comprehensive graphical view of their cluster. SysMan Station lets you view the current status of CAA resources on a whole cluster, and manage those resources. SysMan Station also contains the management tool SysMan Menu to manage individual CAA resources. See the Tru64 UNIX System Administration manual for further information on accessing the SysMan Station.
To access the CAA SysMan Menu tools in the SysMan Station, follow these steps:
Select one of the views under
Views,
for
example,
CAA_Applications_(active)
or
CAA_Applications_(all).
Select the cluster name under the
Views
window, for example,
CAA_Applications_(active) View
or
CAA_Applications_(all) View.
From the Tools menu, select SysMan Menu. The Cluster Application Availability (CAA) Management task is located under the TruCluster Specific branch.
For more detailed descriptions of the SysMan Menu and
SysMan Station,
see the online help or the Tru64 UNIX
System Administration
manual.
2.13 CAA Tutorial
This CAA tutorial provides the basic instructions necessary to quickly make an application highly available using CAA. For in-depth details on specific commands, see the reference pages for the CAA commands.
Preconditions (Section 2.13.1)
Miscellaneous Setup (Section 2.13.2)
Example of an action script for
dtcalc
(Section 2.13.3)
Step 1: Creating the application resource profile (Section 2.13.4)
Step 2: Validating the application resource profile (Section 2.13.5)
Step 3: Registering the application (Section 2.13.6)
Step 4: Starting the application (Section 2.13.7)
Step 5: Relocating the application (Section 2.13.8)
Step 6: Stopping the application (Section 2.13.9)
Step 7: Unregistering the application (Section 2.13.10)
In this tutorial the example cluster contains the members
provolone,
polishham, and
pepicelli.
Wherever you see these member names
in a command line, use one of your own cluster member names instead.
2.13.1 Preconditions
You must have root access to a TruCluster Server cluster with at least two members.
In this tutorial you use CAA to make the Tru64 UNIX application
dtcalc
highly available.
Make sure that the test application
/usr/dt/bin/dtcalc
exists.
An X-based application is used in this example only for demonstration
purposes, because it provides immediate visual confirmation of the
results of starts, stops, and relocations.
You are unlikely to need high availability for an application of this
sort.
2.13.2 Miscellaneous Setup
If you are making an application with a graphical interface
highly available
using CAA, make sure that you set your
DISPLAY
variable
correctly in the
ActionScript.scr
file.
Modify the
DISPLAY
variable, and copy the file
ActionScript.scr
into
the scripts directory
/var/cluster/caa/script.
Verify that the host on which you want to display the application is able to display X applications from the cluster. If you need to modify the access, execute a command similar to the following command on the machine that is displaying the application:
# xhost + clustername
If you are not sure of the actual names of each member, look in
the
/etc/hosts
file on your system to get the names of each
member.
You also can use the
clu_get_info
command to get
information
on each cluster member, including the host names.
The following command is an example showing the results of the
clu_get_info
command:
# clu_get_info
Cluster information for cluster deli
Number of members configured in this cluster = 3
memberid for this member = 3
Quorum disk = dsk10h
Quorum disk votes = 1
Information on each cluster member
Cluster memberid = 1
Hostname = polishham.zk4.com
Cluster interconnect IP name = polishham-ics0
Member state = UP
Cluster memberid = 2
Hostname = provolone.zk4.com
Cluster interconnect IP name = provolone-ics0
Member state = UP
Cluster memberid = 3
Hostname = pepicelli.zk4.com
Cluster interconnect IP name = pepicelli-ics0
Member state = UP
2.13.3 Example of an Action Script for dtcalc
The following example is an action script that you can use for the
dtcalc
tutorial, or you can use the more
complex action script
that
is created by the
caa_profile
command:
#!/usr/bin/ksh -p
#
# This action script will be used to launch dtcalc.
#
export DISPLAY=`hostname`:0
PATH=/sbin:/usr/sbin:/usr/bin
export PATH
CAATMPDIR=/tmp
CMDPATH=/usr/dt/bin
APPLICATION=${CMDPATH}/dtcalc
CMD=`basename $APPLICATION`
case $1 in
'start') [1]
    if [ -f $APPLICATION ]; then
        $APPLICATION &
        exit 0
    else
        echo "Found exit1" >/dev/console
        exit 1
    fi
    ;;
'stop') [2]
    PIDLIST=`ps ax | grep $APPLICATION | grep -v 'caa_' \
        | grep -v 'grep' | awk '{print $1}'`
    if [ -n "$PIDLIST" ]; then
        kill -9 $PIDLIST
        exit 0
    fi
    exit 0
    ;;
'check') [3]
    PIDLIST=`ps ax | grep $CMDPATH | grep -v 'grep' | awk '{print $1}'`
    if [ -z "$PIDLIST" ]; then
        PIDLIST=`ps ax | grep $CMD | grep -v 'grep' \
            | awk '{print $1}'`
    fi
    if [ -n "$PIDLIST" ]; then
        exit 0
    else
        echo "Error: CAA could not find $CMD." >/dev/console
        exit 1
    fi
    ;;
esac
The start entry point is executed when an application is started. [Return to example]
The stop entry point is executed when an application is stopped. [Return to example]
The check entry point is
executed
every
CHECK_INTERVAL
seconds.
[Return to example]
2.13.4 Step 1: Creating the Application Resource Profile
Create the resource profile for
dtcalc
with the following options to the
caa_profile
command:
# /usr/sbin/caa_profile -create dtcalc -t application -B /usr/dt/bin/dtcalc \
-d "dtcalc application" -p balanced
When you examine the
dtcalc.cap
file that is
located
in
/var/cluster/caa/profile/, you will
see
the following:
# cat dtcalc.cap
NAME=dtcalc
TYPE=application
ACTION_SCRIPT=dtcalc.scr
ACTIVE_PLACEMENT=0
AUTO_START=0
CHECK_INTERVAL=60
DESCRIPTION=dtcalc application
FAILOVER_DELAY=0
FAILURE_INTERVAL=0
FAILURE_THRESHOLD=0
HOSTING_MEMBERS=
OPTIONAL_RESOURCES=
PLACEMENT=balanced
REQUIRED_RESOURCES=
RESTART_ATTEMPTS=1
SCRIPT_TIMEOUT=60
2.13.5 Step 2: Validating the Application Resource Profile
To validate the resource profile syntax, enter the following command:
# caa_profile -validate dtcalc
If there are syntax errors in the profile,
caa_profile
displays messages indicating that the profile did not pass
validation.
2.13.6 Step 3: Registering the Application
To register the application, enter the following command:
# /usr/sbin/caa_register dtcalc
If the profile cannot be registered, messages are displayed explaining why.
To verify that the application is registered, enter the following command:
# /usr/bin/caa_stat dtcalc
NAME=dtcalc
TYPE=application
TARGET=OFFLINE
STATE=OFFLINE
2.13.7 Step 4: Starting the Application
To start the application, enter the following command:
# /usr/bin/caa_start dtcalc
The following messages are displayed:
Attempting to start `dtcalc` on member `provolone`
Start of `dtcalc` on member `provolone` succeeded.
You can execute the
/usr/bin/caa_stat
dtcalc
command to check that the
dtcalc
action
script start entry point executed successfully and
dtcalc
is started.
For example:
# /usr/bin/caa_stat dtcalc
NAME=dtcalc
TYPE=application
TARGET=ONLINE
STATE=ONLINE on provolone
If the
DISPLAY
variable is set correctly in the
script,
dtcalc
appears on your display.
2.13.8 Step 5: Relocating the Application
To relocate the application, enter the following command:
# /usr/bin/caa_relocate dtcalc -c polishham
Execute the command
/usr/bin/caa_stat
to verify that
dtcalc
started successfully.
An example follows:
# /usr/bin/caa_stat dtcalc
NAME=dtcalc
TYPE=application
TARGET=ONLINE
STATE=ONLINE on polishham
The cluster member
is
listed in the
STATE
attribute.
2.13.9 Step 6: Stopping the Application
To stop the application, enter the following command:
# /usr/bin/caa_stop dtcalc
The following information is displayed:
Attempting to stop `dtcalc` on member `polishham`
Stop of `dtcalc` on member `polishham` succeeded.
You can execute the
/usr/bin/caa_stat
command to verify that the stop entry point of the
dtcalc
action script executed successfully and
dtcalc
is
stopped.
For example:
# /usr/bin/caa_stat dtcalc
NAME=dtcalc
TYPE=application
TARGET=OFFLINE
STATE=OFFLINE
2.13.10 Step 7: Unregistering the Application
To unregister the application, enter the following command:
# /usr/sbin/caa_unregister dtcalc
2.14 Example Applications Managed by CAA
The following sections contain examples of highly available
single-instance
applications that are managed by CAA.
2.14.1 OpenLDAP Directory Server
The OpenLDAP (Lightweight Directory Access Protocol) Directory Server is part of the Internet Express for Tru64 UNIX product suite, a collection of popular Internet software combined with administration tools developed by HP. (Internet Express ships with every HP Tru64 UNIX AlphaServer system, and is also available from the following URL: http://www.tru64unix.compaq.com/docs/pub_page/iass_docs.html.) The products in this suite are cluster-ready and can be configured to run with high availability in a cluster.
The LDAP Module for System Authentication allows user identification and authentication information stored in an LDAP server to be used for all applications, including the following:
Login authentication (rlogin,
ftp, and
telnet)
POP and IMAP authentication
Transparent LDAP database access for the
getpw*()
and
getgr*()
routines in the
libc
library
To create a highly available OpenLDAP Directory Server in a TruCluster Server environment, perform the following:
Using the Internet Express Installation graphical user interface (GUI), install the Internet Express kit. Select the Internet Express Administration Utility and the OpenLDAP subsets for installation.
The installation procedure creates a CAA resource profile in the
/var/cluster/caa/profile
directory for the OpenLDAP
application resource:
TYPE = application
NAME = openldap
DESCRIPTION = OpenLDAP Directory Server
CHECK_INTERVAL = 60
FAILURE_THRESHOLD = 0
FAILURE_INTERVAL = 0
REQUIRED_RESOURCES =
OPTIONAL_RESOURCES =
HOSTING_MEMBERS =
PLACEMENT = balanced
RESTART_ATTEMPTS = 1
FAILOVER_DELAY = 0
AUTO_START = 0
ACTION_SCRIPT = openldap.scr
It also creates an action script for the resource in the
/var/cluster/caa/script
directory:
#!/sbin/sh
#
# Start/stop the OpenLDAP Directory Server.
#
OLPIDFILE=/data/openldap/var/openldap_slapd.pid
OPENLDAP_CAA=1
export OPENLDAP_CAA
case "$1" in
'start')
/sbin/init.d/openldap start
;;
'stop')
/sbin/init.d/openldap stop
;;
'check')
# return non-zero if the service is stopped
if [ -f "$OLPIDFILE" ]
then
MYPID=`cat $OLPIDFILE`
RUNNING=`/usr/bin/ps -e -p $MYPID -o command | grep slapd`
fi
if [ -z "$RUNNING" ]
then
exit 1
else
exit 0
fi
;;
*)
echo "usage: $0 {start|stop|check}"
;;
esac
The following
init.d
script starts and stops
the OpenLDAP service in a cluster by calling the appropriate CAA command:
#!/sbin/sh
#
# Start the OpenLDAP Directory Server daemon.
#
NAME="OpenLDAP Directory Server"
HOME=/usr/internet/openldap
OLPIDFILE=/data/openldap/var/openldap_slapd.pid
MYPID=
RUNNING=
if [ -x /usr/sbin/clu_get_info ] && /usr/sbin/clu_get_info -q
then
CLUSTER="YES"
fi
check_running()
{
if [ -f "$OLPIDFILE" ]
then
MYPID=`cat $OLPIDFILE`
RUNNING=`/usr/bin/ps -e -p $MYPID -o command | grep slapd`
fi
if [ ! -z "$RUNNING" ]
then
return 1
else
return 0
fi
}
case "$1" in
'start')
if [ "$CLUSTER" = "YES" -a "$OPENLDAP_CAA" != "1" ]
then
/usr/sbin/caa_start -q openldap
else
check_running
checkres=$?
if [ $checkres = 1 ]
then
echo "$NAME already running"
else
$HOME/libexec/slapd -f $HOME/etc/slapd.conf
fi
fi
;;
'stop')
if [ "$CLUSTER" = "YES" -a "$OPENLDAP_CAA" != "1" ]
then
exit 1
else
check_running
checkres=$?
if [ $checkres = 1 ]
then
kill -TERM $MYPID
fi
fi
;;
*)
echo "usage: $0 {start|stop}"
;;
esac
It also adds the following line in the
/etc/clua_services
file:
openldap 389/tcp in_single,out_alias
2.14.2 Creating a Single-Instance, Highly Available Apache HTTP Server Using CAA
To create a single-instance Apache HTTP server with failover capabilities, follow these steps:
Download the latest, standard Apache distribution from
the
www.apache.org
Web site to the cluster and
follow the site's instructions
for building and installing Apache in the
/usr/local/apache
directory.
Create a default CAA application resource profile and action script with the following command:
# caa_profile -create httpd -t application -B /usr/local/apache/bin/httpd
The default profile adopts a failover policy that causes the
httpd
service to fail over to another member when
the member on
which it is running leaves the cluster.
It also allows the
httpd
service to be placed on any active cluster
member.
You can edit
the profile to employ other failover and placement policies and resource
dependencies.
The default action script contains a start entry point that starts
the
httpd
service and a stop entry point that stops the
httpd
service.
Register the profile with CAA by entering the following command on one member:
# caa_register httpd
Start the
httpd
service through CAA
by
entering the following command on one member:
# caa_start httpd
2.14.3 Creating a Single-Instance Oracle8i Server Using CAA
To create a single-instance Oracle8i Version 8.1.7 database server with failover capabilities, follow these steps:
Install and configure Oracle8i 8.1.7 using the instructions in the Oracle8i documentation.
Oracle requires that certain kernel attributes be set to specific
values, that specific UNIX groups (dba,
oinstall) be created, and that special environment
variables be initialized.
Before proceeding to set up the CAA service for the Oracle8i
single server, you must decide how client applications will reach the
service.
You can use either the cluster alias feature of TruCluster Server
or use an interface (IP) alias.
If you choose to use a cluster alias,
create a new cluster alias for each cluster member that will be an
Oracle8i server because you can tune the routing and scheduling attributes
of each alias independently.
(For information on how to create a cluster
alias, see
cluamgr(8).)
If you want to use a cluster alias, add the IP address and name of
each cluster alias to the
/etc/hosts
file.
Add the following line to the
/etc/clua_services
file to set up the properties of the port that the Oracle8i listener
uses:
listener 1521/tcp in_single
Setting the
in_single
attribute means that the cluster
alias subsystem will distribute connection requests directed to a cluster
alias to one member of the alias.
If that member becomes unavailable, the
cluster alias subsystem will select another member of that cluster alias to
receive all requests.
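A duplicate entry for the listener port would be an error, so you may prefer to add the line only when it is not already present. The following sketch shows the idea; it operates on a scratch file for illustration, and on a real cluster FILE would be /etc/clua_services:

```shell
#!/bin/sh
# Idempotent append sketch: add the listener entry only if no listener
# line is already present. FILE defaults to a scratch path so the
# sketch can be tried safely outside a cluster.
FILE=${FILE:-/tmp/clua_services.demo}
ENTRY="listener 1521/tcp in_single"
touch "$FILE"
grep -q "^listener" "$FILE" || echo "$ENTRY" >> "$FILE"
```

Running the same fragment a second time leaves the file unchanged.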
To reload service definitions, enter the following command on all members:
# cluamgr -f
If you choose to use an interface address as the target of
client requests to the Oracle8i service, add the IP address and name of
the interface alias to the
/etc/hosts
file.
In the
listener.ora
and
tnsnames.ora
files, edit the
HOST
field so that it contains each alias that
clients will use to reach the service.
For example:
. . . (ADDRESS = (PROTOCOL = TCP) (HOST = alias1) (PORT = 1521)) . . .
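Expanded, that address entry might sit in a description like the following sketch of a tnsnames.ora entry. Apart from alias1 and port 1521, every name here is an assumption for illustration (the net service name ORCL, the second alias alias2, and the service name orcl); use the names from your own installation:

```
ORCL =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP) (HOST = alias1) (PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP) (HOST = alias2) (PORT = 1521))
    )
    (CONNECT_DATA = (SERVICE_NAME = orcl))
  )
```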
An example Oracle CAA script is located in
/var/cluster/caa/examples/DataBase/oracle.scr.
Copy the
script to
/var/cluster/caa/script/oracle.scr, and
edit it to meet your environment needs such as e-mail accounts, log file
destinations, alias preference, and so on.
Do not include any file system
references in the script.
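The start and stop entry points in such a script follow a simple dispatch pattern. The following is a minimal sketch only, with placeholder echo commands where a real script would start and stop the database processes; the shipped oracle.scr is considerably more elaborate:

```shell
#!/bin/sh
# Minimal CAA action script sketch. CAA invokes the script with an
# argument of start, stop, or check. All commands here are placeholders.

ACTION=${1:-start}      # default to start so the sketch runs standalone

start_service() {
    # A real script would start the database processes here.
    echo "starting service"
    return 0
}

stop_service() {
    # A real script would shut the database down cleanly here.
    echo "stopping service"
    return 0
}

case "$ACTION" in
start) start_service ;;
stop)  stop_service ;;
check) echo "service is running" ;;
*)     echo "usage: $0 {start|stop|check}" >&2; exit 2 ;;
esac
```

By convention an entry point exits with zero status on success, so each branch should propagate a nonzero status if its work fails.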
Perform some initial testing of the scripts by first executing the start and stop entry points outside of CAA. For example:
# cd /var/cluster/caa/script
# ./oracle.scr start
Create a CAA application resource profile using the SysMan Station or by entering the following command:
# caa_profile -create oracle -t application \
    -d "ORACLE Single-Instance Service" -p restricted -h "member1 member2"
Make sure that your Oracle CAA resource profile looks like the example
profile in
/var/cluster/caa/examples/DataBase/oracle.cap.
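The command above should produce a profile along the lines of the following fragment. The attribute values shown are assumptions inferred from the options given to caa_profile; compare against the shipped example rather than copying this verbatim:

```
NAME=oracle
TYPE=application
ACTION_SCRIPT=oracle.scr
DESCRIPTION=ORACLE Single-Instance Service
HOSTING_MEMBERS=member1 member2
PLACEMENT=restricted
AUTO_START=0
RESTART_ATTEMPTS=1
```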
Register the
oracle
profile with CAA using
the SysMan Station or by entering the following command on one member:
# caa_register oracle
Start the
oracle
service using the
SysMan Station or by entering the following command on one member:
# caa_start oracle
2.14.4 Creating a Single-Instance Informix Server Using CAA
To create a single-instance Informix server with failover capabilities, follow these steps:
Install and configure Informix using the instructions in the Informix documentation.
Informix requires that specific UNIX groups (dba,
informix) be created.
Before proceeding to set up the CAA service for the Informix
single server, you must decide how client applications will reach the
service.
You can use either the cluster alias feature of the TruCluster Server
product or use an interface (IP) alias.
If you choose to use a cluster alias,
create a new cluster alias for each cluster member that will be an
Informix server because you can tune the routing and scheduling attributes
of each alias independently.
(For information on how to create a cluster
alias, see
cluamgr(8).)
If you want to use a cluster alias, add the IP address and name of each
cluster alias to the
/etc/hosts
file.
Add the following line to the
/etc/clua_services
file to set up the properties of the port that the Informix listener
uses:
informix 8888/tcp in_single
Setting the
in_single
attribute means that the cluster
alias subsystem will distribute connection requests directed to the cluster
alias to one member of the alias.
If that member becomes unavailable, the
cluster alias subsystem will select another member of that cluster alias
to receive all requests.
To reload service definitions, enter the following command on all members:
# cluamgr -f
If you choose to use an interface address as the target of
client requests to the Informix service, add the IP address and name of the
interface alias to the
/etc/hosts
file.
An example Informix CAA script is located in
/var/cluster/caa/examples/DataBase/informix.scr.
Copy the
script to
/var/cluster/caa/script/informix.scr, and edit
it to meet your environment needs such as e-mail accounts, log file
destinations, alias preference, and so on.
Do not include any file system
references in the script.
Perform some initial testing of the scripts by first executing the start and stop entry points outside of CAA. For example:
# cd /var/cluster/caa/script
# ./informix.scr start
Create a CAA application resource profile using the SysMan Station or by entering the following command:
# caa_profile -create informix -t application \
    -d "INFORMIX Single-Instance Service" -p restricted -h "member1 member2"
Make sure that your Informix CAA resource profile looks like the
example profile in
/var/cluster/caa/examples/DataBase/informix.cap.
Register the
informix
profile with CAA using the
SysMan Station or by entering the following command on one member:
# caa_register informix
Start the
informix
service using
the SysMan Station or by entering the following command on one member:
# caa_start informix