This chapter describes general migration issues that are relevant to all
types of applications.
Table 4-1 lists each migration issue, the types of applications it affects, and where to find more information.
Table 4-1: Application Migration Considerations

| Issue | Application Types Affected | For More Information |
| --- | --- | --- |
| Clusterwide and member-specific files | Single-instance, multi-instance, distributed | Section 4.1 |
| Device naming | Single-instance, multi-instance, distributed | Section 4.2 |
| Interprocess communication | Multi-instance, distributed | Section 4.3 |
| Synchronized access to shared data | Multi-instance, distributed | Section 4.4 |
| Member-specific resources | Single-instance | Section 4.5 |
| Expanded process IDs (PIDs) | Multi-instance, distributed | Section 4.6 |
| Distributed lock manager (DLM) parameters removed | Multi-instance, distributed | Section 4.7 |
| Licensing | Single-instance, multi-instance, distributed | Section 4.8 |
| Blocking layered products | Single-instance, multi-instance, distributed | Section 4.9 |
4.1 Clusterwide and Member-Specific Files
A cluster has two sets of configuration data:
Clusterwide data
Clusterwide data pertains to files and logs that can be shared by all members of a cluster. For example, when two systems are members of a cluster, they share a common /etc/passwd file that contains information about the authorized users for both systems. Sharing configuration or management data makes file management easier. For example, Apache and Netscape configuration files can be shared, allowing you to manage the application from any node in the cluster.
Member-specific data
Member-specific data pertains to files whose contents apply to only one member of a cluster; these files cannot be shared by all members. Member-specific data may be configuration details that pertain to hardware found only on a specific system, such as a layered product driver for a specific printer connected to one cluster member.
Because the cluster file system (CFS) makes all files visible to and accessible by all cluster members, those applications that require clusterwide configuration data can easily write to a configuration file that all members can view. However, an application that must use and maintain member-specific configuration information needs to take some additional steps to avoid overwriting files.
To avoid overwriting files, consider using one of the following methods:
| Method | Advantage | Disadvantage |
| --- | --- | --- |
| Single file | Easy to manage. | Application must be aware of how to access member-specific data in the single file. |
| Multiple files | Keeps configuration information in a set of clusterwide files. | Multiple copies of files need to be maintained. Application must be aware of how to access member-specific files. |
| Context-dependent symbolic links (CDSLs) | Keeps configuration information in member-specific areas. CDSLs are transparent to the application; they look like symbolic links. | Moving or renaming files will break symbolic links. Application must be aware of how to handle CDSLs. Using CDSLs makes it more difficult for an application to find out about other instances of that application in the cluster. |
You must decide which method best fits your application's needs. The following sections describe each approach.
4.1.1 Using a Single File
Using a single, uniquely named file keeps application configuration information in one clusterwide file as separate records for each node. The application reads and writes the correct record in the file. Managing a single file is easy because all data is in one central location.
As an example, in a cluster the /etc/printcap file contains entries for specific printers. The following parameter can be specified to indicate which nodes in the cluster can run the spooler for the print queue:
:on=nodename1,nodename2,nodename3,...:
If the first node is up, it will run the spooler.
If that node goes
down, the next node, if it is up, will run the spooler, and so on.
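As a rough sketch of the single-file approach in application code, the following fragment looks up the record for the local member in a shared configuration file. The file name /etc/myapp.conf and the hostname:value record format are hypothetical, not part of any Tru64 UNIX interface.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Illustrative sketch only: find this member's record in a shared,
 * clusterwide configuration file whose lines have the hypothetical
 * form "hostname:value". */
int
read_member_record(const char *path, char *value, size_t len)
{
    char host[256], line[512];
    FILE *fp;

    if (gethostname(host, sizeof(host)) != 0)
        return -1;
    if ((fp = fopen(path, "r")) == NULL)
        return -1;

    while (fgets(line, sizeof(line), fp) != NULL) {
        char *sep = strchr(line, ':');

        if (sep != NULL && (size_t)(sep - line) == strlen(host) &&
            strncmp(line, host, strlen(host)) == 0) {
            strncpy(value, sep + 1, len - 1);
            value[len - 1] = '\0';
            value[strcspn(value, "\n")] = '\0';  /* strip newline */
            fclose(fp);
            return 0;                /* found this member's record */
        }
    }
    fclose(fp);
    return -1;                       /* no record for this member */
}
```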
4.1.2 Using Multiple Files
Using uniquely named multiple files keeps configuration information in a set of clusterwide files. For example, each cluster member has its own member-specific gated configuration file in /etc.
Instead of using a context-dependent symbolic
link (CDSL) to reference member-specific files through a common
file name, the naming convention for these files takes advantage of
member IDs to create a unique name for each member's file.
For example:
# ls -l /etc/gated.conf.member*
-rw-r--r--   1 root   system   466 Jun 21 17:37 /etc/gated.conf.member1
-rw-r--r--   1 root   system   466 Jun 21 17:37 /etc/gated.conf.member2
-rw-r--r--   1 root   system   466 Jun 21 13:28 /etc/gated.conf.member3
This method requires more work to manage because multiple copies
of files need to be maintained.
For example, if the member ID of a
cluster member changes, you must find and rename all member-specific
files belonging to that member.
Also, if the application is unaware of
how to access member-specific files, you must configure it to do so.
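If the application can be taught the naming convention, constructing the file name is straightforward. The following minimal sketch builds a gated-style member-specific path from a member ID; how the application obtains its member ID, and the hard-coded value here, are illustrative assumptions.

```c
#include <stdio.h>

/* Sketch: build a member-specific configuration file name such as
 * /etc/gated.conf.member3 from a member ID. How the application
 * obtains its member ID is left open; the value here is illustrative. */
int
main(void)
{
    int memberid = 3;        /* hypothetical; obtain from the system */
    char path[256];

    snprintf(path, sizeof(path), "/etc/gated.conf.member%d", memberid);
    printf("member-specific file: %s\n", path);
    return 0;
}
```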
4.1.3 Using CDSLs
Tru64 UNIX Version 5.0 introduced a special form of symbolic link, called a context-dependent symbolic link (CDSL), that TruCluster Server uses to point to the correct file for each member. CDSLs are useful when running multiple instances of an application on different cluster members on different sets of data.
Using a CDSL keeps configuration information in member-specific areas. However, the data can be referenced through the CDSL. Each member reads the common file name, but is transparently linked to its copy of the configuration file. CDSLs are an alternative to maintaining member-specific configuration information when an application cannot be easily changed to use multiple files.
The following example shows the CDSL structure for the file /etc/rc.config:
/etc/rc.config -> ../cluster/members/{memb}/etc/rc.config
For example, on a cluster member whose member ID is 3, the pathname /cluster/members/{memb}/etc/rc.config resolves to /cluster/members/member3/etc/rc.config.
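Because CDSLs look like ordinary symbolic links, an application that needs to handle them can check a link target for the {memb} token. The following is a minimal sketch; the path tested is just an example.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Sketch: report whether a path is a CDSL by checking its link
 * target for the "{memb}" token. */
int
is_cdsl(const char *path)
{
    char target[1024];
    ssize_t n = readlink(path, target, sizeof(target) - 1);

    if (n < 0)
        return 0;            /* not a symbolic link (or unreadable) */
    target[n] = '\0';
    return strstr(target, "{memb}") != NULL;
}

int
main(void)
{
    const char *path = "/etc/rc.config";

    printf("%s %s a CDSL\n", path, is_cdsl(path) ? "is" : "is not");
    return 0;
}
```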
Tru64 UNIX provides the mkcdsl command, which lets system administrators create CDSLs and update a CDSL inventory file. For more information on this command, see the TruCluster Server Cluster Administration manual and mkcdsl(8), hier(5), ln(1), and symlink(2).
4.2 Device Naming
Tru64 UNIX Version 5.0 introduced a new device-naming convention that consists of a descriptive name for the device and an instance number. These two elements form the basename of the device. For example:
| Location in /dev | Device Name | Instance | Basename |
| --- | --- | --- | --- |
| ./disk | dsk | 0 | dsk0 |
| ./disk | cdrom | 1 | cdrom1 |
| ./tape | tape | 0 | tape0 |
Moving a disk from one physical connection to another does not change the device name for the disk. For a detailed discussion of this device-naming model, see the Tru64 UNIX System Administration manual.
Although Tru64 UNIX recognizes both the old-style (rz) and new-style (dsk) device names, TruCluster Server recognizes only the new-style device names. Applications that depend on old-style device names or the /dev directory structure must be modified to use the newer device-naming convention.
You can use the hwmgr utility, a generic utility for managing hardware, to help map device names to their bus, target, and LUN position after installing Tru64 UNIX Version 5.1B. For example, enter the following command to view devices:
# hwmgr -view devices
HWID: Device Name Mfg Model Location
--------------------------------------------------------------------
45: /dev/disk/floppy0c 3.5in floppy fdi0-unit-0
54: /dev/disk/cdrom0c DEC RRD47 (C) DEC bus-0-targ-5-lun-0
55: /dev/disk/dsk0c COMPAQ BB00911CA0 bus-1-targ-0-lun-0
56: /dev/disk/dsk1c COMPAQ BB00911CA0 bus-1-targ-1-lun-0
57: /dev/disk/dsk2c DEC HSG80 IDENTIFIER=7
.
.
.
Use the following command to view devices clusterwide:
# hwmgr -view devices -cluster
HWID: Device Name Mfg Model Hostname Location
-----------------------------------------------------------------------
45: /dev/disk/floppy0c 3.5in floppy swiss fdi0-unit-0
54: /dev/disk/cdrom0c DEC RRD47 (C) DEC swiss bus-0-targ-5-lun-0
55: /dev/disk/dsk0c COMPAQ BB00911CA0 swiss bus-1-targ-0-lun-0
56: /dev/disk/dsk1c COMPAQ BB00911CA0 swiss bus-1-targ-1-lun-0
57: /dev/disk/dsk2c DEC HSG80 swiss IDENTIFIER=7
.
.
.
For more information on using this command, see hwmgr(8).
When modifying applications to use the new-style device-naming convention, look for the following:
Disks that are included in Advanced File System (AdvFS) domains
Raw disk devices
Disks that are encapsulated in Logical Storage Manager (LSM) volumes or that are included in disk groups
Disk names in scripts
Disk names in data files (Oracle OPS and Informix XPS)
SCSI bus renumbering
Note
If you previously renumbered SCSI buses in your ASE, carefully verify the mapping from physical device to bus number during an upgrade to TruCluster Server. See the Cluster Installation manual for more information.
4.3 Interprocess Communication
The following mechanisms for clusterwide interprocess communication (IPC) are supported:
TCP/IP connections using sockets
Buffered I/O or memory-mapped files
UNIX file locks
Distributed lock manager (DLM) locks
Clusterwide kill signal
Memory Channel application programming interface (API) library (memory windows, low level locks, and signals)
The following mechanisms are not supported for clusterwide IPC:
UNIX domain sockets
Named pipes (FIFO special files)
Signals
System V IPC (messages, shared memory, and semaphores)
If an application uses any of these IPC methods, it must be
restricted to running as a single-instance application.
4.4 Synchronized Access to Shared Data
Multiple instances of an application running within a cluster must synchronize with each other for most of the same reasons that multiprocess and multithreaded applications synchronize on a standalone system. However, memory-based synchronization mechanisms (such as critical sections, mutexes, simple locks, and complex locks) work only on the local system and not clusterwide. Shared file data must be synchronized, or files must be used to synchronize the execution of instances across the cluster.
Because the cluster file system (CFS) is fully POSIX compliant, an application can use flock() system calls to synchronize access to shared files among instances.
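For example, the following minimal sketch serializes updates to shared file data with flock(); the lock file path is hypothetical, and any file visible through CFS could serve.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Minimal sketch: serialize access to shared data among instances
 * running on different cluster members. The lock file path is
 * hypothetical; any file visible through CFS will do. */
int
with_shared_data_lock(void (*update)(void))
{
    int fd = open("/var/myapp/lockfile", O_RDWR | O_CREAT, 0644);

    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX) != 0) {   /* blocks until the lock is granted */
        close(fd);
        return -1;
    }

    update();                        /* critical section: shared file data */

    flock(fd, LOCK_UN);              /* release the lock */
    close(fd);
    return 0;
}
```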
You can also use the distributed lock manager (DLM) API library functions for more sophisticated locking capabilities (such as additional lock modes, lock conversions, and deadlock detection). Because the DLM API library is supplied only in the TruCluster Server product, make sure that code that uses its functions and is also meant to run on nonclustered systems calls clu_is_member() before making any DLM function calls. The clu_is_member() function verifies that the system is in fact a cluster member.
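A guard might look like the following sketch. It assumes that clu_is_member() is declared in <sys/clu.h> and returns a nonzero value on a cluster member; check the reference page for the exact interface.

```c
/* Assumption: clu_is_member() is declared in <sys/clu.h> on
 * TruCluster systems and returns nonzero on a cluster member. */
#include <sys/clu.h>

/* Sketch of a guard around cluster-only locking code. */
void
take_lock(void)
{
    if (clu_is_member()) {
        /* Safe to use DLM API library functions here. */
    } else {
        /* Nonclustered system: fall back to flock() or another
         * single-system synchronization mechanism. */
    }
}
```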
For more information about this function, see clu_is_member(3).
4.5 Member-Specific Resources
If multiple instances of an application are started simultaneously on more than one cluster member, some instances of the application may not work properly because they depend on resources that are available only from a specific member, such as a large number of CPU cycles or a large amount of physical memory. This dependency may restrict the application to running as a single instance in a cluster. Changing these characteristics in an application may be enough to allow it to run as multiple instances in a cluster; alternatively, if more than one member has the required resources, you can run the application on only those members.
4.6 Expanded PIDs
In TruCluster Server, process identifiers (PIDs) are expanded to a full
32-bit value.
The value of PID_MAX is increased to 2147483647 (0x7fffffff); therefore, any applications that test for PID <= PID_MAX must be recompiled.
To ensure that PIDs are unique across a cluster, PIDs for each cluster member are based on the member ID and are allocated from a range of numbers unique to that member. The first PID available on a member is given by the following formula:
PID = (memberid * (2**19)) + 2
Typically, the first two values are reserved for the kernel idle process and /sbin/init. For example, PIDs 524,288 and 524,289 are assigned to kernel idle and init, respectively, on a cluster member whose member ID is 1.
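The following minimal sketch shows the arithmetic; the member ID value is illustrative.

```c
#include <stdio.h>

/* Illustrative arithmetic only: compute the base of a member's PID
 * range from its member ID, per the formula above. */
int
main(void)
{
    long memberid = 3;                      /* hypothetical member ID */
    long base = memberid * (1L << 19);      /* memberid * 2**19 */

    printf("member %ld: PIDs available from %ld; %ld and %ld are\n",
           memberid, base + 2, base, base + 1);
    printf("typically reserved for kernel idle and init\n");
    return 0;
}
```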
You can use PIDs to uniquely identify log and temporary files. If an application does store a PID in a file, make sure that the file is member-specific.
4.7 DLM Parameters Removed
Because the distributed lock manager (DLM) persistent resources, resource groups, and transaction IDs are enabled by default in TruCluster Available Server and TruCluster Production Server Version 1.6 and TruCluster Server Version 5.0 and later, the dlm_disable_rd and dlm_disable_grptx attributes are unneeded and have been removed from the DLM kernel subsystem.
4.8 Licensing
This section discusses licensing constraints and issues.
4.8.1 TruCluster Server Clusterwide Licensing Not Supported
TruCluster Server Version 5.1B does not support clusterwide licensing.
Each time you add a member to the cluster, you must register all required application licenses on that member for applications that may run on it.
4.8.2 Layered Product Licensing and Network Adapter Failover
The Redundant Array of Independent Network Adapters (NetRAIN) and the Network Interface Failure Finder (NIFF) provide mechanisms for facilitating network failover and replace the monitored network interface method that was employed in the TruCluster Available Server and Production Server products.
NetRAIN provides transparent network adapter failover for multiple adapter configurations. NetRAIN monitors the status of its network interfaces with NIFF, which detects and reports possible network failures. You can use NIFF to generate events when network devices, including a composite NetRAIN device, fail. You can monitor these events and take appropriate actions when a failure occurs. For more information about NetRAIN and NIFF, see the Tru64 UNIX Network Administration: Connections manual.
In a cluster, an application can fail over and restart itself on another member. If the application performs a license check when restarting, the check may fail because it looks for a particular member's IP address or its network adapter's media access control (MAC) address.
Licensing schemes that use a network adapter's MAC address to uniquely identify a machine can be affected by how NetRAIN changes the MAC address. All network drivers support the SIOCRPHYSADDR ioctl, which fetches MAC addresses from the interface. This ioctl returns two addresses in an array:
Default hardware address: the permanent address that is taken from the small PROM that each LAN adapter contains.
Current physical address: the address that the network interface responds to on the wire.
For licensing schemes that are based on MAC addresses, use the default hardware address that is returned by the SIOCRPHYSADDR ioctl; do not use the current physical address, because NetRAIN modifies this address for its own use.
See the reference page for your network adapter (for example, tu(7)) for more information about the SIOCRPHYSADDR ioctl.
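The following sketch shows one way to fetch both addresses. It assumes the struct ifdevea layout (ifr_name, default_pa, and current_pa members) that Tru64 UNIX network drivers use with SIOCRPHYSADDR; the interface name tu0 is hypothetical, so verify the details against your adapter's reference page.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/ioctl.h>
#include <net/if.h>    /* struct ifdevea, SIOCRPHYSADDR (Tru64 UNIX) */

/* Sketch: fetch the default (PROM) and current MAC addresses of an
 * interface with SIOCRPHYSADDR. Assumes the Tru64 UNIX struct ifdevea,
 * whose default_pa and current_pa members hold the two addresses.
 * The interface name "tu0" is hypothetical. */
int
main(void)
{
    struct ifdevea devea;
    int i, s = socket(AF_INET, SOCK_DGRAM, 0);

    if (s < 0)
        return 1;
    memset(&devea, 0, sizeof(devea));
    strcpy(devea.ifr_name, "tu0");

    if (ioctl(s, SIOCRPHYSADDR, &devea) < 0) {
        close(s);
        return 1;
    }

    printf("default (use for licensing):");
    for (i = 0; i < 6; i++)
        printf(" %02x", devea.default_pa[i]);
    printf("\ncurrent (NetRAIN may change):");
    for (i = 0; i < 6; i++)
        printf(" %02x", devea.current_pa[i]);
    printf("\n");
    close(s);
    return 0;
}
```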
4.9 Blocking Layered Products
Check whether an application that you want to migrate is a blocking layered product. A blocking layered product is a product that prevents the installupdate command from completing during an update installation of TruCluster Server Version 5.1B. Blocking layered products must be removed from the cluster before starting a rolling upgrade that will include running the installupdate command.
Unless a layered product's documentation specifically states that you can install a newer version of the product on the first rolled member, and that the layered product knows what actions to take in a mixed-version cluster, we strongly recommend that you do not install either a new layered product or a new version of a currently installed layered product during a rolling upgrade.
The TruCluster Server Cluster Installation manual lists layered products that are known to break an update installation on TruCluster Server Version 5.1B.