11    Troubleshooting Clusters

This chapter describes how to resolve problems (Section 11.1) and offers hints for managing clusters (Section 11.2).

11.1    Resolving Problems

This section describes solutions to problems that can arise during the day-to-day operation of a cluster.

11.1.1    Booting Systems Without a License

You can boot a system that does not have a TruCluster Server license. The system joins the cluster and boots to multiuser mode, but only root can log in (with a maximum of two users). The cluster application availability (CAA) daemon, caad, is not started. The system displays a license error message reminding you to load the license. This policy enforces license checks while making it possible to boot, license, and repair a system during an emergency.

11.1.2    Shutdown Leaves Members Running

A cluster shutdown (shutdown -c) can leave one or more members running. In this situation, you must complete the cluster shutdown by manually shutting down each member that is still running.

Consider a three-member cluster in which each member has one vote and no quorum disk is configured. During cluster shutdown, quorum is lost when the second-to-last member goes down. If quorum checking were left on, the last running member would suspend all operations and the cluster shutdown would never complete.

To avoid an impasse in situations like this, quorum checking is disabled at the start of the cluster shutdown process. If a member fails to shut down during cluster shutdown, it might appear to be a normally functioning cluster member, but it is not, because quorum checking is disabled. You must manually complete the shutdown process.

The shutdown procedure depends on the state of the systems that are still running:

11.1.3    Environmental Monitoring and Cluster Shutdowns

The envconfig command, documented in envconfig(8), includes an ENVMON_SHUTDOWN_SCRIPT variable that specifies the path of a user-defined script that you want the envmond daemon to execute when a shutdown condition is encountered. If you use this script in a cluster, make sure that the cluster does not lose quorum as a result of executing the script. Specifically, your ENVMON_SHUTDOWN_SCRIPT script must determine whether shutting down a given member will result in the loss of quorum and, if so, shut down the remaining cluster members.

For example, assume that you have a five-member cluster, that all of the members are located in the same computer room, and that the air conditioning has failed. As the temperature climbs past the maximum allowed system temperature, the members individually discover that it is too hot and invoke your ENVMON_SHUTDOWN_SCRIPT script. When the third member leaves the cluster, the cluster loses quorum (quorum votes = round_down((cluster_expected_votes+2)/2), which is 3 for this five-member cluster) and the two remaining members suspend all operations. They do not shut down; they continue to run in the overheated room.

If the loss of a cluster member will cause the loss of quorum, shut down the remaining cluster members.

Although your ENVMON_SHUTDOWN_SCRIPT script can shut down the cluster with shutdown -c in all cases, we do not recommend this method.

The following sample script determines whether shutting down the current member will result in the loss of quorum.

#!/usr/bin/ksh -p
#
typeset currentVotes=0
typeset quorumVotes=0
typeset nodeVotes=0

clu_get_info -q
is_cluster=$?

if [ "$is_cluster" = 0 ]
then
    # The following code checks whether it is safe to shut down
    # another member. It is considered safe if the cluster would
    # not lose quorum if a member shuts down. If it's not safe,
    # shut down the entire cluster.

    currentVotes=$(sysconfig -q cnx current_votes | \
        sed -n 's/current_votes.* //p')

    quorumVotes=$(sysconfig -q cnx quorum_votes | \
        sed -n 's/quorum_votes.* //p')

    nodeVotes=$(sysconfig -q cnx node_votes | \
        sed -n 's/node_votes.* //p')

    # Determine if this node is a voting member
    if [ "$nodeVotes" -gt 0 ]
    then
        # It's a voting member, see if we'll lose quorum.
        if [[ $((currentVotes-1)) -ge ${quorumVotes} ]]
        then
            echo "shutdown -h now..."
        else
            echo "shutdown -c now..."
        fi
    else
        echo "This member has no vote...shutdown -h now..."
    fi
else
    # not in a cluster...nothing to do
    exit 0
fi
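To have envmond invoke such a script when a shutdown condition occurs, set the ENVMON_SHUTDOWN_SCRIPT variable with the envconfig command. The following line is only a sketch: the -c option and the script path /usr/local/sbin/env_shutdown are assumptions, so check envconfig(8) for the exact syntax before using it.

# envconfig -c ENVMON_SHUTDOWN_SCRIPT="/usr/local/sbin/env_shutdown"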
 

11.1.4    Dealing with CFS Errors at Boot

During system boot when the clusterwide root (/) is mounted for the first time, CFS can generate the following warning message:

"WARNING:cfs_read_advfs_quorum_data: cnx_disk_read failed with error-number
 

Usually error-number is the EIO value.

This message is accompanied by the following message:

"WARNING: Magic number on ADVFS portion of CNX partition on quorum disk \
is not valid"
 

These messages indicate that the booting member is having problems accessing data on the CNX partition of the quorum disk, which contains the device information for the cluster_root domain. This can occur if the booting member does not have access to the quorum disk, either because the cluster is deliberately configured this way or because of a path failure. In the former case, the messages can be considered informational. In the latter case, you need to address the cause of the path failure.

The messages can mean that there are problems with the quorum disk itself. If hardware errors are also being reported for the quorum disk, then replace it. For information on replacing a quorum disk, see Section 4.5.1.

For a description of error numbers, see errno(5). For a description of EIO, see errno(2).

11.1.5    Backing Up and Repairing a Member's Boot Disk

A member's boot disk contains three partitions. Table 11-1 presents some details about these partitions.

Note

Although you are not prohibited from adding filesets to a member's boot partition, we do not recommend it. If a member leaves the cluster, all filesets mounted from that member's boot partition are force-unmounted and cannot be relocated.

Table 11-1:  Member Boot Disk Partitions

Partition   Content
a           Advanced File System (AdvFS) boot partition, member root file system
b           Swap partition (all space between the a and h partitions)
h           CNX binary partition
AdvFS and Logical Storage Manager (LSM) store information critical to their functioning on the h partition. This information includes whether the disk is a member or quorum disk, and the name of the device where the cluster root file system is located.

If a member's boot disk is damaged or becomes unavailable, you need the h partition information to restore the member to the cluster. The clu_bdmgr command enables you to configure a member boot disk, and to save and restore data on a member boot disk.

The clu_bdmgr command can do the following tasks:

For specifics on the command, see clu_bdmgr(8).

Whenever a member boots, clu_bdmgr automatically saves a copy of the h partition of that member's boot disk. The data is saved in /cluster/members/memberID/boot_partition/etc/clu_bdmgr.conf.

As a rule, the h partitions on all member boot disks contain the same data. There are two exceptions to this rule:

If a member's boot disk is damaged, you can use clu_bdmgr to repair or replace it. Even if the cluster is not up, as long as you can boot the clusterized kernel on at least one cluster member, you can use the clu_bdmgr command.

For a description of how to add a new disk to the cluster, see Section 9.2.3.

To repair a member's boot disk, you must first have backed up the boot partition. One method is to allocate disk space in the shared /var file system for a dump image of each member's boot partition.

To save a dump image for member3's boot partition in the member-specific file /var/cluster/members/member3/boot_part_vdump, enter the following command:

# vdump -0Df /var/cluster/members/member3/boot_part_vdump \
/cluster/members/member3/boot_partition

11.1.5.1    Example of Recovering a Member's Boot Disk

The following sequence of steps shows how to use the file saved by vdump to replace a boot disk. The sequence makes the following assumptions:

Note

A member's boot disk should always be on a bus shared by all cluster members. This arrangement permits you to make repairs to any member's boot disk as long as you can boot at least one cluster member.

  1. Use clu_get_info to determine whether member3 is down:

    # clu_get_info -m 3
     
    Cluster memberid = 3
    Hostname = member3.zk3.dec.com
    Cluster interconnect IP name = member3-mc0
    Member state = DOWN
    

  2. Select a new disk (in this example, dsk5) as the replacement boot disk for member3. Because member3's original boot disk was dsk3, clu_bdmgr instructs you to edit member3's /etc/sysconfigtab so that dsk5 is used as the new boot disk for member3.

    To configure dsk5 as the boot disk for member3, enter the following command:

    # /usr/sbin/clu_bdmgr  -c  dsk5 3
     
    The new member's disk, dsk5, is not the same name as the original disk
    configured for domain root3_domain.  If you continue the following
    changes will be required in member3's /etc/sysconfigtab file:
            vm:
            swapdevice=/dev/disk/dsk5b
            clubase:
            cluster_seqdisk_major=19
            cluster_seqdisk_minor=175
     
    

  3. Mount member3's root domain (now on dsk5) so you can edit member3's /etc/sysconfigtab and restore the boot partitions:

    # mount root3_domain#root /mnt
     
    

  4. Restore the boot partition:

    # vrestore -xf /var/cluster/members/member3/boot_part_vdump -D /mnt
    

  5. Edit member3's /etc/sysconfigtab

    # cd /mnt/etc
    # cp sysconfigtab sysconfigtab-bu
     
    

    As indicated in the output from the clu_bdmgr command, change the values of the swapdevice attribute in the vm stanza and the cluster_seqdisk_major and cluster_seqdisk_minor attributes in the clubase stanza:

            vm:
            swapdevice=/dev/disk/dsk5b
            clubase:
            cluster_seqdisk_major=19
            cluster_seqdisk_minor=175
     
    

  6. Restore the h partition CNX information:

    # /usr/sbin/clu_bdmgr -h  dsk5
    

    The h partition information is copied from the cluster member where you run the clu_bdmgr command to the h partition on dsk5.

    If the entire cluster is down, you need to boot one of the members from the clusterized kernel. After you have a single-member cluster running, you can restore the CNX h partition information to member3's new boot disk, dsk5, from /mnt/etc/clu_bdmgr.conf. Enter the following command:

    # /usr/sbin/clu_bdmgr -h  dsk5 /mnt/etc/clu_bdmgr.conf
    

  7. Unmount the root domain for member3:

    # umount root3_domain#root
    

  8. Boot member3 into the cluster.

  9. Optionally, use the consvar -s bootdef_dev disk_name command on member3 to set the bootdef_dev variable to the new disk.

11.1.6    Specifying cluster_root at Boot Time

At boot time you can specify the device that the cluster uses for mounting cluster_root, the cluster root file system. Use this feature only for disaster recovery, when you need to boot with a new cluster root.

The cluster file system (CFS) kernel subsystem supports six attributes for designating the major and minor numbers of up to three cluster_root devices: cluster_root_dev1_maj, cluster_root_dev1_min, cluster_root_dev2_maj, cluster_root_dev2_min, cluster_root_dev3_maj, and cluster_root_dev3_min. Because the cluster_root domain that is being used for disaster recovery may consist of multiple volumes, you can specify one, two, or three cluster_root devices.

To use these attributes, shut down the cluster and boot one member interactively, specifying the appropriate cluster_root_dev major and minor numbers. When the member boots, the CNX partition (h partition) of the member's boot disk is updated with the location of the cluster_root devices. If the cluster has a quorum disk, its CNX partition is also updated. As the other members boot into the cluster, their member boot disk information is also updated.

For example, assume that you want to use a cluster_root that is a two-volume file system that comprises dsk6b and dsk8g. Assume that the major/minor numbers of dsk6b are 19/227, and the major/minor numbers of dsk8g are 19/221. You boot the cluster as follows:

  1. Boot one member interactively:

    >>> boot -fl "ia"
     (boot dkb200.2.0.7.0 -flags ia)
     block 0 of dkb200.2.0.7.0 is a valid boot block
     reading 18 blocks from dkb200.2.0.7.0
     bootstrap code read in
     base = 200000, image_start = 0, image_bytes = 2400
     initializing HWRPB at 2000
     initializing page table at fff0000
     initializing machine state
     setting affinity to the primary CPU
     jumping to bootstrap code
     
     
    .
    .
    .
    Enter kernel_name [option_1 ... option_n]
    Press Return to boot default kernel 'vmunix': vmunix \
    cfs:cluster_root_dev1_maj=19 cfs:cluster_root_dev1_min=227 \
    cfs:cluster_root_dev2_maj=19 cfs:cluster_root_dev2_min=221 [Return]

  2. Boot the other cluster members.

For information about using these attributes to recover the cluster root file system, see Section 11.1.7 and Section 11.1.8.

11.1.7    Recovering the Cluster Root File System to a Disk Known to the Cluster

Use the procedure described in this section when all of the following are true:

This procedure is based on the following assumptions:

To restore the cluster root, do the following:

  1. Boot the system with the base Tru64 UNIX disk.

    For the purposes of this procedure, we assume this system to be member 1.

  2. If this system's name for the device that will be the new cluster root differs from the name that the cluster had for that device, use the dsfmgr -m command to change the device name so that it matches the cluster's name for the device.

    For example, if the cluster's name for the device that will be the new cluster root is dsk6b and the system's name for it is dsk4b, rename the device with the following command:

    # dsfmgr -m dsk4 dsk6
    

  3. If necessary, partition the disk so that the partition sizes and file system types will be appropriate after the disk is the cluster root.

  4. Create a new domain for the new cluster root:

    # mkfdmn /dev/disk/dsk6d cluster_root
    

  5. Make a root fileset in the domain:

    # mkfset cluster_root root
    

  6. This restoration procedure allows for cluster_root to have up to three volumes. After restoration is complete, you can add additional volumes to the cluster root. For this example, we add only one volume, dsk6b:

    # addvol /dev/disk/dsk6b cluster_root
     
    

  7. Mount the domain that will become the new cluster root:

    # mount cluster_root#root /mnt
    

  8. Restore cluster root from the backup media. (If you used a backup tool other than vdump, use the appropriate restore tool in place of vrestore.)

    # vrestore -xf /dev/tape/tape0 -D /mnt
    

  9. Change /etc/fdmns/cluster_root in the newly restored file system so that it references the new device:

    # cd /mnt/etc/fdmns/cluster_root
    # rm *
    # ln -s /dev/disk/dsk6b
     
    

  10. Use the file command to get the major/minor numbers of the new cluster_root device. (See Section 11.1.6 for additional information on the use of the cluster_root device major/minor numbers.) Make note of these major/minor numbers.

    For example:

    # file /dev/disk/dsk6b
    /dev/disk/dsk6b:        block special (19/221)
     
    

  11. Shut down the system and reboot interactively, specifying the device major and minor numbers of the new cluster root. Section 11.1.6 describes how to specify the cluster root at boot time.

    Note

    You will probably need to adjust expected votes to boot the member, as described in Section 4.10.2.

    >>> boot -fl "ia"
     (boot dkb200.2.0.7.0 -flags ia)
     block 0 of dkb200.2.0.7.0 is a valid boot block
     reading 18 blocks from dkb200.2.0.7.0
     bootstrap code read in
     base = 200000, image_start = 0, image_bytes = 2400
     initializing HWRPB at 2000
     initializing page table at fff0000
     initializing machine state
     setting affinity to the primary CPU
     jumping to bootstrap code
     
     
    .
    .
    .
    Enter kernel_name [option_1 ... option_n]
    Press Return to boot default kernel 'vmunix': vmunix \
    cfs:cluster_root_dev1_maj=19 cfs:cluster_root_dev1_min=221 [Return]

    When the member boots, the CNX partition (h partition) of the member's boot disk is updated with the location of the cluster_root devices. If the cluster has a quorum disk, its CNX partition is also updated. As the other members boot into the cluster, their member boot disk information is also updated.

  12. Boot the other cluster members.

11.1.8    Recovering the Cluster Root File System to a New Disk

The process of recovering cluster_root to a disk that was previously unknown to the cluster is complicated. Before you attempt it, try to find a disk that was already installed on the cluster to serve as the new cluster root disk, and follow the procedure in Section 11.1.7.

Use the recovery procedure described here when:

This procedure is based on the following assumptions:

To restore the cluster root, do the following:

  1. Boot the system with the Tru64 UNIX disk.

    For the purposes of this procedure, we assume this system to be member 1.

  2. If necessary, partition the new disk so that the partition sizes and file system types will be appropriate after the disk is the cluster root.

  3. Create a new domain for the new cluster root:

    # mkfdmn /dev/disk/dsk5b new_root
    

    As described in the TruCluster Server Cluster Installation manual, the cluster_root file system is often put on a b partition. In this case, /dev/disk/dsk5b is used for example purposes.

  4. Make a root fileset in the domain:

    # mkfset new_root root
    

  5. This restoration procedure allows for new_root to have up to three volumes. After restoration is complete, you can add additional volumes to the cluster root. For this example, we add one volume, dsk8e:

    # addvol /dev/disk/dsk8e new_root
    

  6. Mount the domain that will become the new cluster root:

    # mount new_root#root /mnt
    

  7. Restore cluster root from the backup media. (If you used a backup tool other than vdump, use the appropriate restore tool in place of vrestore.)

    # vrestore -xf /dev/tape/tape0 -D /mnt
    

  8. Copy the restored cluster databases to the /etc directory of the Tru64 UNIX system:

    # cd /mnt/etc
    # cp dec_unid_db dec_hwc_cdb dfsc.dat /etc
    

  9. Copy the restored databases from the member-specific area of the current member to the /etc directory of the Tru64 UNIX system:

    # cd /mnt/cluster/members/member1/etc
    # cp dfsl.dat /etc
    

  10. If one does not already exist, create a domain for the member boot disk:

    # cd /etc/fdmns
    # ls
    # mkdir root1_domain
    # cd root1_domain
    # ln -s /dev/disk/dsk2a
    

  11. Mount the member boot partition:

    # cd /
    # umount /mnt
    # mount root1_domain#root /mnt
    

  12. Copy the databases from the member boot partition to the /etc directory of the Tru64 UNIX system:

    # cd /mnt/etc
    # cp dec_devsw_db dec_hw_db dec_hwc_ldb dec_scsi_db /etc
    

  13. Unmount the member boot disk:

    # cd /
    # umount /mnt
    

  14. Update the database .bak backup files:

    # cd /etc
    # for f in dec_*db ; do cp $f $f.bak ; done
    

  15. Reboot the system into single-user mode using the same Tru64 UNIX disk so that it will use the databases that you copied to /etc.

    If the backup of the cluster root file system did not reflect the current disk storage environment, the system panics at this point. If this happens, it is not possible to recover the cluster root file system. You must run clu_create to re-create the cluster.

  16. After booting to single-user mode, scan the devices on the bus:

    # hwmgr -scan scsi
    

  17. Remount the root as writable:

    # /sbin/mountroot
    

  18. Verify and update the device database:

    # dsfmgr -v -F
    

  19. Use hwmgr to learn the current device naming:

    # hwmgr -view devices
    

  20. If necessary, update the local domains to reflect the device naming (especially usr_domain, new_root, and root1_domain).

    Do this by going to the appropriate /etc/fdmns directory, deleting the existing link, and creating new links to the current device names. (You learned the current device names in the previous step.) For example:

    # cd /etc/fdmns/root_domain
    # rm *
    # ln -s /dev/disk/dsk1a
    # cd /etc/fdmns/usr_domain
    # rm *
    # ln -s /dev/disk/dsk1g
    # cd /etc/fdmns/root1_domain
    # rm *
    # ln -s /dev/disk/dsk2a
    # cd /etc/fdmns/new_root
    # rm *
    # ln -s /dev/disk/dsk5b
    # ln -s /dev/disk/dsk8e
     
    

  21. Run the bcheckrc command to mount local file systems, particularly /usr:

    #  bcheckrc
    

  22. Copy the updated cluster database files onto the cluster root:

    # mount new_root#root /mnt
    # cd /etc
    # cp dec_unid_db* dec_hwc_cdb* dfsc.dat /mnt/etc
    # cp dfsl.dat /mnt/cluster/members/member1/etc
     
    

  23. Update the cluster_root domain on the new cluster root:

    # rm /mnt/etc/fdmns/cluster_root/*
    # cd /etc/fdmns/new_root
    # tar cf - * | (cd /mnt/etc/fdmns/cluster_root && tar xf -)
     
    

  24. Copy the updated cluster database files to the member boot disk:

    # umount /mnt
    # mount root1_domain#root /mnt
    # cd /etc
    # cp dec_devsw_db* dec_hw_db* dec_hwc_ldb* dec_scsi_db* /mnt/etc
     
    

  25. Use the file command to get the major/minor numbers of the cluster_root devices. (See Section 11.1.6 for additional information on the use of the cluster_root device major/minor numbers.) Write down these major/minor numbers for use in the next step.

    For example:

    # file /dev/disk/dsk5b
    /dev/disk/dsk5b:        block special (19/227)
    # file /dev/disk/dsk8e
    /dev/disk/dsk8e:        block special (19/221)
    

  26. Halt the system and reboot interactively, specifying the device major and minor numbers of the new cluster root. Section 11.1.6 describes how to specify the cluster root at boot time.

    Note

    You will probably need to adjust expected votes to boot the member, as described in Section 4.10.2.

    >>> boot -fl "ia"
     (boot dkb200.2.0.7.0 -flags ia)
     block 0 of dkb200.2.0.7.0 is a valid boot block
     reading 18 blocks from dkb200.2.0.7.0
     bootstrap code read in
     base = 200000, image_start = 0, image_bytes = 2400
     initializing HWRPB at 2000
     initializing page table at fff0000
     initializing machine state
     setting affinity to the primary CPU
     jumping to bootstrap code
     
     
    .
    .
    .
    Enter kernel_name [option_1 ... option_n]
    Press Return to boot default kernel 'vmunix': vmunix \
    cfs:cluster_root_dev1_maj=19 cfs:cluster_root_dev1_min=227 \
    cfs:cluster_root_dev2_maj=19 cfs:cluster_root_dev2_min=221 [Return]

  27. Boot the other cluster members.

    If during boot you encounter errors with device files, run the command dsfmgr -v -F.

11.1.9    Dealing with AdvFS Problems

This section describes some problems that can arise when you use AdvFS.

11.1.9.1    Responding to Warning Messages from addvol or rmvol

Under some circumstances, using addvol or rmvol on the cluster_root domain can cause the following warning message:

"WARNING:cfs_write_advfs_root_data: cnx_disk_write failed for quorum disk with error-number."
 

Usually error-number is the EIO value.

This message indicates that the member where the addvol or rmvol executed cannot write to the CNX partition of the quorum disk. The CNX partition contains device information for the cluster_root domain.

The warning can occur if the member does not have access to the quorum disk, either because the cluster is deliberately configured this way or because of a path failure. In the former case, the message can be considered informational. In the latter case, you need to address the cause of the path failure.

The message can mean that there are problems with the quorum disk itself. If hardware errors are also being reported for the quorum disk, then replace the disk. For information on replacing a quorum disk, see Section 4.5.1.

For a description of error numbers, see errno(5). For a description of EIO, see errno(2).

11.1.9.2    Resolving AdvFS Domain Panics Due to Loss of Device Connectivity

AdvFS can domain panic if one or more storage elements containing a domain or fileset become unavailable. The most likely cause is that a cluster member with private storage used in an AdvFS domain leaves the cluster. Another possible cause is a hardware problem that makes a storage device unavailable. In either case, because no cluster member has a path to the storage, the storage is unavailable and the domain panics.

Your first indication of a domain panic is likely to be I/O errors from the device, or panic messages written to the system console. Because the domain might be served by a cluster member that is still up, CFS commands such as cfsmgr -e might return a status of OK and not immediately reflect the problem condition.

# ls -l /mnt/mytst
/mnt/mytst: I/O error
 
# cfsmgr -e
Domain or filesystem name = mytest_dmn#mytst
Mounted On = /mnt/mytst
Server Name = deli
Server Status : OK
 

If you are able to restore connectivity to the device and return it to service, use the cfsmgr command to relocate the affected filesets in the domain to the same member that served them before the panic (or to another member) and then continue using the domain.

# cfsmgr -a SERVER=provolone -d mytest_dmn
 
# cfsmgr -e
Domain or filesystem name = mytest_dmn#mytst
Mounted On = /mnt/mytst
Server Name = provolone
Server Status : OK
 

11.1.9.3    Forcibly Unmounting an AdvFS File System or Domain

If you cannot restore connectivity to the device and return it to service, you can forcibly unmount the affected file systems. For this purpose, TruCluster Server Version 5.1B provides the cfsmgr -u and cfsmgr -U commands.

You can use cfsmgr -u to forcibly unmount an AdvFS file system or domain that is not being served by any cluster member. The unmount is not performed if the file system or domain is being served.

You can use the cfsmgr -U command to forcibly unmount an entire AdvFS domain that is currently being served by a cluster member.

The version of the command that you invoke depends on how the cluster file system (CFS) currently views the domain:

For detailed information on the cfsmgr command, see cfsmgr(8).
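For example, assuming the mount point and domain names from Section 11.1.9.2, the forced unmounts might look like the following. This is only a sketch: whether cfsmgr -u takes the mount point, and whether -U is combined with the -d domain option, are assumptions, so verify the exact invocation in cfsmgr(8).

# cfsmgr -u /mnt/mytst
# cfsmgr -U -d mytest_dmn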

11.1.9.4    Avoiding Domain Panics

The AdvFS graphical user interface (GUI) agent, advfsd, periodically scans the system disks. If a metadata write error occurs, or if corruption is detected in a single AdvFS file domain, the advfsd daemon initiates a domain panic (rather than a system panic) on the file domain. This isolates the failed domain and allows a system to continue to serve all other domains.

From the viewpoint of the advfsd daemon running on a member of a cluster, any disk that contains an AdvFS domain and becomes inaccessible can trigger a domain panic. In normal circumstances, this is expected behavior. To diagnose such a panic, follow the instructions in the chapter on troubleshooting in the Tru64 UNIX AdvFS Administration manual. However, if a cluster member receives a domain panic because another member's private disk becomes unavailable (for instance, when that member goes down), the domain panic is an unnecessary distraction.

To avoid this type of domain panic, edit each member's /usr/var/advfs/daemon/disks.ignore file so that it lists the names of disks on other members' private storage that contain AdvFS domains. This will stop the advfsd daemon on the local member from scanning these devices.
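For example, if another member has a private disk dsk10 (a hypothetical device name) that contains an AdvFS domain, the local member's disks.ignore file would list it. The format assumed here is one device name per line:

# cat /usr/var/advfs/daemon/disks.ignore
dsk10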

To identify private devices, use the sms command to invoke the graphical interface for the SysMan Station, and then select Hardware from the Views menu.

11.1.10    Accessing Boot Partitions on Down Systems

When a member leaves the cluster, either cleanly through a shutdown or in an unplanned fashion, such as a panic, that member's boot partition is unmounted. If the boot partition is on the shared bus, any other member can gain access to the boot partition by mounting it.

Suppose that the system provolone (member 2, whose boot partition is in the domain root2_domain) is down and you want to edit provolone's /etc/sysconfigtab. You can enter the following commands and then edit /mnt/etc/sysconfigtab:

# mkdir /mnt
# mount root2_domain#root /mnt
 

Before rebooting provolone, you must unmount root2_domain#root. For example:

# umount root2_domain#root
 

11.1.11    Booting a Member While Its Boot Disk Is Already Mounted

Whenever the number of expected quorum votes or the quorum disk device is changed, the /etc/sysconfigtab file for each member is updated. In the case where a cluster member is down, the cluster utilities that affect quorum (clu_add_member, clu_quorum, clu_delete_member, and so forth) mount the down member's boot disk and make the update. If the down member tries to boot while its boot disk is mounted, it receives the following panic:

cfs_mountroot: CFS server already exists for this nodes boot partition

After they complete the update, the cluster utilities unmount the down member's boot disk.

In general, attempting to boot a member while another member has the first member's boot disk mounted causes the panic. For example, if you mount a down member's boot disk in order to make repairs, you generate the panic if you forget to unmount the boot disk before booting the repaired member.
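Before booting a repaired member, you can check whether another member still has its boot domain mounted. A quick check on the member that mounted the boot disk, assuming the repaired member is member 2 and its boot domain is named root2_domain:

# mount | grep root2_domain
# umount root2_domain#root

Run the umount command only if the mount command shows that the domain is still mounted.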

11.1.12    Generating Crash Dumps

If a serious cluster problem occurs, crash dumps might be needed from all cluster members. To get crash dumps from functioning members, use the dumpsys command, which saves a snapshot of the system memory to a dump file.

To generate the crash dumps, log in to each running cluster member and run dumpsys. By default, dumpsys writes the dump to the member-specific directory /var/adm/crash.
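For example, to capture a dump on a running member using the defaults described above, run dumpsys with no options:

# dumpsys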

For more information, see dumpsys(8).

You can also use the Configure Dump and Create Dump Snapshot features of SysMan Menu to configure crash dumps. The Configure Dump feature configures the generic system configuration variables associated with the savecore command. The Create Dump Snapshot feature configures the dumpsys command, which manually dumps a snapshot of memory to a file when you cannot halt the system to generate a normal crash dump.

11.1.12.1    Generating a Crash Dump When a Member Is Hung

If a cluster member is hung, you cannot use the dumpsys command on that member. In this case, follow these steps to generate the crash dumps:

  1. Use the dumpsys command on each live member to copy a snapshot of memory to a dump file. By default, dumpsys writes the dump to /var/adm/crash, which is a CDSL to /cluster/members/{memb}/adm/crash.

  2. Use the clu_quorum command to make sure that the cluster will not lose quorum when you halt the hung member, as described in Section 5.5. (A brief example follows these steps.)

  3. Crash the hung member. To do this, manually halt the member and run crash at the console prompt.

  4. Boot the member. At boot time, savecore (see savecore(8)) runs and captures the dump in /var/adm/crash.
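As a brief example of steps 1 and 2, you might enter the following commands on each live member. Running clu_quorum without options to display the current quorum configuration is an assumption here; see clu_quorum(8) and Section 5.5 for the details.

# dumpsys
# clu_quorum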

11.1.13    Fixing Network Problems

This section describes potential networking problems in a cluster and how to resolve them.

Symptoms

Things to Verify

After you have made sure that the entries in /etc/rc.config and /etc/hosts are correct and have fixed any other problems, try stopping and then restarting the gateway and inet daemons. Do this by entering the following commands on each cluster member:

# /sbin/init.d/gateway stop
# /sbin/init.d/gateway start
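If the inet daemons also need to be restarted, a similar stop-and-start sequence applies. The /sbin/init.d/inet script name is an assumption here; verify the script name on your system before using it:

# /sbin/init.d/inet stop
# /sbin/init.d/inet start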
 

11.1.14    Running routed in a Cluster

Releases prior to TruCluster Server Version 5.1B required that you run the gated routing daemon in a cluster. You could not use routed, ogated, or set up static routing and use no routing daemon at all. Starting with Version 5.1B you can use gated, routed, or static routing. You cannot use ogated. The default configuration uses gated. Section 3.14 describes routing options.
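If you decide to switch a member from gated to routed, one approach is to change that member's routing variables in rc.config and then reboot the member. This is only a hedged sketch: the rcmgr invocation and the GATED and ROUTED variable names are assumptions, and Section 3.14 describes the supported procedure:

# rcmgr set GATED no
# rcmgr set ROUTED yes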

11.2    Hints for Managing Clusters

This section contains hints and suggestions for configuring and managing clusters.

11.2.1    Moving /tmp

By default, member-specific /tmp areas are in the same file system, but they can be moved to separate file systems. In some cases, you may want to move each member's /tmp area to a disk local to the member in order to reduce traffic on the shared SCSI bus.

If you want a cluster member to have its own /tmp directory on a private bus, you can create an AdvFS domain on a disk on the bus local to that cluster member and add an entry in /etc/fstab for that domain with a mountpoint of /tmp.
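The domain itself might be created as follows for member tcr58. The device dsk10 and the use of its c partition are hypothetical; substitute a disk on that member's private bus:

# mkfdmn /dev/disk/dsk10c tcr58_tmp
# mkfset tcr58_tmp tmp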

For example, the following /etc/fstab entries are for the /tmp directories for two cluster members, tcr58 and tcr59, with member IDs of 58 and 59, respectively.

    tcr58_tmp#tmp   /cluster/members/member58/tmp   advfs rw 0 0
    tcr59_tmp#tmp   /cluster/members/member59/tmp   advfs rw 0 0

The tcr58_tmp domain is on a bus that only member tcr58 has connectivity to. The tcr59_tmp domain is on a disk that only member tcr59 has connectivity to.

When each member boots, it attempts to mount all file systems in /etc/fstab but it can mount only those domains that are not already mounted and for which a path to the device exists. In this example, only tcr58 can mount tcr58_tmp#tmp and only tcr59 can mount tcr59_tmp#tmp.

You could have put the following in /etc/fstab:

    tcr58_tmp#tmp   /tmp    advfs rw 0 0
    tcr59_tmp#tmp   /tmp    advfs rw 0 0

Because /tmp is a context-dependent symbolic link (CDSL), it resolves to /cluster/members/memberN/tmp, where N is the member ID. However, putting the full pathname in /etc/fstab is clearer and less likely to cause confusion.

11.2.2    Running the MC_CABLE Console Command

All members must be shut down to the console prompt before you run the MC_CABLE Memory Channel diagnostic command on any member. This is expected behavior.

Running the MC_CABLE command from the console of a down cluster member when other members are up crashes the cluster.

11.2.3    Korn Shell Does Not Record True Path to Member-Specific Directories

The Korn shell (ksh) remembers the path that you used to get to a directory and returns that pathname when you enter a pwd command. This is true even if you are in some other location because of a symbolic link somewhere in the path. Because TruCluster Server uses CDSLs to maintain member-specific directories in a clusterwide namespace, the Korn shell does not return the true path when the working directory is a CDSL.

If you depend on the shell interpreting symbolic links when returning a pathname, use a shell other than the Korn shell. For example:

# ksh
# ls -l /var/adm/syslog
lrwxrwxrwx   1 root system  36 Nov 11 16:17 /var/adm/syslog -> ../cluster/members/{memb}/adm/syslog
# cd /var/adm/syslog
# pwd
/var/adm/syslog
# sh
# pwd
/var/cluster/members/member1/adm/syslog