6    Reinstalling Cluster Members

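This chapter explains how to perform the following tasks:

• Re-create a single-member cluster (Section 6.1)

• Reinstall individual cluster members (Section 6.2)

• Use installation configuration files (Section 6.3)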

6.1    Re-creating a Single-Member Cluster

This procedure deletes the members of the current cluster, boots the base operating system, and re-runs the clu_create command to create a new cluster, which may or may not have the same configuration as the current cluster. (The procedure in Section 6.3 uses configuration files to create a cluster with the same configuration as the current cluster.)

To re-create a single-member cluster, follow these steps:

  1. Determine which disk is the Tru64 UNIX boot disk:

    1. Look in the /etc/fdmns directory to find the special file name for the Tru64 UNIX boot disk (usually in the root_domain directory). For example:

      ls /etc/fdmns/root_domain
      /dev/disk/dsk0a
       
      

    2. If the disk is on a private bus, use SysMan Station to determine which system has the disk.

    3. Flash the light on the disk. For example:

      hwmgr -flash light -dsf /dev/disk/dsk0
       
      

      Section 3.7.2 provides information you can use to map a disk's dsk device special file name to its physical location.

  2. On a cluster member that has direct access to the Tru64 UNIX disk, delete all other cluster members. (See the Cluster Administration manual for information on deleting cluster members. If you want to save the configuration files for these members before you delete them, see Section 6.3.) This system is now the only member of the cluster.

    Caution

    We strongly recommend deleting all cluster members before re-creating a cluster. Otherwise, there is a chance that someone might attempt to boot an old member boot disk into the new cluster.

  3. Determine, and remember, the device special file name for this member's cluster boot disk. For example, if this system is member 1, enter:

    ls /etc/fdmns/root1_domain
    /dev/disk/dsk10a
     
    

  4. Halt this system:

    shutdown -h now
     
    

  5. From the system console, boot the Tru64 UNIX operating system to multi-user mode:

    >>> boot UNIX_disk
     
    

  6. Because this system was not deleted from the cluster, its cluster member boot disk is still capable of being booted by any system that has access to it. To prevent a system from inadvertently booting this disk, we recommend that you zero (clear) the disk label. For example, to zero the disk label for dsk10, enter:

    disklabel -z dsk10
     
    

  7. Run the clu_create command to create a single-member cluster:

    /usr/sbin/clu_create
     
    

    See Chapter 4 for information on running the clu_create command.
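Steps 3 and 6 above move from a partition special file name (/dev/disk/dsk10a, as listed in /etc/fdmns) to the disk name passed to disklabel -z (dsk10). The following is a minimal sketch of that conversion, not part of the documented procedure. The function name disk_of_partition is hypothetical, and the sketch assumes that partition letters run from a through h.

```shell
#!/bin/sh
# disk_of_partition PATH
# Print the disk name for a dsk special file name, for example
# /dev/disk/dsk10a -> dsk10. Assumption: partitions are lettered a-h.
disk_of_partition() {
    base=${1##*/}                      # strip any leading directory
    case $base in
        dsk*[0-9][a-h]) printf '%s\n' "${base%?}" ;;  # drop partition letter
        dsk*[0-9])      printf '%s\n' "$base" ;;      # already a disk name
        *)              return 1 ;;                   # not a dsk device name
    esac
}

disk_of_partition /dev/disk/dsk10a    # prints dsk10
```

Using a helper like this avoids retyping the name by hand before you zero a disk label, which is where a one-character mistake is most costly.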

6.2    Reinstalling Individual Cluster Members

To reinstall a member, halt that member and perform the following procedure on another member:

  1. To remove the member from the cluster, run the clu_delete_member command. See the Cluster Administration manual for information on removing a member from a cluster.

  2. To add the member back into the cluster, run the clu_add_member command. See Chapter 5 for information on running the clu_add_member command. See clu_add_member(8) and Section 6.3 for information on using configuration files.

6.3    Using Installation Configuration Files

When the original TruCluster Server cluster was created, clu_create and clu_add_member wrote configuration files to the /cluster/admin directory. The files are named .membern.cfg, where n is the member ID of the cluster member. Each time these commands are run successfully, they append the current configuration information to the respective member configuration files. To learn about configuration files and the restrictions on their use, see clu_create(8) and clu_add_member(8).

The following example shows a configuration file created by clu_create for a cluster that has installed the Worldwide Language Support (WLS) subsets and uses LAN hardware for its cluster interconnect. Comment lines are wrapped for readability.

# clu_create saved configuration values:
# date: Tue May 15 15:47:14 EDT 2001 hostname \
# pepicelli.zk3.dec.com
# Previously saved value in this file have been \
# converted to comment lines
clu_alias_ip=16.140.112.209
clu_boot_dev=dsk10
clu_i18n_dev=dsk14
clu_ics_dev=ics0
clu_ics_host=pepicelli-ics0
clu_ics_ip=10.0.0.1
clu_mem_votes=1
clu_memid=1
clu_name=deli.zk3.dec.com
clu_nr_dev=nr0
clu_phys_devs=ee0,ee1
clu_quorum_dev=dsk7
clu_quorum_votes=1
clu_root_dev=dsk1b
clu_tcp_host=member1-icstcp0
clu_tcp_ip=10.1.0.1
clu_usr_dev=dsk2c
clu_var_dev=dsk3c
 

These installation configuration files contain variable=value pairs that provide a way to automate the tasks performed by clu_create and clu_add_member. When you run one of these commands and specify the -c option and the name of a configuration file, the command uses the configuration file as input (replacing the need to manually enter information).
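Because the files are plain variable=value text, you can inspect a saved value without editing the file. The following is a minimal sketch, assuming that uncommented lines have the form name=value and that superseded values have been converted to comment lines as the sample file states. The function name cfg_get is hypothetical, and the temporary file stands in for a real /cluster/admin/.membern.cfg file.

```shell
#!/bin/sh
# cfg_get FILE NAME
# Print the value of NAME from a variable=value file. Comment lines
# (starting with '#') are skipped. If NAME appears more than once, the
# last uncommented value is printed. Returns nonzero if NAME is absent.
cfg_get() {
    awk -F= -v name="$2" '
        /^#/ { next }
        $1 == name { value = substr($0, length(name) + 2); found = 1 }
        END { if (found) print value; else exit 1 }
    ' "$1"
}

# Demonstration on a throwaway copy of a few lines from the sample file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# clu_create saved configuration values:
clu_boot_dev=dsk10
clu_memid=1
clu_name=deli.zk3.dec.com
EOF

cfg_get "$sample" clu_boot_dev    # prints dsk10
cfg_get "$sample" clu_name        # prints deli.zk3.dec.com
rm -f "$sample"
```

The sketch only reads the file, which is consistent with the advice not to edit configuration files manually.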

If clu_create or clu_add_member cannot find a required name/value pair when reading a configuration file, the command prompts for the required information and then returns to reading the configuration file. For example, you can delete the first member of the cluster, and then run clu_add_member -c using the configuration file created by clu_create (usually /cluster/admin/.member1.cfg). The clu_add_member command prompts you for any information it needs that is not in the configuration file created by clu_create.

Note

Configuration files are generated by programs and read by programs. In general, do not manually edit configuration files.

Configuration files make it easy to re-create an existing cluster. However, the information in the configuration files must be accurate; for example, host names, IP addresses, and disk special file names. Because disk devices are named in order of discovery, using configuration files to re-create a cluster requires that you run clu_create -c member_conf_file on the same system it was run on previously and that the storage configuration has not changed. In addition, you must add the members in the same order used when creating the original cluster. For each member, run clu_add_member -c member_conf_file on the same member it was run on previously. (The date: comment line in a configuration file contains the date that clu_create or clu_add_member was run and the name of the host on which it was run.)
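One way to guard against running a command on the wrong system is to compare the host recorded in the file's latest date: comment line with the current host before you start. The following is a minimal sketch, assuming that each date: comment is a single unwrapped line of the form "# date: ... hostname host". The function names cfg_recorded_host and cfg_host_matches are hypothetical, and the temporary file stands in for a saved configuration file.

```shell
#!/bin/sh
# cfg_recorded_host FILE
# Print the host name from the last "# date:" comment line in FILE.
cfg_recorded_host() {
    awk '
        /^# date:/ {
            for (i = 1; i < NF; i++)
                if ($i == "hostname") host = $(i + 1)
        }
        END { if (host != "") print host; else exit 1 }
    ' "$1"
}

# cfg_host_matches FILE HOST
# Succeed only if HOST is the host recorded in FILE.
cfg_host_matches() {
    [ "$(cfg_recorded_host "$1")" = "$2" ]
}

# Demonstration with a sample date: line.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# clu_create saved configuration values:
# date: Tue May 15 15:47:14 EDT 2001 hostname pepicelli.zk3.dec.com
clu_memid=1
EOF

if cfg_host_matches "$sample" pepicelli.zk3.dec.com; then
    echo "host matches the configuration file"
else
    echo "configuration file was written on $(cfg_recorded_host "$sample")"
fi
rm -f "$sample"
```

In practice you would pass the output of the hostname command as the second argument. A matching host does not prove that the storage configuration is unchanged, so the check complements the requirements above rather than replacing them.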

If your existing cluster meets the following requirements, you can automate the re-creation of the cluster by saving the current clu_create and clu_add_member configuration files:

Note

The remainder of this section describes how to re-create the entire cluster. You can also use configuration files to re-add a member to a cluster.

The following procedure re-creates a TruCluster Server cluster using the configuration files from the current cluster.

  1. Perform a full backup of the current cluster.

  2. Determine which disk is the Tru64 UNIX boot disk. It is often a private disk on the system that became the first cluster member:

    1. Look in the /etc/fdmns directory to find the special file name for the Tru64 UNIX boot disk (usually in the root_domain directory). For example:

      ls /etc/fdmns/root_domain
      /dev/disk/dsk0a
       
      

    2. Use SysMan Station to determine which system has the disk.

    3. Flash the light on the disk. For example:

      hwmgr -flash light -dsf /dev/disk/dsk0
       
      

      Section 3.7.2 provides information you can use to map a disk's dsk device special file name to its physical location. You need to know the disk's physical location because you will boot this disk from the system's console.

  3. On a cluster member that has access to the Tru64 UNIX operating system disk, mount the disk and save the current configuration files and license PAKs to that disk. For example:

    mount root_domain#root /mnt
    mkdir /mnt/config_files /mnt/licenses
    cp /cluster/admin/.member*.cfg /mnt/config_files
    for i in `lmf list | grep -v Product | awk '{print $1}'`
      do
      lmf issue /mnt/licenses/$i.license $i
      done
     
    

    Note

    You might also want to save other information such as:

    • Site-specific CAA profiles and action scripts

    • Each member's /etc/rc.config file

    • Each member's cluster alias configuration file (/etc/clu_alias.config)

    • Modifications to system configuration files

    • /etc/fstab

    • A recursive listing (ls -R) of /etc/fdmns/*

    In short, anything that has changed since you created the cluster and that you do not want to re-create. If you are not sure what to save, use the sys_check -all command to gather system configuration information. (You still have to manually save CAA profiles and scripts, cluster alias configuration files, and member-specific files.)

  4. Halt the cluster:

    shutdown -c now
     
    

  5. From the system console, boot the Tru64 UNIX operating system to multi-user mode:

    >>> boot UNIX_disk
     
    

    Note

    This procedure assumes that you have not performed a rolling upgrade of the cluster since you installed the base operating system and created the original cluster. If you have performed a rolling upgrade, the version of the base operating system on this disk is not as current as the cluster you shut down. (You can use the sizer -v command to display the version of the operating system.)

    If the operating system is not at the latest version, do the following:

    1. Take the system to single-user mode.

    2. Delete the TruCluster Server subsets.

    3. Perform an update installation to the latest version of Tru64 UNIX.

    4. Load the latest version of the TruCluster Server subsets.

    5. (Optional) If a patch kit is available for the new version of the base operating system and cluster software, you can patch the Tru64 UNIX system now, before running clu_create. This means you will not have to roll the patch kit into the cluster later.

  6. Register any required saved licenses. (The Tru64 UNIX and the TruCluster Server licenses were already active.) The following example assumes that you have run lmf list and have removed any unneeded *.license files from /licenses:

    for i in /licenses/*.license
      do
      lmf register - < $i
      done
    lmf reset
     
    

  7. Determine which saved configuration file to use with clu_create. Then run clu_create -c, specifying the name of the configuration file. For example:

    cd /config_files
    grep clu_create .*.cfg
    .member1.cfg:# clu_create saved configuration values:
    /usr/sbin/clu_create -c /config_files/.member1.cfg
     
    

  8. After booting the first cluster member, use the saved member configuration files to add the remaining members to the cluster.

    Caution

    Add members in the same order and from the same host that they were added in the original cluster. Otherwise, the device names might not be the same as in the original cluster.

    Examine each configuration file to determine on which member the original clu_add_member command was run. Use the latest # date comment in each file to determine the time and the host on which the command was run. The following short script displays the name of a configuration file, the name of the host on which clu_add_member was run, and the name of the member that was added:

      #! /bin/ksh
      cd /config_files
      for i in `grep -l unix_host .member*.cfg`
        do
          print '\n' $i
          tail -21 $i | grep -E '^# date|^unix_host'
        done
     
    

    Running the script on the sample three-member cluster displays the following output:

    .member2.cfg
    # date: Tue May 15 17:46:48 EDT 2001 hostname pepicelli.zk3.dec.com
    unix_host=polishham.zk3.dec.com
     
    .member3.cfg
    # date: Tue May 15 18:09:32 EDT 2001 hostname polishham.zk3.dec.com
    unix_host=provolone.zk3.dec.com
     
    

    Using this information, you run clu_add_member with the .member2.cfg file on the first member of the cluster, pepicelli, to add member 2, polishham. After booting polishham, run clu_add_member with the .member3.cfg file on polishham to add member 3, provolone.

    For example, to add the second member of the cluster, run the following command on pepicelli:

    /usr/sbin/clu_add_member -c /config_files/.member2.cfg
     
    

    Remember to boot each new member before adding the next one.
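The information gathered in step 8 can also be printed as one line per member, pairing each configuration file with the host on which clu_add_member should be run. The following is a minimal sketch, assuming single-line date: comments and that only the latest unix_host value is uncommented in each file. The function name add_plan is hypothetical, and the temporary directory stands in for /config_files. The sketch only prints commands; it does not run them, and it does not sort the lines into the original order, which you still determine from the dates.

```shell
#!/bin/sh
# add_plan DIR
# For each .member*.cfg file in DIR that records a unix_host value, print
# the host on which clu_add_member was last run, the command to run there,
# and the member that the command adds.
add_plan() {
    for f in "$1"/.member*.cfg; do
        [ -f "$f" ] || continue
        awk -v file="$f" '
            /^# date:/ {
                for (i = 1; i < NF; i++)
                    if ($i == "hostname") host = $(i + 1)
            }
            /^unix_host=/ { member = substr($0, 11) }
            END {
                if (member != "")
                    printf "on %s: clu_add_member -c %s (adds %s)\n",
                        host, file, member
            }
        ' "$f"
    done
}

# Demonstration with the two sample entries shown in step 8.
dir=$(mktemp -d)
cat > "$dir/.member2.cfg" <<'EOF'
# date: Tue May 15 17:46:48 EDT 2001 hostname pepicelli.zk3.dec.com
unix_host=polishham.zk3.dec.com
EOF
cat > "$dir/.member3.cfg" <<'EOF'
# date: Tue May 15 18:09:32 EDT 2001 hostname polishham.zk3.dec.com
unix_host=provolone.zk3.dec.com
EOF

add_plan "$dir"
rm -rf "$dir"
```

Files written by clu_create contain no unix_host line, so the first member's file is skipped.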