3    Install and Configure the Tru64 UNIX Operating System

When creating a cluster, you do not need to install the Tru64 UNIX operating system on all the systems that will become cluster members. You install the Tru64 UNIX operating system only on the system that will become the first cluster member. This system will, in essence, be cloned to create the first cluster member.

You need one Tru64 UNIX system in order to run the clu_create command to configure that system as the first cluster member. After you boot this first cluster member, you run clu_add_member to create boot disks for additional members. Those members do not need individual copies of the base operating system (although they do need individual Tru64 UNIX licenses).

If your cluster will contain different types of systems, load the optional Tru64 UNIX subsets needed to support the different hardware configurations. For example, because keyboards and graphics cards require specific subsets in order to work properly, load all keyboard and font subsets. Although you can install subsets later, we strongly recommend that, unless site policy prohibits it, you load all subsets when installing the Tru64 UNIX operating system, provided the Tru64 UNIX system has enough disk space.
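
If you are not sure which subsets are already loaded, you can review the software inventory with the setld command. The following is a minimal check; the grep filter assumes that setld marks loaded subsets with the word "installed":

# List every subset known to the system, with its status.
setld -i | more

# Show only the subsets that are currently loaded.
setld -i | grep installed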

Fully configure the Tru64 UNIX system, and verify that configuration, before creating a cluster.

In addition to configuring the base operating system, load and configure the layered products and applications that you want available to the cluster.

If you plan to use Fibre Channel storagesets for the base operating system disks, read the Fibre Channel chapter in the Cluster Hardware Configuration manual.

Table 3-1 lists the installation tasks in order and references sources of necessary information.

Notes

If you are performing a rolling upgrade of a cluster, go to Chapter 7 and follow the directions in that chapter.

If you are upgrading to TruCluster Server Version 5.1B from TruCluster Production Server Software or Available Server Software Version 1.5 or Version 1.6, go to Chapter 8 and follow the directions in that chapter.

If you are upgrading to TruCluster Server Version 5.1B from TruCluster Memory Channel Software, go to Section 8.9 and follow the directions in that section.

Table 3-1:  Installing Tru64 UNIX

Task See
Make sure that all storage is properly installed and configured (for example, if you are using one or more HS controllers, make sure that RAID sets and units are configured). Cluster Hardware Configuration manual
Examine console variables. Section 2.7
Update SRM firmware. Section 3.1
Install the Tru64 UNIX operating system. Section 3.2 and the Tru64 UNIX Installation Guide
Configure basic services. Section 3.3 and the Tru64 UNIX Network Administration: Services manual
Configure Secure Shell software. Section 3.4 and the Tru64 UNIX Security Administration manual
Configure enhanced security (optional). Section 3.5 and the Tru64 UNIX Security Administration manual
Configure NetRAIN for a redundant network interface (optional). Section 3.6 and the Cluster Administration manual
Configure the disks needed for cluster installation. Section 2.5, Section 3.7, and the Tru64 UNIX System Administration manual
For clusters that will use the LAN interconnect, get the device names of the network adapters. Section 3.8

3.1    Update SRM Firmware

New versions of Tru64 UNIX and TruCluster Server usually require new versions of the AlphaServer SRM firmware. Firmware updates are located on the Alpha Systems Firmware CD-ROM, which is included in the base operating system Software Distribution Kit. To determine whether you need to update firmware, see the TruCluster Server QuickSpecs and the firmware release notes for each type of system in the cluster. Update firmware as needed before installing software. You can find the latest version of the TruCluster Server QuickSpecs at the following URL:

http://www.tru64unix.compaq.com/docs/pub_page/spds.html
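
Before updating, you can display the firmware version that a system is currently running from its SRM console prompt (>>>) and compare it with the version listed in the firmware release notes for that platform:

>>> show version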
 

3.2    Install the Tru64 UNIX Operating System

Note

The cluster installation copies the Tru64 UNIX root (/), /usr, and /var file systems to create the clusterwide root (/), /usr, and /var file systems. Therefore, we recommend that you fully configure the Tru64 UNIX system before creating a cluster.

Before performing the installation procedures described in the Tru64 UNIX Installation Guide, read the following list and incorporate these tasks into the installation:

  1. Make sure that all storage devices are turned on.

  2. Make sure that any hubs or switches used by the cluster interconnect are turned on.

  3. Install the Tru64 UNIX operating system on one or more disks. The disks are either private disks on the system that will become the first cluster member, or disks on a shared bus that the system can access.

    Note

    We recommend that you load all subsets when installing the Tru64 UNIX operating system.

  4. Use Advanced File System (AdvFS) file systems.

  5. If you are installing Worldwide Language Support (WLS) in a separate file system, we recommend that the disk containing this file system be on a bus shared by all cluster members. See the Tru64 UNIX Installation Guide — Advanced Topics manual for information on installing WLS.

  6. If you plan to use the Logical Storage Manager (LSM) to mirror file systems on the cluster, configure LSM on the Tru64 UNIX system. Put the root disk group (rootdg) on a disk on a shared bus. (Also see Section 2.5.2 in this manual and read the chapter that describes how to configure LSM for use in a cluster in the TruCluster Server Cluster Administration manual.)

  7. If you are installing a patch kit as part of the base operating system installation, load the TruCluster Server subsets before installing the patch kit. If the TruCluster Server kit is not loaded before the patch operation, patches for TruCluster Server software will not be loaded. The sequence of events when patching the initial installation of Tru64 UNIX is as follows:

    1. Install and configure the Tru64 UNIX operating system.

    2. Use the setld command to install the TruCluster Server kit.

    3. Patch the system.

    4. Use the clu_create command to create the single-member cluster.

  8. If you add new hardware (for example, additional network adapters) after you install or update the Tru64 UNIX operating system, remember to boot /genvmunix and build a customized kernel. Otherwise, the system's kernel configuration file will not contain these hardware options, and the kernel you build during TruCluster Server installation will not recognize the new hardware. The Tru64 UNIX System Administration manual provides information on configuring kernels. (A brief sketch of this procedure follows this list.)

  9. This step applies only to systems connected to Asynchronous Transfer Mode (ATM) networks. To configure support for ATM LAN Emulation (LANE), select the necessary options from the list displayed by doconfig. In the following partial list of doconfig options, the options required for LANE support are marked with an asterisk (*):

      IP Switching over ATM (ATMIFMP)
    * LAN Emulation over ATM (LANE)
      Classical IP over ATM (ATMIP)
    * ATM UNI 3.0/3.1 Signalling for SVCs
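
The following sketch outlines the kernel rebuild mentioned in step 8. The boot device name (dkb100) and the kernel configuration directory (HOSTNAME) are placeholders; use your system's boot device and the kernel location that doconfig reports:

>>> boot -file genvmunix dkb100

After the system boots on the generic kernel, build and install a customized kernel, then reboot:

doconfig                          # build a new kernel configuration that includes the newly detected hardware
cp /sys/HOSTNAME/vmunix /vmunix   # install the kernel from the location that doconfig reports
shutdown -r now                   # reboot on the customized kernel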
     
    

3.3    Configure Basic Services

Using the information in the Tru64 UNIX Network Administration: Connections manual, run the netconfig utility or the SysMan Menu and configure the system's standard network interfaces. Use networking utilities such as ifconfig, ping, ftp, and telnet to verify that the network is set up correctly.
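
For example, the following quick checks confirm that an interface is configured and that the system can reach another host; the interface name (tu0) and host name (gateway.example.com) are placeholders for your own values:

ifconfig -a                  # list every interface with its address, netmask, and flags
ifconfig tu0                 # confirm that a specific interface is UP with the expected address
ping gateway.example.com     # verify basic IP connectivity (interrupt with Ctrl/C when done)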

Note

Do not configure the interfaces for the cluster interconnect at this time; you will configure those interfaces when creating a cluster.

Configure the following basic services:

Routing daemon (Section 3.3.1)
Time server (Section 3.3.2)
Name server (Section 3.3.3)
NFS (Section 3.3.4)
NIS (Section 3.3.5)
DHCP (Section 3.3.6)
Mail server (Section 3.3.7)
Print server (Section 3.3.8)

Read the following sections and incorporate that information when configuring these services.

Notes

If you choose not to configure one or more of these services before you create a cluster, see the Cluster Administration manual for information on how to configure these services in a running cluster. However, configuring the services before creating a cluster is easier.

Section 3.3.9 summarizes the preferred network configuration for the Tru64 UNIX system before beginning the TruCluster Server installation.

3.3.1    Routing Daemon

The cluster alias software is designed to work with the gated routing daemon. The ogated routing daemon is not supported.

We recommend using the gated routing daemon. When gated is used, each cluster member's alias daemon, aliasd, creates a /etc/gated.conf.membern file for that member, where n is the member ID.

For a discussion of the role of routing daemons in a cluster and issues involved in using a routing mechanism other than gated, see the Cluster Administration manual and the Cluster Technical Overview.

3.3.2    Time Server

Running a distributed time service provides clusterwide consistency for time stamps used by the file systems and applications. We recommend that you configure a distributed time service such as the Network Time Protocol (NTP) daemon (xntpd). NTP provides highly accurate synchronization and tracks the reliability of time sources. For information on NTP, see the Tru64 UNIX Network Administration: Services manual and ntp_intro(7).
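
After xntpd is configured and running, you can confirm that the system is synchronizing. The following is a minimal check; the peers listed in the output are site specific:

ntpq -p     # list NTP peers; an asterisk marks the peer currently selected for synchronization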

If system times are not synchronized, any checks that rely on accurate time stamps will fail. If your site does not use NTP, make sure that whatever time service you use meets the granularity specifications defined in RFC 1305, Network Time Protocol (Version 3) Specification, Implementation and Analysis.

Because the system times of cluster members must not vary by more than a few seconds, we do not recommend using the timed daemon to set the time.

If no time servers are on your network and you plan to use the cluster as a time server, see the Tru64 UNIX Network Administration: Services manual and ntp_manual_setup(7). Configure the Tru64 UNIX system as a time server before creating the cluster.

If you want the cluster to act as a reliable time source, make the time service a highly available service after creating the cluster. See the Cluster Highly Available Applications manual for information on setting up highly available services.

During cluster creation, if the Tru64 UNIX system is using NTP, the first cluster member inherits the NTP setup of the Tru64 UNIX system. When you add members, each member becomes a peer of the other members.

3.3.3    Name Server

A cluster can act as a name server, a client, or both. When the base operating system is configured as a BIND server, the BIND daemon, named, is automatically configured as a single-instance, highly available service in the cluster.

When configuring a name server (for example, BIND) on the Tru64 UNIX system, make sure that the hosts entry in /etc/svc.conf has the local service listed before the bind or yp services. For example:

hosts=local,bind,yp
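
After editing the file, you can confirm the entry with a quick check; the command should report the line shown above:

grep "^hosts=" /etc/svc.conf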

3.3.4    NFS

Because a TruCluster Server cluster can provide highly reliable network file system (NFS) services to clients, we recommend that you configure the base operating system as an NFS server before creating a cluster. You can also configure the system as an NFS client.

When the base operating system is configured as an NFS server, the NFS lockd and statd daemons are configured as a single-instance highly available service in the cluster. The cluster is thus a highly available NFS server.

3.3.5    NIS

If the cluster will be a Network Information Service (NIS) master, slave, or client, configure NIS before creating a cluster. See the Tru64 UNIX Network Administration: Services manual for information on configuring NIS.

All cluster members must be in the same NIS domain. If the Tru64 UNIX operating system is configured as a master server, all cluster members will be configured as NIS masters. If the Tru64 UNIX operating system is configured as a slave server or a client, all cluster members will be configured as slaves or clients.

3.3.6    DHCP

If the base operating system is configured as a Dynamic Host Configuration Protocol (DHCP) server, the DHCP daemon, joind, is configured as a single-instance, highly available service in the cluster.

Caution

Do not configure the system as a DHCP client. A member of a cluster cannot be a DHCP client, because the IP names and addresses associated with each member's cluster interconnect must be absolutely stable in order to form a cluster.

3.3.7    Mail Server

If the cluster will be a mail server, configure mail before creating a cluster. See the Tru64 UNIX System Administration manual and the TruCluster Server Cluster Administration manual for information on configuring mail.

3.3.8    Print Server

If the cluster will be a print server, configure printing before creating a cluster. See the Tru64 UNIX System Administration manual and the TruCluster Server Cluster Administration manual for information on configuring printers.

3.3.9    Network Services Summary

Table 3-2 summarizes the preferred network configuration for the Tru64 UNIX system before beginning the TruCluster Server installation. The table also lists what configuration information is passed to the first cluster member by clu_create, and to additional cluster members by clu_add_member.

Table 3-2:  Network Services Summary

Service/Daemon Recommendation Comment
Routing Daemon gated Do not use ogated or routed.
Time Server NTP The first cluster member inherits the configuration of the base operating system. Additional members are automatically configured as peers.
Name Server Optional (BIND or NIS or both) All cluster members must be in the same domain (DNS and NIS). In svc.conf, set hosts=local,bind,yp. If the base operating system is configured as a name server, that name server will be configured as a highly available service in the cluster. If the cluster is a BIND server, the cluster alias name will be the name of the BIND server in the clusterwide /etc/resolv.conf file.
NFS Optional (set up as a server on the base operating system; can also configure as a client) All cluster members will run client versions of the lockd and statd daemons. One cluster member at a time will run the server versions of the lockd and statd daemons, which will be configured as a highly available service.
NIS Optional All cluster members must be in the same domain. If the base operating system is configured as a master server, NIS will be configured as a highly available service in the cluster. If the base operating system is configured as a slave server or a client, all cluster members will be configured as slaves or clients.
DHCP Optional (server only) If configured on the base operating system, DHCP will be configured as a highly available service in the cluster.
Mail Optional Configure before creating cluster.
Print Server Optional Configure before creating cluster.

3.4    Configure the Secure Shell Software

The Secure Shell software provides mechanisms to authenticate hosts and users as well as to encrypt data sent across the connection. You can configure the Secure Shell software so that commands such as rsh automatically use a Secure Shell connection. Because the cluster software uses rsh to send commands among cluster members, we recommend you enable the Secure Shell software.

You can configure the Secure Shell software after cluster creation, but it is easier to do so before you create the cluster.

See the Tru64 UNIX Security Administration manual for information about the Secure Shell software.

3.5    Configure Enhanced Security (Optional)

To configure enhanced security in a cluster, do the following:

See the Tru64 UNIX Security Administration manual for information on how to select and configure security options.

3.6    Configure NetRAIN for a Redundant Network Interface (Optional)

If the system has two network interfaces, you can use NetRAIN to set up a fully redundant configuration that uses both interfaces. See the Cluster Administration manual for more information.

3.7    Configure the Disks Needed for Cluster Installation

After you install and boot the Tru64 UNIX operating system, determine the sizes and locations of the disks you will use when installing the cluster. The location of member boot disks is especially important because you must boot each member from its boot disk. (A member booted from the wrong boot disk will most likely panic during the boot. Depending on the type of panic, this might affect other cluster members.)

3.7.1    Partition Sizes

Verify that the partitions on the disks you plan to use for the cluster file systems and member boot disks meet the minimum size requirements in Table 2-2. Also, use the information in Section 2.5.5 to provide space for future rolling upgrades.

The disk or disks that you plan to use for the clusterwide root (/), /usr, and /var file systems must have disk labels before you run the clu_create command, because clu_create prompts for partition information and the partitions must already exist.

Use the disklabel or diskconfig commands to label and configure disks. You can also use the SysMan Station to view and modify storage configurations.
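
For example, the following sketch writes a label to a disk and then adjusts its partitions. The disk name (dsk6) and disk type (RZ1DF-CB) are taken from the example that follows; substitute your own values:

disklabel -rw dsk6 RZ1DF-CB      # write a default label for the disk
disklabel -e dsk6                # edit the partition table to set the partition sizes you need
disklabel -r dsk6                # read the label back to verify the layout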

Note

The disks used for member boot disks and the quorum disk do not have to be labeled before use.

The following example shows the output from the disklabel command for a disk whose b, g, and h partitions will hold the clusterwide root (/), /usr, and /var file systems in a two-node cluster (cylinder information is not shown):

disklabel -r dsk6
# /dev/rdisk/dsk6c:
type: SCSI
disk: RZ1DF-CB
label: 
flags: dynamic_geometry
bytes/sector: 512
sectors/track: 168
tracks/cylinder: 20
sectors/cylinder: 3360
cylinders: 5273
sectors/unit: 17773524
rpm: 7200
interleave: 1
trackskew: 28
cylinderskew: 72
headswitch: 0           # milliseconds
track-to-track seek: 0  # milliseconds
drivedata: 0
 
8 partitions:
 
#            size       offset    fstype  fsize  bsize
  a:       786432            0    unused      0      0
  b:      1552131       786432    unused      0      0
  c:     17773524            0    unused      0      0
  d:      5400220      1572864    unused      0      0
  e:      5400220      6973084    unused      0      0
  f:      5400220     12373304    unused      0      0
  g:      7595651      2338563    unused      0      0
  h:      7839310      9934214    unused      0      0
 

3.7.2    Determining Disk Locations

You can use the SysMan Station and the hwmgr command to help map device special file names to physical locations. Because you do not want to boot the wrong disk for a cluster member, it is important to map each device special file to the device name used by the console that will boot that disk.

Note

If you plan to use Fibre Channel storagesets for any disks required for the cluster, read the Fibre Channel chapter in the Cluster Hardware Configuration manual.

The following general-purpose hwmgr command displays the device name, physical location, and worldwide ID (WWID) for each disk known by the Tru64 UNIX system:

hwmgr -get attr -a dev_base_name \
  -a phys_location -a name -category disk|more
 

Locating the first member's cluster boot disk is not a problem because you will run the clu_create command on the system that you are configuring as the first cluster member. The clu_create command sets the console bootdef_dev variable to the correct boot disk. However, you run the clu_add_member command on one system to configure a boot disk for another system. The other system does not have an operating system; the only information you can get about its storage configuration is from its console. You need to map the dsk name that you will specify to clu_add_member to the DK name that you will use to boot this disk from the new member's console.

When adding a member, the clu_add_member command displays information about the new member's boot disk. This information includes, if known, the disk's manufacturer, model number, physical location (bus/target/logical unit number (LUN)), and serial number (WWID). The clu_add_member command uses the following hwmgr command to gather information:

hwmgr -get attr -a dev_base_name=new_member_boot_disk \
-a serial_number -a manufacturer -a model -a phys_location
 

The clu_add_member command reformats the output of the hwmgr command and displays disk location information in the following format:

Manufacturer:
Model:
Target:
Lun:
Serial Number:
 

If you are using Fibre Channel storagesets, you can use the wwidmgr -show wwid command to locate the disk from the console of the new member. (The Cluster Hardware Configuration manual provides several examples that show how to use the wwidmgr command.)

If you cannot obtain the serial number of a member boot disk, mapping a device special file name to a console device name with a high degree of confidence depends on whether or not your storage configuration is symmetrical. If the configuration is symmetrical, use the procedure in Section 3.7.2.1; if it is not, use the procedure in Section 3.7.2.2.

3.7.2.1    Symmetrical Storage Configuration

The following procedure describes how to map a /dev/disk/dsk* special file name to the name of a disk displayed by a show device command at the new member's console. This procedure works only when the system on which you run the hwmgr command is the same type of system as the one that will later attempt to boot the disk: the same model, with the same type and number of storage adapters located in the same slots on both machines and connected to the same shared storage, so that both consoles have the same view of the bus that contains the disks you will later configure as cluster member boot disks. If you are not sure that the two systems meet these requirements, see the nonsymmetrical procedure in Section 3.7.2.2.

  1. On the Tru64 UNIX system, use the hwmgr command or the SysMan Station to map the dsk filename to the bus/target/LUN of the physical device. (The device special file names for shared storage will be the same for the Tru64 UNIX system and for the cluster.) For example, to find the bus/target/LUN for dsk13, enter:

    hwmgr -view devices | grep dsk13
    56: /dev/disk/dsk13c  DEC  RZ1CF-CF (C) DEC  bus-1-targ-1-lun-0

    The disk is at bus 1, target 1, and LUN 0.

  2. At the console of the system that will boot the disk, enter the show device command:

    >>> show device
         
    .
    .
    .
    dkb100.1.0.12.0 DKB100 RZ28M 1104
    .
    .
    .

    From the console's point of view, the class driver designator for a SCSI disk is DK. Bus numbering depends on firmware probe order: the first bus discovered is A, the second is B, and so on. Similarly, the unit number is based on the target: the first target is 0, the second is 100, the third is 200, and so on. The hwmgr output bus-1-targ-1-lun-0 therefore usually translates to DKB100.

    Note

    If you are using Fibre Channel storagesets, you can also use the console wwidmgr -show wwid command to display the list of user-defined IDs (UDIDs) and their associated worldwide IDs (WWIDs).

  3. Write the dsk name, the DK name, and the WWID in the Member Attributes table in Table A-3.

3.7.2.2    Nonsymmetrical Storage Configuration

If your storage configuration is not symmetrical, no command can unambiguously map one system's operating system view of storage (dsk) to another system's console view of storage (DK).

You can work in either direction: from the console view to the physical disk to the operating system view, or the reverse. The following procedure starts from the operating system and works back to the console of the other system:

  1. Use the hwmgr command to flash the activity light on the disk you want to use as a boot disk. For example, to flash the light on the disk accessed through special file /dev/disk/dsk13, enter:

    hwmgr -flash light -dsf dsk13
     
    

  2. Locate the disk by its flashing light and note its position in its storage cabinet.

  3. Trace the cable from the cabinet back to the system that will use this disk as its boot disk.

    You now know the location of the disk on its SCSI bus, and the location of the SCSI adapter in the system. If you know the adapter numbering scheme for the system, you can deduce the SCSI bus number for the adapter. (If you drew a storage map when you configured your cluster, that information will be useful. If you have a variety of disks, you can also use the disk's model number and WWID as other pieces of evidence.)

    The following example assumes that the disk you tracked is the second device on the second SCSI adapter for this system.

    At the console of the system that will boot the disk, enter the show device command:

    >>> show device
         
    .
    .
    .
    dkb100.1.0.12.0 DKB100 RZ28M 1104
    .
    .
    .

    From the console's point of view, the class driver designator for a SCSI disk is DK. Bus numbering depends on firmware probe order: the first bus discovered is A, the second is B, and so on. Similarly, the first target is 0, the second is 100, the third is 200, and so on. The second target on the second SCSI adapter would therefore be DKB100. If you know that the disk is an RZ28M, that information also helps to narrow down the choices (unless, of course, all the disks on the bus are RZ28Ms).

    Note

    If you are using Fibre Channel storagesets, you can use the console wwidmgr -show wwid command to display a list of user-defined IDs (UDIDs) and their associated worldwide IDs (WWIDs).

  4. Write the dsk name, the DK name, and the WWID in the Member Attributes table in Table A-3.

3.8    If Using LAN Interconnect, Obtain the Device Names of the Network Adapters

Obtain the names of eligible Ethernet network adapters on the member to be configured before issuing the clu_create or clu_add_member command. To be eligible, an adapter must be set up as follows:

The cluster installation commands accept the names of either physical Ethernet network adapters or NetRAIN virtual interfaces.

Caution

The cluster installation commands automatically configure the NetRAIN virtual interfaces for the LAN interconnect. Do not manually create the NetRAIN devices prior to running the clu_create script. See the Cluster Administration manual for a discussion of the consequences of doing so.

To learn the device names of eligible network adapters, run the ifconfig -a command on the system that will become the first member of the cluster. Use the hwmgr -get attr -cat network command to determine their speed and transmission mode.
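
For example, run the following on the system that will become the first member; adapter names such as tu0 or ee0 depend on your hardware:

ifconfig -a                      # list the device names of all network adapters the system recognizes
hwmgr -get attr -cat network     # display each adapter's attributes, including speed and transmission mode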

To learn the device names for systems that you intend to add to the cluster, you must first boot the system from the Tru64 UNIX Operating System Volume 1 CD-ROM. The UNIX device names of the Ethernet adapters scroll on the console during the boot process. If you enter a UNIX shell after the system boots, you can enter an ifconfig -a command to list the network adapter device names and the hwmgr -get attr -cat network command to list their properties.