7    Rolling Upgrade

A rolling upgrade is a software upgrade of a cluster that is performed while the cluster is in operation. One member at a time is upgraded and returned to operation while the cluster transparently maintains a mixed-version environment for the base operating system, cluster, and Worldwide Language Support (WLS) software. Clients accessing services are not aware that a rolling upgrade is in progress.

A rolling upgrade consists of an ordered series of steps, called stages. The commands that control a rolling upgrade enforce this order.

This first part of the chapter contains instructions for performing a rolling upgrade, for displaying the status of a rolling upgrade, and for undoing one or more stages of a rolling upgrade. Those interested in how a rolling upgrade works can find the details in Section 7.6 and the sections that follow it.

This chapter discusses the following topics:

Figure 7-1 provides a simplified flow chart of the tasks and stages that are part of a rolling upgrade initiated on a Version 5.1B cluster:

Figure 7-1:  Rolling Upgrade Flow Chart

7.1    Rolling Upgrade Supported Tasks

The tasks that you can perform during a rolling upgrade depend on which versions of the base operating system and cluster software are currently running on the cluster. The main focus of this chapter is to describe the behavior of a rolling upgrade that starts on a TruCluster Server Version 5.1B cluster. However, because you may read this chapter in preparation for a rolling upgrade from TruCluster Server Version 5.1A to Version 5.1B, we point out rolling upgrade differences between the two versions.

The following list describes the basic tasks you can perform within a rolling upgrade:

Rolling in a patch kit or an NHD kit uses the same procedure as rolling in a new release of the base operating system and cluster software. The difference is which commands you run during the install stage:

Throughout this chapter, the term rolling upgrade refers to the overall procedure used to roll one or more software kits into a cluster.

As shown in Figure 7-1, you can perform more than one task during a rolling upgrade.

If the cluster is running Version 5.1A or Version 5.1B, a rolling upgrade can include the task combinations listed in Table 7-1:

Table 7-1:  Rolling Upgrade Tasks Supported by Version 5.1A and Version 5.1B

An update installation from Version 5.1A to Version 5.1B

An update installation from Version 5.1B to the next release

A patch of Version 5.1A

A patch of Version 5.1B

The installation of a New Hardware Delivery (NHD) kit onto a Version 5.1A cluster

The installation of an NHD kit onto a Version 5.1B cluster

An update installation from Version 5.1A to Version 5.1B of the base operating system and cluster software, followed by a patch of Version 5.1B

An update installation from Version 5.1B to the next release of the base operating system and cluster software followed by a patch of the next release [Footnote 5]

An NHD installation onto a Version 5.1A cluster followed by a patch of Version 5.1A

An NHD installation onto a Version 5.1B cluster followed by a patch of Version 5.1B

An update installation from Version 5.1A to Version 5.1B followed by the installation of an NHD kit for Version 5.1B

An update installation from Version 5.1B to the next release of the base operating system and cluster software followed by the installation of an NHD kit for that next release [Footnote 6]

An update installation from Version 5.1A to Version 5.1B, followed by the installation of an NHD kit for Version 5.1B, followed by a patch of Version 5.1B

An update installation from Version 5.1B to the next release, followed by the installation of an NHD kit for the next release, followed by a patch of the next release [Footnote 6]

7.2    Unsupported Tasks

The following list describes tasks that you cannot perform or that we recommend you do not attempt during a rolling upgrade:

7.3    Rolling Upgrade Procedure

In the procedure in this section, unless otherwise stated, run commands in multiuser mode. Each step that corresponds to a stage refers to the section that describes that stage in detail. We recommend that you read the detailed description of stages in Section 7.7 before performing the rolling upgrade procedure.

Some stages of a rolling upgrade take longer to complete than others. Table 7-2 lists the approximate time it takes to complete each stage.

Table 7-2:  Time Estimates for Rolling Upgrade Stages

Stage Duration
Preparation Not under program control.
Setup 45 - 120 minutes. [Footnote 7]
Preinstall 15 - 30 minutes. [Footnote 7]
Install The same amount of time it takes to run installupdate, dupatch, nhd_install, or a supported combination of these commands on a single system.
Postinstall Less than 1 minute.
Roll (per member) Patch: less than 5 minutes. Update installation: about the same amount of time it takes to add a member. [Footnote 8]
Switch Less than 1 minute.
Clean 30 - 90 minutes. [Footnote 7]

You can use the following procedure to upgrade a TruCluster Server Version 5.1A cluster to Version 5.1B, and to upgrade a cluster that is already at Version 5.1B.

  1. Prepare the cluster for the rolling upgrade (Section 7.7.1):

    1. Choose one cluster member to be the lead member (the first member to roll). (The examples in this procedure use a member whose memberid is 2 as the lead member. The example member's hostname is provolone.)

    2. Back up the cluster.

    3. If you will perform an update installation during the install stage, remove any blocking layered products, listed in Table 7-6, that are installed on the cluster.

    4. To determine whether the cluster is ready for an upgrade, run the clu_upgrade -v check setup lead_memberid command on any cluster member. For example:

      clu_upgrade -v check setup 2
       
      

      If a file system needs more free space, use AdvFS utilities such as addvol to add volumes to domains as needed. For disk space requirements, see Section 7.7.1. For information on managing AdvFS domains, see the Tru64 UNIX AdvFS Administration manual.

    5. Verify that each system's firmware will support the new software. Update firmware as needed before starting the rolling upgrade.

  2. Perform the setup stage (Section 7.7.2).

    Notes

    If your current cluster is at Version 5.1A or later and if you plan to upgrade the base operating system and cluster software during the install stage, mount the device or directory that contains the new TruCluster Server kit before running clu_upgrade setup. The setup command will copy the kit to the /var/adm/update/TruClusterKit directory.

    If your current cluster is at Version 5.1A or later and if you plan to install an NHD kit during the install stage, mount the device or directory that contains the new NHD kit before running clu_upgrade setup. The setup command will copy the kit to the /var/adm/update/NHDKit directory.

    On any member, run the clu_upgrade setup lead_memberid command. For example:

    clu_upgrade setup 2
     
    

    Section 7.7.2 shows the menu displayed by the clu_upgrade command.

    When the setup stage is completed, clu_upgrade prompts you to reboot all cluster members except the lead member.

  3. One at a time, reboot all cluster members except the lead member. Do not start the preinstall stage until these members are either rebooted or halted.

  4. Perform the preinstall stage (Section 7.7.3).

    On the lead member, run the following command:

    clu_upgrade preinstall
     
    

    If your current cluster is at Version 5.1A or later, the preinstall command lets you choose whether to verify the existence of the tagged files created during the setup stage.

  5. Perform the install stage (Section 7.7.4).

    Note

    During the install stage you load the new software on the lead member, in effect rolling that member. When you perform the roll stage, this new software is propagated to the remaining members of the cluster.

    The clu_upgrade command does not load software during the install stage. The loading of software is controlled by the commands you run: installupdate, dupatch, or nhd_install.

    See Table 7-1 for the list of rolling upgrade tasks and combination of tasks supported for Version 5.1A and Version 5.1B.

    1. See the Tru64 UNIX Installation Guide for detailed information on using the installupdate command.

      See the Tru64 UNIX and TruCluster Server Patch Kit Installation Instructions that came with your patch kit for detailed information on using the dupatch command.

      See the Tru64 UNIX New Hardware Delivery Release Notes and Installation Instructions that came with your NHD kit for detailed information on using the nhd_install command.

    2. If the software you are installing requires that its installation command be run from single-user mode, halt the system and boot the system to single-user mode:

      shutdown -h now
      >>> boot -fl s
       
      

      Note

      Halting and booting the system ensures that it provides a minimal set of services to the cluster and that the running cluster has minimal reliance on the member running in single-user mode. In particular, halting the member satisfies the requirement of services that must see the member with a status of DOWN before they complete a service failover. If you do not first halt the cluster member, services will probably not fail over as expected.

      When the system reaches single-user mode, run the following commands:

      init s
      bcheckrc
      lmf reset
       
      

    3. Run the installupdate, dupatch, or nhd_install command.

      To roll in multiple patch kits, you can invoke dupatch multiple times in a single install stage. Be aware that doing so may make it difficult to isolate problems should any arise after the patch process is completed and the cluster is in use.

      You cannot run a dupatch command followed by an installupdate command. To patch the current software before you perform a rolling upgrade, you must perform two complete rolling upgrade operations: one to patch the current software, and one to perform the update installation.

  6. (Optional) After the lead member performs its final reboot with its new custom kernel, you can perform the following manual tests before you roll any additional members:

    1. Verify that the newly rolled lead member can serve the shared root (/) file system.

      1. Use the cfsmgr command to determine which cluster member is currently serving the root file system. For example:

        cfsmgr -v -a server /
         
         Domain or filesystem name = /
         Server Name = polishham
         Server Status : OK
         
        

      2. Relocate the root (/) file system to the lead member. For example:

        cfsmgr -h polishham -r -a SERVER=provolone /
         
        

    2. Verify that the lead member can serve applications to clients. Make sure that the lead member can serve all important applications that the cluster makes available to its clients.

      You decide how and what to test. We suggest that you thoroughly exercise critical applications and satisfy yourself that the lead member can serve these applications to clients before continuing the roll. For example:

      • Manually relocate CAA services to the lead member. For example, to relocate the application resource named cluster_lockd to lead member provolone:

        caa_relocate cluster_lockd -c provolone
         
        

      • Temporarily modify the default cluster alias selection priority attribute, selp, to force the lead member to serve all client requests directed to that alias. For example:

        cluamgr -a alias=DEFAULTALIAS,selp=100
         
        

        The lead member is now the end recipient for all connection requests and packets addressed to the default cluster alias.

        From another member or from an outside client, use services such as telnet and ftp to verify that the lead member can handle alias traffic. Test client access to all important services that the cluster provides.

        When you are satisfied, reset the alias attributes on the lead member to their original values.

  7. Perform the postinstall stage (Section 7.7.5).

    On the lead member, run:

    clu_upgrade postinstall
     
    

  8. Perform the roll stage (Section 7.7.6).

    Roll the members of the cluster that have not already rolled. [Footnote 9]

    You can roll multiple members simultaneously (parallel roll), subject to the restriction that the members not being rolled (plus the quorum disk, if one is configured) must contribute enough votes to maintain cluster quorum.

    To roll a member, do the following:

    1. Halt the member system and boot it to single-user mode. For example:

      shutdown -h now
      >>> boot -fl s
       
      

    2. When the system reaches single-user mode, run the following commands:

      init s
      bcheckrc
      lmf reset
       
      

    3. Roll the member:

      clu_upgrade roll
       
      

      If you are performing parallel rolls, use the -f option with the clu_upgrade roll command. This option causes the member to automatically reboot without first prompting for permission:

      clu_upgrade -f roll
       
      

      The roll command verifies that rolling the member will not result in a loss of quorum. If a loss of quorum will result, then the roll of the member does not occur and an error message is displayed. You can roll the member later, after one of the currently rolling members has rejoined the cluster and its quorum vote is available.

      If the roll proceeds, the member is prepared for a reboot. If you used the -f option, no prompt is displayed; the reboot occurs automatically. If you did not use the -f option, clu_upgrade displays a prompt that asks whether you want to reboot at this time. Unless you want to examine something specific before you reboot, enter yes. (If you enter yes, it may take approximately half a minute before the actual reboot occurs.)

      Perform parallel rolls to minimize the time needed to complete the roll stage. For example, on an eight-member cluster with a quorum disk, after rolling the lead member, you can roll four members in parallel.

      1. Begin the roll stage on a member. (The lead member was rolled during the install stage. You do not perform the roll stage on the lead member.)

      2. When you see a message similar to the following, begin the roll stage on the next member:

           *** Info ***
        You may now begin the roll of another cluster member.
        

        If you see a message that begins like the following, it is probably because too many of the members currently being rolled contribute member votes.

          *** Info ***
        The current quorum conditions indicate that beginning
        a roll of another member at this time may result in
        the loss of quorum.
        
        

        In this case, you have the following options:

        • You can wait until a member completes the roll stage before you begin to roll the next member.

        • If there is an unrolled member that does not contribute member votes, you can begin the roll stage on it.

    4. Continue to roll members until all members of the cluster have rolled. Before starting each roll stage, wait until you see the message that it is all right to do so.

      When you roll the last member, you will see a message similar to the following:

        *** Info ***
      This is the last member requiring a roll.
      

    Note

    The roll actually takes place during the reboot. The clu_upgrade roll command sets up the it(8) scripts that will be run during the reboot. When you reboot, the it scripts roll the member, build a customized kernel, and then reboot again so the member will be running on its new customized kernel. When the member boots its new customized kernel, it has completed its roll and is no longer running on tagged files.

  9. Perform the switch stage (Section 7.7.7).

    After all members have rolled, run the switch command on any member.

    clu_upgrade switch
     
    

  10. One at a time, reboot each member of the cluster.

  11. Perform the clean stage (Section 7.7.8).

    Run the following command on any member to remove the tagged (.Old..) files from the cluster and complete the upgrade.

    clu_upgrade clean
     
    

7.4    Displaying the Status of a Rolling Upgrade

The clu_upgrade command provides the following options for displaying the status of a rolling upgrade. You can run status commands at any time.
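For example, either of the following commands, run on any cluster member, displays rolling upgrade status; the -v option produces more detailed output, including which members are running on tagged files:

clu_upgrade status
clu_upgrade -v status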

Notes

During a roll, there might be two versions of the clu_upgrade command in the cluster — an older version used by members that have not yet rolled, and a newer version (if included in the update distribution or patch kit). The information that is displayed by the status command might differ depending on whether the command is run on a member that has rolled. Therefore, if you run the status command on two members, do not be surprised if the format of the displayed output is not the same.

If you run clu_upgrade status after running installupdate, clu_upgrade will display a message indicating that the install stage is complete. However, the install stage is not really complete until you run the clu_upgrade postinstall command.

7.5    Undoing a Stage

The clu_upgrade undo command provides the ability to undo a rolling upgrade that has not completed the switch stage. You can undo any stage except the switch stage and the clean stage. You must undo stages in order; for example, if you decide to undo a rolling upgrade after completing the preinstall stage, you undo the preinstall stage and then undo the setup stage.

Note

Before undoing any stage, we recommend that you read the relevant version of the Cluster Release Notes to determine whether there are restrictions related to the undoing of any stage.

To undo a stage, use the undo command with the stage that you want to undo. The clu_upgrade command determines whether the specified stage is a valid stage to undo. Table 7-3 outlines the requirements for undoing a stage:

Table 7-3:  Undoing a Stage

Stage to Undo Command Comments
Setup clu_upgrade undo setup

You must run this command on the lead member. In addition, no members can be running on tagged files when you undo the setup stage.

Before you undo the setup stage, use the clu_upgrade -v status command to determine which members are running on tagged files. Then use the clu_upgrade tagged disable memberid command to disable tagged files on those members. (See Section 7.8 for information about tagged files and the commands used to manipulate them.)

When no members are running on tagged files, run the clu_upgrade undo setup command on the lead member.

Preinstall clu_upgrade undo preinstall You must run this command on the lead member.
Install clu_upgrade undo install

You can run this command on any member except the lead member.

Halt the lead member. Then run the clu_upgrade undo install command on any member that has access to the halted lead member's boot disk. When the command completes, boot the lead member.

Postinstall clu_upgrade undo postinstall You must run this command on the lead member.
Roll clu_upgrade undo roll memberid

You can run this command on any member except the member whose roll stage will be undone.

Halt the member whose roll stage is being undone. Then run the clu_upgrade undo roll memberid command on any other member that has access to the halted member's boot disk. When the command completes, boot the halted member. The member will now be using tagged files.
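For example, to abandon a rolling upgrade after completing the preinstall stage, undo the stages in reverse order. The following sketch assumes that member 3 is still running on tagged files (the member ID is an example):

clu_upgrade undo preinstall      (on the lead member)
clu_upgrade -v status            (on any member, to find members running on tagged files)
clu_upgrade tagged disable 3     (repeat for each member running on tagged files)
clu_upgrade undo setup           (on the lead member, after no members are running on tagged files)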

7.6    Rolling Upgrade Commands

The clu_upgrade command, described in clu_upgrade(8), controls the overall flow of a rolling upgrade and ensures that the stages are run in order. During the install stage, you run one or more of installupdate, dupatch, or nhd_install to load and install software. These commands are rolling upgrade aware; they are modified to understand which actions they are allowed to take during the install and roll stages of a rolling upgrade.

When you start a rolling upgrade, the cluster is running the software from the previous release. For the first part of any rolling upgrade, you are running the clu_upgrade command that is already installed on the cluster. If a new version is installed during the rolling upgrade, there may be minor differences in the on-screen display and behavior between the two versions of the command.

The following two tables show the stages at which new versions of the upgrade commands, if shipped with the kits being installed, become available during a rolling upgrade: [Footnote 10]

Table 7-4:  Stages and clu_upgrade Versions When Performing a Rolling Upgrade from Version 5.1A

Stage Version 5.1A Next Release [Footnote 11] Comments
Preparation X   The currently installed (old) version of clu_upgrade is always run in this stage.
Setup X  

The currently installed (old) version of clu_upgrade is always run in this stage.

If performing an update installation, the new version of clu_upgrade is extracted from the TruCluster Server kit and installed at /usr/sbin/clu_upgrade, replacing the old version. Because this replacement is done before tagged files are created, all members will use the new clu_upgrade throughout the remainder of the rolling upgrade.

Preinstall   X If the rolling upgrade includes an update installation, all members use the new version of clu_upgrade installed during the setup stage. (Otherwise, members continue to run the current version of clu_upgrade.)
Install   X

If the rolling upgrade includes an update installation, all members use the version of clu_upgrade installed during the setup stage.

During the update installation, a new version of installupdate replaces the old one.

A patch kit always installs the latest version of dupatch.

If performing a patch, and if the patch kit includes a new version of clu_upgrade, the new version is installed and will be used by all cluster members starting with the postinstall stage.

Postinstall   X If a new version of clu_upgrade was installed in either the setup stage or the install stage, all members use the new version.
Roll   X If a new version of clu_upgrade was installed in either the setup stage or the install stage, all members use the new version.
Switch   X If a new version of clu_upgrade was installed in either the setup stage or the install stage, all members use the new version.
Clean   X If a new version of clu_upgrade was installed in either the setup stage or the install stage, all members use the new version.

Table 7-5:  Stages and clu_upgrade Versions When Performing a Rolling Upgrade from Version 5.1B

Stage Version 5.1B Next Release [Footnote 12] Comments
Preparation X   The currently installed (old) version of clu_upgrade is always run in this stage.
Setup X  

The currently installed (old) version of clu_upgrade is always run in this stage.

If performing an update installation, the new version of clu_upgrade is extracted from the TruCluster Server kit and installed at /usr/sbin/clu_upgrade, replacing the old version. Because this replacement is done before tagged files are created, all members will use the new clu_upgrade throughout the remainder of the rolling upgrade.

Preinstall   X If the rolling upgrade includes an update installation, all members use the new version of clu_upgrade installed during the setup stage. (Otherwise, members continue to run the current version of clu_upgrade.)
Install   X

If the rolling upgrade includes an update installation, all members use the version of clu_upgrade installed during the setup stage.

During the update installation, a new version of installupdate replaces the old one.

A patch kit always installs the latest version of dupatch.

If performing a patch, and if the patch kit includes a new version of clu_upgrade, the new version is installed and will be used by all cluster members starting with the postinstall stage.

Postinstall   X If a new version of clu_upgrade was installed in either the setup stage or the install stage, all members use the new version.
Roll   X If a new version of clu_upgrade was installed in either the setup stage or the install stage, all members use the new version.
Switch   X If a new version of clu_upgrade was installed in either the setup stage or the install stage, all members use the new version.
Clean   X If a new version of clu_upgrade was installed in either the setup stage or the install stage, all members use the new version.

7.7    Rolling Upgrade Stages

The following sections describe each of the rolling upgrade stages.

Note

These sections only describe the stages. Use the procedure in Section 7.3 to perform a rolling upgrade.

7.7.1    Preparation Stage

Command Where Run Run Level
clu_upgrade -v check setup lead_memberid any member multiuser mode

During the preparation stage, you back up all important cluster data and verify that the cluster is ready for a roll. Before beginning a rolling upgrade, do the following:

  1. Choose one member of the cluster as the first member to roll. This member, known as the lead member, must have direct access to the root (/), /usr, /var, and, if used, i18n file systems.

    Make sure that the lead member can run any critical applications. You can test these applications after you update this member during the install stage, but before you roll any other members. If a problem occurs, you can try to resolve it on this member before you continue. If you cannot resolve a problem, you can undo the rolling upgrade and return the cluster to its pre-roll state. (Section 7.5 describes how to undo rolling upgrade stages.)

  2. Back up the clusterwide root (/), /usr, and /var file systems, including all member-specific files in these file systems. If the cluster has a separate i18n file system, back up that file system. In addition, back up any other file systems that contain critical user or application data. (A sample backup command appears after this list.)

    Note

    If you perform an incremental or full backup of the cluster during a rolling upgrade, make sure to perform the backup on a member that is not running on tagged files. If you back up from a member that is using tagged files, you will only back up the contents of the .Old.. files. Because the lead member never uses tagged files, you can back up the cluster from the lead member (or any other member that has rolled) during a rolling upgrade.

    Most sites have automated backup procedures. If you know that an automatic backup will take place while the cluster is in the middle of a rolling upgrade, make sure that backups are done on the lead member or on a member that has rolled.

  3. If you plan to run the installupdate command in the install stage, remove any blocking layered products listed in Table 7-6 that are installed on the cluster.

  4. Run the clu_upgrade -v check setup lead_memberid command, which verifies the following information:

  5. Verify that each system's firmware will support the new software. Update firmware as needed before starting the rolling upgrade.
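As noted in step 2, back up each clusterwide file system before starting the roll. A minimal example using vdump (the tape device name is illustrative; repeat the command for /usr, /var, a separate i18n file system if one exists, and any other critical file systems):

vdump -0 -f /dev/tape/tape0_d0 /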

A cluster can continue to operate during a rolling upgrade because two copies exist of the operating system and cluster software files. (Only one copy exists of shared configuration files so that changes made by any member are visible to all members.) This approach makes it possible to run two different versions of the base operating system and the cluster software at the same time in the same cluster. The trade-off is that, before you start an upgrade, you must make sure that there is adequate free space in each of the clusterwide root (/), /usr, and /var file systems, and, if a separate domain exists for the Worldwide Language Support (WLS) subsets, in the i18n file system.

A rolling upgrade has the following disk space requirements:

If a file system needs more free space, use AdvFS utilities such as addvol to add volumes to domains as needed. For information on managing AdvFS domains, see the Tru64 UNIX AdvFS Administration manual. (The AdvFS Utilities require a separate license.) You can also expand the clusterwide root (/) domain.
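For example, to check the free space in the clusterwide /usr domain and then add a volume to it (the disk name is an example, and the domain name cluster_usr is typical but may differ on your cluster):

showfdmn cluster_usr
addvol /dev/disk/dsk5c cluster_usr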

Note

The clu_upgrade command verifies whether sufficient space exists at the start of a rolling upgrade. However, nothing prevents a cluster member from consuming disk space during a rolling upgrade, thus creating a situation where a later stage might not have enough disk space.

Disk space is dynamic. If you know that a member will be consuming disk space during a rolling upgrade, add additional space before you start the upgrade.

7.7.2    Setup Stage

Command Where Run Run Level
clu_upgrade setup lead_memberid any member multiuser mode

The setup stage runs the clu_upgrade check setup command, creates tagged files, and prepares the cluster for the roll.

The clu_upgrade setup lead_memberid command performs the following tasks:

7.7.3    Preinstall Stage

Command Where Run Run Level
clu_upgrade preinstall lead member multiuser mode

The purpose of the preinstall stage is to verify that the cluster is ready for the lead member to run one or more of the installupdate, dupatch, or nhd_install commands.

The clu_upgrade preinstall command performs the following tasks:

7.7.4    Install Stage

Command Where Run Run Level
installupdate lead member single-user mode
dupatch lead member single-user or multiuser mode
nhd_install lead member single-user mode

If your current cluster is running TruCluster Server Version 5.1B or Version 5.1A, you can perform one of the tasks or combinations of tasks listed in Table 7-1.

The install stage starts when the clu_upgrade preinstall command completes, and continues until you run the clu_upgrade postinstall command.

Note

If you run clu_upgrade status after running installupdate, clu_upgrade displays a message indicating that the install stage is complete. However, the install stage is not really complete until you run the clu_upgrade postinstall command.

The lead member must be in single-user mode to run the installupdate command or the nhd_install command; single-user mode is recommended for the dupatch command. When taking the system to single-user mode, you must first halt the system and then boot it to single-user mode.

When the system is in single-user mode, run the init s, bcheckrc, and lmf reset commands before you run the installupdate, dupatch, or nhd_install commands. See the Tru64 UNIX Installation Guide, the Tru64 UNIX and TruCluster Server Patch Kit Installation Instructions, and the Tru64 UNIX New Hardware Delivery Release Notes and Installation Instructions for information on how to use these commands.
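For example, on the lead member the complete sequence looks like the following (>>> denotes the console prompt; the last three commands are entered after the member reaches single-user mode):

shutdown -h now
>>> boot -fl s
init s
bcheckrc
lmf reset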

Notes

You can run the dupatch command multiple times in order to install multiple patches. Doing so may make isolating problems difficult if any arise after the patch process is completed and the cluster is in use.

During the install stage, you cannot run a dupatch command followed by an installupdate command. To patch the current software before you perform a rolling upgrade, you must perform two complete rolling upgrade operations: one to patch the current software, and one to perform the update installation.

If an NHD installation is part of a rolling upgrade that includes an update installation, you do not have to manually run nhd_install; the installupdate command will install the NHD kit. Otherwise, use the nhd_install command copied by clu_upgrade during the setup stage: /var/adm/update/NHDKit/nhd_install.

7.7.5    Postinstall Stage

Command Where Run Run Level
clu_upgrade postinstall lead member multiuser mode

The postinstall stage verifies that the lead member has completed an update installation, a patch, or an NHD installation. If an update installation was performed, clu_upgrade postinstall verifies that the lead member has rolled to the new version of the base operating system.

7.7.6    Roll Stage

Command Where Run Run Level
clu_upgrade roll member being rolled single-user mode

The lead member was upgraded in the install stage. The remaining members are upgraded in the roll stage.

In many cluster configurations, you can roll multiple members in parallel and shorten the time required to upgrade the cluster. The number of members rolled in parallel is limited only by the requirement that the members not being rolled (plus the quorum disk, if one is configured) have sufficient votes to maintain quorum. Parallel rolls can be performed only after the lead member is rolled.
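For example, on an eight-member cluster with a quorum disk, assuming that each member and the quorum disk contribute one vote each, the arithmetic works out as follows:

Expected votes:  8 member votes + 1 quorum disk vote = 9
Quorum votes:    trunc(9/2) + 1 = 5
Rolling 4 members leaves 4 member votes + 1 quorum disk vote = 5,
so quorum is maintained and 4 members can be rolled in parallel.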

The clu_upgrade roll command performs the following tasks:

Note

If you need to add a member to the cluster during a rolling upgrade, you must add the member from a member that has completed its roll.

If a member goes down (and cannot be repaired and rebooted) before all members have rolled, you must delete the member to complete the roll of the cluster. However, if you have rolled all members but one, and this member goes down before it has rebooted in the roll stage, you must delete this member and then reboot any other member of the cluster. (The clu_upgrade command runs during reboot and tracks the number of members rolled versus the number of members currently in the cluster; clu_upgrade marks the roll stage as completed when the two values are equal. That is why, in the case where you have rolled all members except one, deleting the unrolled member and rebooting another member completes the roll stage and lets you continue the rolling upgrade.)

7.7.7    Switch Stage

Command Where Run Run Level
clu_upgrade switch any member multiuser mode; all members must be up and running [Footnote 13]

The switch stage sets the active version of the software to the new version, which results in turning on any new features that had been deliberately disabled during the rolling upgrade. (See Section 7.9 for a description of active version and new version.)

The clu_upgrade switch command performs the following tasks:

Note

After the switch stage completes, you must reboot each member of the cluster, one at a time.

7.7.8    Clean Stage

Command Where Run Run Level
clu_upgrade clean any member multiuser mode

The clean stage removes the tagged (.Old..) files from the cluster and completes the upgrade.

The clu_upgrade clean command performs the following tasks:

7.8    Tagged Files

A rolling upgrade updates the software on one cluster member at a time. To support two versions of software within the cluster during a roll, clu_upgrade creates a set of tagged files in the setup stage.

A tagged file is a copy of a current file with .Old.. prepended to the copy filename, and an AdvFS property (DEC_VERSION_TAG) set on the copy. For example, the tagged file for the vdump command is named /sbin/.Old..vdump. Because tagged files are created in the same file system as the original files, you must have adequate free disk space before beginning a rolling upgrade.

Whether a member is running on tagged files is controlled by that member's sysconfigtab rolls_ver_lookup variable. The upgrade commands set the value to 1 when a member must run on tagged files, and to 0 when a member must not run on tagged files.
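To check the current setting on a member, you can inspect the attribute directly; a minimal check, assuming the value is recorded in the member's /etc/sysconfigtab file:

grep rolls_ver_lookup /etc/sysconfigtab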

If a member's sysconfigtab rolls_ver_lookup attribute is set to 1, pathname resolution includes determining whether a specified filename has a .Old..filename copy and whether the copy has the DEC_VERSION_TAG property set on it. If both conditions are met, the requested file operation is transparently diverted to use the .Old..filename version of the file. Therefore, if the vdump command is issued on a member that has not rolled, the /sbin/.Old..vdump file is executed; if the command is issued on a member that has rolled, the /sbin/vdump file is executed. The only member that never runs on tagged files is the lead member (the first member to roll).

Note

File system operations on directories are not subject to this tagged-file redirection. For example, an ls of a directory on any cluster member during a rolling upgrade lists both versions of a file. However, the output of an ls -ail command on a member that has not rolled is different from the output on a member that has rolled. In the following examples, the ls -ail command is run first on a member that has not rolled and then on a member that has rolled. (The awk utility is used to print only the inode, size, month and day timestamp, and name of each file.)

The following output from the ls command is taken from a cluster member running with tags before it has rolled. The tagged files are the same as their untagged counterparts (same inode, size, and timestamp). When this member runs the hostname command, it runs the tagged version (inode 3643).

# cd /sbin
# ls -ail hostname .Old..hostname ls .Old..ls init .Old..init |\
awk '{printf("%d\t%d\t%s %s\t%s\n",$1,$6,$7,$8,$10)}'
 
 3643   16416    Aug 24   .Old..hostname
 3648   395600   Aug 24   .Old..init
 3756   624320   Aug 24   .Old..ls
 3643   16416    Aug 24   hostname
 3648   395600   Aug 24   init
 3756   624320   Aug 24   ls
 

The following output from the ls command is taken from a cluster member running without tags after it has rolled. The tagged files now differ from their untagged counterparts (different inode, size, and timestamp). When this member runs the hostname command, it runs the non-tagged version (inode 1370).

# cd /sbin
# ls -ail hostname .Old..hostname ls .Old..ls init .Old..init |\
awk '{printf("%d\t%d\t%s %s\t%s\n",$1,$6,$7,$8,$10)}'
 
 3643   16416    Aug 24   .Old..hostname
 3648   395600   Aug 24   .Old..init
 3756   624320   Aug 24   .Old..ls
 1187   16528    Mar 12   hostname
 1370   429280   Mar 12   init
 1273   792640   Mar 12   ls
 

After you create tagged files in the setup stage, we recommend that you run any administrative command, such as tar, from a member that has rolled. You can always run commands on the lead member because it never runs on tagged files.

The following rules determine which files have tagged files automatically created for them in the setup stage:

The clu_upgrade command provides several tagged command options to manipulate tagged files: check, add, remove, enable, and disable. When dealing with tagged files, take the following into consideration:

7.9    Version Switch

A version switch manages the transition of the active version to the new version of an operating system. The active version is the one that is currently in use. The purpose of a version switch in a cluster is to prevent the introduction of potentially incompatible new features until all members have been updated. For example, if a new version introduces a change to a kernel structure that is incompatible with the current structure, you do not want cluster members to use the new structure until all members have updated to the version that supports it.

At the start of a rolling upgrade, each member's active version is the same as its new version. When a member rolls, its new version is updated. After all members have rolled, the switch stage sets the active version to the new version on all members. At the completion of the upgrade, all members' active versions are again the same as their new versions. The following simple example uses an active version of 1 and a new version of 2 to illustrate the version transitions during a rolling upgrade:

All members at start of roll:   active (1)  = new (1)
Each member after its roll:     active (1) != new (2)
All members after switch stage: active (2)  = new (2)
 

The clu_upgrade command uses the versw command, which is described in versw(8), to manage version transitions. The clu_upgrade command manages all the version switch activity when rolling individual members. In the switch stage, after all members have rolled, the following command completes the transition to the new software:

clu_upgrade switch
 

7.10    Rolling Upgrade and Layered Products

This section discusses the interaction of layered products and rolling upgrades:

7.10.1    General Guidelines

The clu_upgrade setup command prepares a cluster for a rolling upgrade of the operating system. Do not use the setld command to load software onto the cluster between performing the clu_upgrade setup command and rolling the first cluster member to the new version. If you install software between performing the clu_upgrade setup command and rolling a cluster member to the new version, the new files will not have been processed by clu_upgrade setup. As a result, when you roll the first cluster member, these new files will be overwritten.

If you must load software:

7.10.2    Blocking Layered Products

A blocking layered product is a product that prevents the installupdate command from completing. Blocking layered products must be removed from the cluster before starting a rolling upgrade that will include running the installupdate command. You do not have to remove blocking layered products when performing a rolling upgrade solely to patch the cluster or install an NHD kit.

Table 7-6 lists blocking layered products for this release.

Table 7-6:  Blocking Layered Products

Product Code Description
3X0 Open3D
4DT Open3D
ATM Atom Advanced Developers Kit
DCE Distributed Computing Environment
DNA DECnet
DTA Developer's Toolkit (Program Analysis Tools)
DTC Developer's Toolkit (C compiler)
MME Multimedia Services
O3D Open 3D
PRX PanoramiX Advanced Developers Kit

Notes

The three-letter product codes are the first three letters of subset names. For example, a subset named ATMBASExxx is part of the ATM product (Atom Advanced Developers Kit), which is a blocking layered product. However, a subset named OSFATMBINxxx contains the letters ATM, but the subset is not part of a blocking layered product; it is a subset in the OSF product (the base operating system).

When a blocking layered product is removed as part of the rolling upgrade, it is removed for all members. Any services that rely on the blocking product will not be available until the roll completes and the blocking layered product is reinstalled.
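One way to check for a blocking layered product is to list subsets with setld -i (installed subsets are marked in the status column) and match the names against the product codes in Table 7-6. The following sketch uses the DCE product code as an example; the subset name passed to setld -d is a placeholder:

setld -i | grep '^DCE'
setld -d subset_name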

7.11    Rolling Upgrade and RIS

When performing the install stage of a rolling upgrade, you can load the base operating system subsets from a CD-ROM or from a Remote Installation Services (RIS) server.

Note

You can use RIS only to load the base operating system subsets.

To use RIS, you must register both the lead member and the default cluster alias with the RIS server. When registering for operating system software, you must provide a hardware address for each host name. Therefore, you must create a hardware address for the default cluster alias in order to register the alias with the RIS server. (RIS will reject an address that is already in either of the RIS server's /etc/bootptab or /var/adm/ris/clients/risdb files.)

If your cluster uses the cluster alias virtual MAC (vMAC) feature, register that virtual hardware address with the RIS server as the default cluster alias's hardware address. If your cluster does not use the vMAC feature, you can still use the algorithm that is described in the vMAC section of the Cluster Administration manual to manually create a hardware address for the default cluster alias.

A vMAC address consists of a prefix (the default is AA:01) followed by the IP address of the alias in hexadecimal format. For example, the default vMAC address for the default cluster alias deli, whose IP address is 16.140.112.209, is AA:01:10:8C:70:D1. The address is derived in the following manner:

        Default vMAC prefix:       AA:01
        Cluster Alias IP Address:  16.140.112.209
        IP address in hex. format: 10.8C.70.D1
        vMAC for this alias:       AA:01:10:8C:70:D1
 

Another method for creating a hardware address is to append an arbitrary string of eight hexadecimal digits to the default vMAC prefix, AA:01. For example, AA:01:00:00:00:00. Make sure that the address is unique within the area served by the RIS server. If you have more than one cluster, remember to increment the arbitrary hexadecimal string when adding the next alias. (The vMAC algorithm is useful because it creates an address that has a high probability of being unique within your network.)
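The hexadecimal conversion can also be scripted; a minimal sketch for the deli example, assuming a POSIX shell with the printf utility:

printf "AA:01:%02X:%02X:%02X:%02X\n" 16 140 112 209

This prints AA:01:10:8C:70:D1, matching the worked example above.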