A rolling upgrade is a software upgrade of a cluster that is performed while the cluster is in operation. One member at a time is upgraded and returned to operation while the cluster transparently maintains a mixed-version environment for the base operating system, cluster, and Worldwide Language Support (WLS) software. Clients accessing services are not aware that a rolling upgrade is in progress.
A rolling upgrade consists of an ordered series of steps, called stages. The commands that control a rolling upgrade enforce this order.
This first part of the chapter contains instructions for performing a rolling upgrade, for displaying the status of a rolling upgrade, and for undoing one or more stages of a rolling upgrade. Those interested in how a rolling upgrade works can find the details in Section 7.6 and the sections that follow it.
This chapter discusses the following topics:
Tasks, and combinations of tasks, you can perform during a single rolling upgrade (Section 7.1)
Tasks you cannot perform during a rolling upgrade (Section 7.2)
How to perform a rolling upgrade (Section 7.3)
How to display the status of a rolling upgrade (Section 7.4)
How to undo the stages of a rolling upgrade (Section 7.5)
The commands used during a rolling upgrade (Section 7.6)
Rolling upgrade stages (Section 7.7)
Two mechanisms that support rolling upgrades: tagged files (Section 7.8) and version switches (Section 7.9)
Rolling upgrade and layered products (Section 7.10)
Rolling upgrade and RIS (Section 7.11)
Figure 7-1 provides a simplified flow chart of the tasks and stages that are part of a rolling upgrade initiated on a Version 5.1B cluster:
Figure 7-1: Rolling Upgrade Flow Chart
7.1 Rolling Upgrade Supported Tasks
The tasks that you can perform during a rolling upgrade depend on which versions of the base operating system and cluster software are currently running on the cluster. The main focus of this chapter is to describe the behavior of a rolling upgrade that starts on a TruCluster Server Version 5.1B cluster. However, because you may read this chapter in preparation for a rolling upgrade from TruCluster Server Version 5.1A to Version 5.1B, we point out rolling upgrade differences between the two versions.
The following list describes the basic tasks you can perform within a rolling upgrade:
Upgrade the cluster's Tru64 UNIX base operating system and TruCluster Server software. You perform this type of rolling upgrade to upgrade from the installed version to the next version.
When performing a rolling upgrade of the base operating system and cluster software, you can roll only from one version to the next version; you cannot skip versions. (See Table 1-1 for a list of supported upgrade paths.)
Note
A rolling upgrade updates the file systems and disks that the cluster currently uses. The roll does not update the disk or disks that contain the Tru64 UNIX operating system used to create the cluster (the operating system on which you ran clu_create). Although you can boot the original operating system in an emergency when the cluster is down, remember that the differences between the current cluster and the original operating system increase with each cluster update.
Patch the cluster's current versions of the Tru64 UNIX base operating system and TruCluster Server software.
Install a New Hardware Delivery (NHD) kit (the cluster must be running TruCluster Server Version 5.1A or later).
Rolling in a patch kit or an NHD kit uses the same procedure as rolling in a new release of the base operating system and cluster software. The difference is which commands you run during the install stage:
To upgrade the base operating system and cluster software, run installupdate in the install stage.
To roll in a patch kit, run dupatch in the install stage. You can invoke dupatch multiple times in the install stage to roll in multiple patch kits.
If you want to perform a no-roll patch of the cluster, do not run the clu_upgrade command. Instead, run the dupatch command from a cluster member running in multiuser mode.
No-roll patching patches the cluster in a single operation, which applies patches quickly and reduces the number of reboots required. However, completing the operation requires a reboot of the entire cluster, so the cluster is unavailable for a period. For more information, see the Patch Kit Installation Instructions that came with the patch kit you want to install.
To install an NHD kit, run nhd_install in the install stage.
Throughout this chapter, the term rolling upgrade refers to the overall procedure used to roll one or more software kits into a cluster.
As shown in Figure 7-1, you can perform more than one task during a rolling upgrade.
If the cluster is running Version 5.1A or Version 5.1B, a rolling upgrade can include the task combinations listed in Table 7-1:
Table 7-1: Rolling Upgrade Tasks Supported by Version 5.1A and Version 5.1B
Version 5.1A | Version 5.1B |
An update installation from Version 5.1A to Version 5.1B | An update installation from Version 5.1B to the next release |
A patch of Version 5.1A | A patch of Version 5.1B |
The installation of a New Hardware Delivery (NHD) kit onto a Version 5.1A cluster | The installation of an NHD kit onto a Version 5.1B cluster |
An update installation from Version 5.1A to Version 5.1B of the base operating system and cluster software, followed by a patch of Version 5.1B | An update installation from Version 5.1B to the next release of the base operating system and cluster software, followed by a patch of the next release [Footnote 5] |
An NHD installation onto a Version 5.1A cluster followed by a patch of Version 5.1A | An NHD installation onto a Version 5.1B cluster followed by a patch of Version 5.1B |
An update installation from Version 5.1A to Version 5.1B followed by the installation of an NHD kit for Version 5.1B | An update installation from Version 5.1B to the next release of the base operating system and cluster software followed by the installation of an NHD kit for that next release [Footnote 6] |
An update installation from Version 5.1A to Version 5.1B, followed by the installation of an NHD kit for Version 5.1B, followed by a patch of Version 5.1B | An update installation from Version 5.1B to the next release, followed by the installation of an NHD kit for the next release, followed by a patch of the next release [Footnote 6] |
7.2 Rolling Upgrade Unsupported Tasks
The following list describes tasks that you cannot perform or that we recommend you do not attempt during a rolling upgrade:
Do not remove or modify files in the /var/adm/update directory. The files in this directory are critical to the roll. Removing them can cause a rolling upgrade to fail.
During the install stage, you cannot run a dupatch command followed by an installupdate command. To patch the current software before you perform a rolling upgrade, you must perform two complete rolling upgrade operations: one to patch the current software, and one to perform the update installation.
You cannot bypass versions when performing a rolling upgrade of the base operating system and cluster software. You can only roll from one version to the next version. For supported upgrade paths, see Table 1-1.
Do not use the /usr/sbin/setld command to add or delete any of the following subsets:
Base operating system subsets (those with the prefix OSF).
TruCluster Server subsets (those with the prefix TCR).
Worldwide Language Support (WLS) subsets (those with the prefix IOSWW).
Adding or deleting these subsets during a roll creates inconsistencies in the tagged files.
Do not install a layered product during the roll.
Unless a layered product's documentation specifically states that you can install a newer version of the product on the first rolled member, and that the layered product knows what actions to take in a mixed-version cluster, we strongly recommend that you do not install either a new layered product or a new version of a currently installed layered product during a rolling upgrade.
For more information about layered products and rolling upgrades, see Section 7.10.
7.3 Performing a Rolling Upgrade
In the procedure in this section, unless otherwise stated, run commands in multiuser mode. Each step that corresponds to a stage refers to the section that describes that stage in detail. We recommend that you read the detailed description of stages in Section 7.7 before performing the rolling upgrade procedure.
Some stages of a rolling upgrade take longer to complete than others. Table 7-2 lists the approximate time it takes to complete each stage.
Table 7-2: Time Estimates for Rolling Upgrade Stages
Stage | Duration |
Preparation | Not under program control. |
Setup | 45 - 120 minutes. [Footnote 7] |
Preinstall | 15 - 30 minutes. [Footnote 7] |
Install | The same amount of time it takes to run installupdate, dupatch, nhd_install, or a supported combination of these commands on a single system. |
Postinstall | Less than 1 minute. |
Roll (per member) | Patch: less than 5 minutes. Update installation: about the same amount of time it takes to add a member. [Footnote 8] |
Switch | Less than 1 minute. |
Clean | 30 - 90 minutes. [Footnote 7] |
You can use the following procedure to upgrade a TruCluster Server Version 5.1A cluster to Version 5.1B, and to upgrade a cluster that is already at Version 5.1B.
Prepare the cluster for the rolling upgrade (Section 7.7.1):
Choose one cluster member to be the lead member (the first member to roll). (The examples in this procedure use a member whose memberid is 2 as the lead member. The example member's hostname is provolone.)
Back up the cluster.
If you will perform an update installation during the install stage, remove any blocking layered products, listed in Table 7-6, that are installed on the cluster.
To determine whether the cluster is ready for an upgrade, run the clu_upgrade -v check setup lead_memberid command on any cluster member. For example:
# clu_upgrade -v check setup 2
If a file system needs more free space, use AdvFS utilities such as addvol to add volumes to domains as needed. For disk space requirements, see Section 7.7.1. For information on managing AdvFS domains, see the Tru64 UNIX AdvFS Administration manual.
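For example, assuming a spare disk partition named dsk5c (a hypothetical device name), the following command adds a volume to the cluster_usr domain while the domain remains in use:
# addvol /dev/disk/dsk5c cluster_usr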
Verify that each system's firmware will support the new software. Update firmware as needed before starting the rolling upgrade.
Perform the setup stage (Section 7.7.2).
Notes
If your current cluster is at Version 5.1A or later and you plan to upgrade the base operating system and cluster software during the install stage, mount the device or directory that contains the new TruCluster Server kit before running clu_upgrade setup. The setup command will copy the kit to the /var/adm/update/TruClusterKit directory.
If your current cluster is at Version 5.1A or later and you plan to install an NHD kit during the install stage, mount the device or directory that contains the new NHD kit before running clu_upgrade setup. The setup command will copy the kit to the /var/adm/update/NHDKit directory.
On any member, run the clu_upgrade setup lead_memberid command. For example:
# clu_upgrade setup 2
Section 7.7.2 shows the menu displayed by the clu_upgrade command. When the setup stage is completed, clu_upgrade prompts you to reboot all cluster members except the lead member.
One at a time, reboot all cluster members except the lead member. Do not start the preinstall stage until these members are either rebooted or halted.
Perform the preinstall stage (Section 7.7.3).
On the lead member, run the following command:
# clu_upgrade preinstall
If your current cluster is at Version 5.1A or later, the preinstall command gives you the option of verifying or not verifying the existence of the tagged files created during the setup stage. If you have just completed the setup stage and have done nothing to cause the deletion of any of the tagged files, you can skip this test. If you completed the setup stage a while ago and are not sure what to do, let preinstall test the correctness of the tagged files.
Perform the install stage (Section 7.7.4).
Note
During the install stage you load the new software on the lead member, in effect rolling that member. When you perform the roll stage, this new software is propagated to the remaining members of the cluster.
The clu_upgrade command does not load software during the install stage. The loading of software is controlled by the commands you run: installupdate, dupatch, or nhd_install.
See Table 7-1 for the list of rolling upgrade tasks and combinations of tasks supported for Version 5.1A and Version 5.1B.
See the Tru64 UNIX Installation Guide for detailed information on using the installupdate command.
See the Tru64 UNIX and TruCluster Server Patch Kit Installation Instructions that came with your patch kit for detailed information on using the dupatch command.
See the Tru64 UNIX New Hardware Delivery Release Notes and Installation Instructions that came with your NHD kit for detailed information on using the nhd_install command.
If the software you are installing requires that its installation command be run from single-user mode, halt the system and boot the system to single-user mode:
# shutdown -h now
>>> boot -fl s
Note
Halting and booting the system ensures that it provides the minimal set of services to the cluster and that the running cluster has a minimal reliance on the member running in single-user mode. In particular, halting the member satisfies services that require the cluster member to have a status of DOWN before completing a service failover. If you do not first halt the cluster member, services will probably not fail over as expected.
When the system reaches single-user mode, run the following commands:
# init s
# bcheckrc
# lmf reset
Run the installupdate, dupatch, or nhd_install command.
To roll in multiple patch kits, you can invoke dupatch multiple times in a single install stage. Be aware that doing so may make it difficult to isolate problems should any arise after the patch process is completed and the cluster is in use.
You cannot run a dupatch command followed by an installupdate command. To patch the current software before you perform a rolling upgrade, you must perform two complete rolling upgrade operations: one to patch the current software, and one to perform the update installation.
(Optional) After the lead member performs its final reboot with its new custom kernel, you can perform the following manual tests before you roll any additional members:
Verify that the newly rolled lead member can serve the shared root (/) file system. Use the cfsmgr command to determine which cluster member is currently serving the root file system. For example:
# cfsmgr -v -a server /
Domain or filesystem name = /
Server Name = polishham
Server Status : OK
Relocate the root (/) file system to the lead member. For example:
# cfsmgr -h polishham -r -a SERVER=provolone /
Verify that the lead member can serve applications to clients. Make sure that the lead member can serve all important applications that the cluster makes available to its clients.
You decide how and what to test. We suggest that you thoroughly exercise critical applications and satisfy yourself that the lead member can serve these applications to clients before continuing the roll. For example:
Manually relocate CAA services to the lead member. For example, to relocate the application resource named cluster_lockd to lead member provolone:
# caa_relocate cluster_lockd -c provolone
Temporarily modify the default cluster alias selection priority attribute, selp, to force the lead member to serve all client requests directed to that alias. For example:
# cluamgr -a alias=DEFAULTALIAS,selp=100
The lead member is now the end recipient for all connection requests and packets addressed to the default cluster alias.
From another member or from an outside client, use services such as telnet and ftp to verify that the lead member can handle alias traffic. Test client access to all important services that the cluster provides.
When you are satisfied, reset the alias attributes on the lead member to their original values.
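For example, if the alias's original selection priority on this member was 1 (a hypothetical value; note the member's actual alias settings before changing them), the following command restores it:
# cluamgr -a alias=DEFAULTALIAS,selp=1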
Perform the postinstall stage (Section 7.7.5).
On the lead member, run:
# clu_upgrade postinstall
Perform the roll stage (Section 7.7.6).
Roll the members of the cluster that have not already rolled. [Footnote 9]
You can roll multiple members simultaneously (parallel roll), subject to the restriction that the number of members not being rolled (plus the quorum disk, if one is configured) is sufficient to maintain cluster quorum.
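Before beginning a parallel roll, you can display the cluster's current vote configuration. For example, running the clu_quorum command with no options lists each member's votes, the quorum disk (if one is configured), and the expected votes:
# clu_quorum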
To roll a member, do the following:
Halt the member system and boot it to single-user mode. For example:
# shutdown -h now
>>> boot -fl s
When the system reaches single-user mode, run the following commands:
# init s
# bcheckrc
# lmf reset
Roll the member:
# clu_upgrade roll
If you are performing parallel rolls, use the -f option with the clu_upgrade roll command. This option causes the member to automatically reboot without first prompting for permission:
# clu_upgrade -f roll
The roll command verifies that rolling the member will not result in a loss of quorum. If a loss of quorum would result, the member is not rolled and an error message is displayed. You can roll the member later, after one of the currently rolling members has rejoined the cluster and its quorum vote is available.
If the roll proceeds, the member is prepared for a reboot. If you used the -f option, no prompt is displayed; the reboot occurs automatically. If you did not use the -f option, clu_upgrade displays a prompt that asks whether you want to reboot at this time. Unless you want to examine something specific before you reboot, enter yes. (If you enter yes, it may take approximately half a minute before the actual reboot occurs.)
Perform parallel rolls to minimize the time needed to complete the roll stage. For example, on an eight-member cluster with a quorum disk, after rolling the lead member, you can roll four members in parallel.
Begin the roll stage on a member. (The lead member was rolled during the install stage. You do not perform the roll stage on the lead member.)
When you see a message similar to the following, begin the roll stage on the next member:
*** Info *** You may now begin the roll of another cluster member.
If you see a message that begins like the following, the probable cause is the number of currently rolling members that contribute member votes:
*** Info *** The current quorum conditions indicate that beginning a roll of another member at this time may result in the loss of quorum.
In this case, you have the following options:
You can wait until a member completes the roll stage before you begin to roll the next member.
If there is an unrolled member that does not contribute member votes, you can begin the roll stage on it.
Continue to roll members until all members of the cluster have rolled. Before starting each roll stage, wait until you see the message that it is all right to do so.
When you roll the last member, you will see a message similar to the following:
*** Info *** This is the last member requiring a roll.
Note
The roll actually takes place during the reboot. The clu_upgrade roll command sets up the scripts that will be run during the reboot. When you reboot, the it(8) scripts roll the member, build a customized kernel, and then reboot again so that the member will be running on its new customized kernel. When the member boots its new customized kernel, it has completed its roll and is no longer running on tagged files.
Perform the switch stage (Section 7.7.7).
After all members have rolled, run the switch command on any member:
# clu_upgrade switch
One at a time, reboot each member of the cluster.
Perform the clean stage (Section 7.7.8).
Run the following command on any member to remove the tagged (.Old..) files from the cluster and complete the upgrade:
# clu_upgrade clean
7.4 Displaying the Status of a Rolling Upgrade
The clu_upgrade command provides the following options for displaying the status of a rolling upgrade. You can run status commands at any time.
To display the overall status of a rolling upgrade: clu_upgrade -v or clu_upgrade -v status.
To determine whether you can run a stage: clu_upgrade check [stage]. If you do not specify a stage, clu_upgrade tests whether the next stage can be run.
To determine whether a stage has started or completed: clu_upgrade started stage or clu_upgrade completed stage.
To determine whether a member has rolled: clu_upgrade check roll memberid.
To verify whether tagged files have been created for a layered product: clu_upgrade tagged check [prod_code [prod_code ...]]. If you do not specify a product code, clu_upgrade inspects all tagged files in the cluster.
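For example, the following commands (the member ID is hypothetical) display overall status and then check whether member 3 has rolled:
# clu_upgrade -v status
# clu_upgrade check roll 3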
Notes
During a roll, there might be two versions of the clu_upgrade command in the cluster: an older version used by members that have not yet rolled, and a newer version (if one is included in the update distribution or patch kit). The information that is displayed by the status command might differ depending on whether the command is run on a member that has rolled. Therefore, if you run the status command on two members, do not be surprised if the format of the displayed output is not the same.
If you run clu_upgrade status after running installupdate, clu_upgrade will display a message indicating that the install stage is complete. However, the install stage is not really complete until you run the clu_upgrade postinstall command.
7.5 Undoing the Stages of a Rolling Upgrade
The clu_upgrade undo command provides the ability to undo a rolling upgrade that has not completed the switch stage. You can undo any stage except the switch stage and the clean stage. You must undo stages in order; for example, if you decide to undo a rolling upgrade after completing the preinstall stage, you undo the preinstall stage and then undo the setup stage.
Note
Before undoing any stage, we recommend that you read the relevant version of the Cluster Release Notes to determine whether there are restrictions related to the undoing of any stage.
To undo a stage, use the undo command with the stage that you want to undo. The clu_upgrade command determines whether the specified stage is a valid stage to undo. Table 7-3 outlines the requirements for undoing a stage:
Table 7-3: Undoing a Stage
Stage to Undo | Command | Comments |
Setup | clu_upgrade undo setup | You must run this command on the lead member. In addition, no members can be running on tagged files when you undo the setup stage. Before you undo the setup stage, use the clu_upgrade tagged disable command to disable tagged files on any member that is running on them. When no members are running on tagged files, run the clu_upgrade undo setup command on the lead member. |
Preinstall | clu_upgrade undo preinstall | You must run this command on the lead member. |
Install | clu_upgrade undo install | You can run this command on any member except the lead member. Halt the lead member. Then run the clu_upgrade undo install command. |
Postinstall | clu_upgrade undo postinstall | You must run this command on the lead member. |
Roll | clu_upgrade undo roll memberid | You can run this command on any member except the member whose roll stage will be undone. Halt the member whose roll stage is being undone. Then run the clu_upgrade undo roll memberid command. |
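For example, to undo the install stage, halt the lead member and then run the following command on any other member:
# clu_upgrade undo install
To back out further, undo the preinstall stage and then the setup stage, in that order.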
7.6 Rolling Upgrade Commands
The clu_upgrade command, described in clu_upgrade(8), controls a rolling upgrade. During the install stage, you run installupdate, dupatch, or nhd_install to load and install software. These commands are rolling upgrade aware; they are modified to understand which actions they are allowed to take during the install and roll stages of a rolling upgrade.
When you start a rolling upgrade, the cluster is running the software from the previous release. For the first part of any rolling upgrade, you are running the clu_upgrade command that is already installed on the cluster. If a new version is installed during the rolling upgrade, there may be minor differences in the on-screen display and behavior between the two versions of the command.
The following two tables show the stages at which new versions of the upgrade commands, if shipped with the kits being installed, become available during a rolling upgrade: [Footnote 10]
Table 7-4 maps commands to stages for a rolling upgrade from Version 5.1A to Version 5.1B, a patch kit, or an NHD kit; or to Version 5.1B of the base operating system and cluster software followed by a patch of the new software within the same rolling upgrade.
Table 7-5 maps commands to stages for a rolling upgrade from Version 5.1B to the next release of the operating system and cluster software, a Version 5.1B patch kit, or an NHD kit; or to the next release of the base operating system and cluster software followed by a patch of the new software within the same rolling upgrade.
Table 7-4: Stages and clu_upgrade Versions When Performing a Rolling Upgrade from Version 5.1A
Stage | Version 5.1A | Next Release [Footnote 11] | Comments |
Preparation | X | | The currently installed (old) version of clu_upgrade is always run in this stage. |
Setup | X | | The currently installed (old) version of clu_upgrade runs this stage. If performing an update installation, the new version of the clu_upgrade command is installed during this stage. |
Preinstall | | X | If the rolling upgrade includes an update installation, all members use the new version of clu_upgrade installed during the setup stage. (Otherwise, members continue to run the current version of clu_upgrade.) |
Install | | X | If the rolling upgrade includes an update installation, all members use the version of clu_upgrade installed during the setup stage. During the update installation, a new version of clu_upgrade is installed. A patch kit always installs the latest version of clu_upgrade. If performing a patch, and if the patch kit includes a new version of clu_upgrade, that version is used in the remaining stages. |
Postinstall | | X | If a new version of clu_upgrade was installed in either the setup stage or the install stage, all members use the new version. |
Roll | | X | If a new version of clu_upgrade was installed in either the setup stage or the install stage, all members use the new version. |
Switch | | X | If a new version of clu_upgrade was installed in either the setup stage or the install stage, all members use the new version. |
Clean | | X | If a new version of clu_upgrade was installed in either the setup stage or the install stage, all members use the new version. |
Table 7-5: Stages and clu_upgrade Versions When Performing a Rolling Upgrade from Version 5.1B
Stage | Version 5.1B | Next Release [Footnote 12] | Comments |
Preparation | X | | The currently installed (old) version of clu_upgrade is always run in this stage. |
Setup | X | | The currently installed (old) version of clu_upgrade runs this stage. If performing an update installation, the new version of the clu_upgrade command is installed during this stage. |
Preinstall | | X | If the rolling upgrade includes an update installation, all members use the new version of clu_upgrade installed during the setup stage. (Otherwise, members continue to run the current version of clu_upgrade.) |
Install | | X | If the rolling upgrade includes an update installation, all members use the version of clu_upgrade installed during the setup stage. During the update installation, a new version of clu_upgrade is installed. A patch kit always installs the latest version of clu_upgrade. If performing a patch, and if the patch kit includes a new version of clu_upgrade, that version is used in the remaining stages. |
Postinstall | | X | If a new version of clu_upgrade was installed in either the setup stage or the install stage, all members use the new version. |
Roll | | X | If a new version of clu_upgrade was installed in either the setup stage or the install stage, all members use the new version. |
Switch | | X | If a new version of clu_upgrade was installed in either the setup stage or the install stage, all members use the new version. |
Clean | | X | If a new version of clu_upgrade was installed in either the setup stage or the install stage, all members use the new version. |
7.7 Rolling Upgrade Stages
The following sections describe each of the rolling upgrade stages.
Note
These sections only describe the stages. Use the procedure in Section 7.3 to perform a rolling upgrade.
Preparation stage (Section 7.7.1)
Setup stage (Section 7.7.2)
Preinstall stage (Section 7.7.3)
Install stage (Section 7.7.4)
Postinstall stage (Section 7.7.5)
Roll stage (Section 7.7.6)
Switch stage (Section 7.7.7)
Clean stage (Section 7.7.8)
7.7.1 Preparation Stage
Command | Where Run | Run Level |
clu_upgrade -v check setup lead_memberid | any member | multiuser mode |
During the preparation stage, you back up all important cluster data and verify that the cluster is ready for a roll. Before beginning a rolling upgrade, do the following:
Choose one member of the cluster as the first member to roll. This member, known as the lead member, must have direct access to the root (/), /usr, /var, and, if used, i18n file systems.
Make sure that the lead member can run any critical applications. You can test these applications after you update this member during the install stage, but before you roll any other members. If a problem occurs, you can try to resolve it on this member before you continue. If you cannot resolve a problem, you can undo the rolling upgrade and return the cluster to its pre-roll state. (Section 7.5 describes how to undo rolling upgrade stages.)
Back up the clusterwide root (/), /usr, and /var file systems, including all member-specific files in these file systems. If the cluster has a separate i18n file system, back up that file system. In addition, back up any other file systems that contain critical user or application data.
Note
If you perform an incremental or full backup of the cluster during a rolling upgrade, make sure to perform the backup on a member that is not running on tagged files. If you back up from a member that is using tagged files, you will only back up the contents of the .Old.. files. Because the lead member never uses tagged files, you can back up the cluster from the lead member (or any other member that has rolled) during a rolling upgrade.
Most sites have automated backup procedures. If you know that an automatic backup will take place while the cluster is in the middle of a rolling upgrade, make sure that backups are done on the lead member or on a member that has rolled.
If you plan to run the installupdate command in the install stage, remove any blocking layered products listed in Table 7-6 that are installed on the cluster.
Run the clu_upgrade -v check setup lead_memberid command, which verifies the following information:
No rolling upgrade is in progress.
All members are running the same versions of the base operating system and cluster software.
No members are running on tagged files.
There is adequate free disk space.
Verify that each system's firmware will support the new software. Update firmware as needed before starting the rolling upgrade.
A cluster can continue to operate during a rolling upgrade because two copies exist of the operating system and cluster software files. (Only one copy exists of shared configuration files so that changes made by any member are visible to all members.) This approach makes it possible to run two different versions of the base operating system and the cluster software at the same time in the same cluster. The trade-off is that, before you start an upgrade, you must make sure that there is adequate free space in each of the clusterwide root (/), /usr, and /var file systems, and, if a separate domain exists for the Worldwide Language Support (WLS) subsets, in the i18n file system.
A rolling upgrade has the following disk space requirements:
At least 50 percent free space in root (/), cluster_root#root.
At least 50 percent free space in /usr, cluster_usr#usr.
At least 50 percent free space in /var, cluster_var#var, plus, if updating the operating system, an additional 425 MB to hold the subsets for the new version of the base operating system.
If a separate i18n domain exists for the WLS subsets, at least 50 percent free space in that domain.
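You can compare each file system's current free space against these requirements with the df command; for example:
# df / /usr /var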
No tagged files are placed on member boot partitions. However, programs might need free space when moving kernels to boot partitions. We recommend that you reserve at least 50 MB free space on each member's boot partition.
Note
You cannot use the addvol command to add volumes to a member's root domain (the a partition on the member's boot disk). Instead, you must delete the member from the cluster, use diskconfig or SysMan to configure the disk appropriately, and then add the member back into the cluster.
If installing a patch kit, see the Patch Kit Installation Instructions that came with your patch kit to find the amount of space you will need to install that kit. If installing an NHD kit, see the New Hardware Delivery Release Notes and Installation Instructions that came with your NHD kit to find the amount of space you will need to install that kit.
If a file system needs more free space, use AdvFS utilities such as addvol to add volumes to domains as needed. For information on managing AdvFS domains, see the Tru64 UNIX AdvFS Administration manual. (The AdvFS Utilities require a separate license.) You can also expand the clusterwide root (/) domain.
Note
The clu_upgrade command verifies whether sufficient space exists at the start of a rolling upgrade. However, nothing prevents a cluster member from consuming disk space during a rolling upgrade, thus creating a situation where a later stage might not have enough disk space.
Disk space is dynamic. If you know that a member will be consuming disk space during a rolling upgrade, add additional space before you start the upgrade.
7.7.2 Setup Stage
Command | Where Run | Run Level |
clu_upgrade setup lead_memberid | any member | multiuser mode |
The setup stage runs the clu_upgrade check setup command, creates tagged files, and prepares the cluster for the roll. The clu_upgrade setup lead_memberid command performs the following tasks:
Creates the rolling upgrade log file, /cluster/admin/clu_upgrade.log. (Section D.3 contains a sample clu_upgrade.log file.)
Performs the -v check setup tests listed in Section 7.7.1.
Prompts you to indicate whether to perform an update installation, install a patch kit, install an NHD kit, or a combination thereof.
The following example shows the menu displayed by the TruCluster Server Version 5.1B clu_upgrade command:
What type of rolling upgrade will be performed?

Selection   Type of Upgrade
---------------------------------------------------------------
    1       An upgrade using the installupdate command
    2       A patch using the dupatch command
    3       A new hardware delivery using the nhd_install command
    4       All of the above
    5       None of the above
    6       Help
    7       Display all options again
---------------------------------------------------------------
Enter your Choices (for example, 1 2 2-3):
Depending on the tasks you specify, copies the relevant kits onto disk:
If performing an update installation, copies the cluster kit to /var/adm/update/TruClusterKit so that the kit will be available to the installupdate command during the install stage. (The installupdate command copies the operating system kit to /var/adm/update/OSKit during the install stage.) The clu_upgrade command prompts for the absolute pathname for the TruCluster Server kit location. On a TruCluster Server Version 5.1B cluster, when performing a rolling upgrade that includes an update installation, remember to mount the TruCluster Server kit before running the clu_upgrade setup command.
On a TruCluster Server Version 5.1B cluster, if performing an NHD installation, uses the nhd_install command to copy the NHD kit to /var/adm/update/NHDKit.
Caution
The files in /var/adm/update are critical to the roll process. Do not remove or modify files in this directory. Doing so can cause a rolling upgrade to fail.
Creates the mandatory set of tagged files for the OSF (base), TCR (cluster), and IOS (Worldwide Language Support) products.
Caution
If, for any reason, during an upgrade you need to create tagged files for a layered product, see Section 7.8.
Sets the sysconfigtab variable rolls_ver_lookup=1 on all members except the lead member. When rolls_ver_lookup=1, a member uses tagged files. As a result, the lead member can upgrade while the remaining members run on the .Old.. files from the current release. (A quick way to check this setting is shown after this list.)
Prompts you to reboot all cluster members except the lead member. When the setup command completes, reboot these members one at a time so that the cluster can maintain quorum. This reboot is required for each member that will use tagged files in the mixed-version cluster. When the reboots complete, all members except the lead member are running on tagged files.
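As a quick check on a rebooted member (a sketch; the attribute is recorded in each member's sysconfigtab file), you can confirm that tagged files are enabled:
# grep rolls_ver_lookup /etc/sysconfigtab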
7.7.3 Preinstall Stage
Command | Where Run | Run Level |
clu_upgrade preinstall | lead member | multiuser mode |
The purpose of the preinstall stage is to verify that the cluster is ready for the lead member to run one or more of the installupdate, dupatch, or nhd_install commands. The clu_upgrade preinstall command performs the following tasks:
Verifies that the command is being run on the lead member, that the lead member is not running on tagged files, and that any other cluster members that are up are running on tagged files.
(Optional) Verifies that tagged files are present, that they match their product's inventory files, and that each tagged file's AdvFS property is set correctly. (This process can take a while, but not as long as it does to create the tagged files in the setup stage. Table 7-2 provides time estimates for each stage.)
Makes on-disk backup copies of the lead member's member-specific files.
7.7.4 Install Stage
Command | Where Run | Run Level |
installupdate | lead member | single-user mode |
dupatch | lead member | single-user or multiuser mode |
nhd_install | lead member | single-user mode |
If your current cluster is running TruCluster Server Version 5.1B or Version 5.1A, you can perform one of the tasks or combinations of tasks listed in Table 7-1.
The install stage starts when the clu_upgrade preinstall command completes, and continues until you run the clu_upgrade postinstall command.
Note
If you run clu_upgrade status after running installupdate, clu_upgrade displays a message indicating that the install stage is complete. However, the install stage is not really complete until you run the clu_upgrade postinstall command.
The lead member must be in single-user mode to run the installupdate command or the nhd_install command; single-user mode is recommended for the dupatch command. When taking the system to single-user mode, you must halt the system and then boot it to single-user mode. When the system is in single-user mode, run the init s, bcheckrc, and lmf reset commands before you run the installupdate, dupatch, or nhd_install commands.
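For example, the complete sequence for taking the lead member to single-user mode looks like this; afterward, run installupdate, dupatch, or nhd_install as appropriate:
# shutdown -h now
>>> boot -fl s
# init s
# bcheckrc
# lmf reset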
See the Tru64 UNIX Installation Guide, the Tru64 UNIX and TruCluster Server Patch Kit Installation Instructions, and the Tru64 UNIX New Hardware Delivery Release Notes and Installation Instructions for information on how to use these commands.
Notes
You can run the dupatch command multiple times in order to install multiple patches. Doing so may make isolating problems difficult if any arise after the patch process is completed and the cluster is in use.
During the install stage, you cannot run a dupatch command followed by an installupdate command. To patch the current software before you perform a rolling upgrade, you must perform two complete rolling upgrade operations: one to patch the current software, and one to perform the update installation.
If an NHD installation is part of a rolling upgrade that includes an update installation, you do not have to manually run nhd_install; the installupdate command will install the NHD kit. Otherwise, use the nhd_install command copied by clu_upgrade during the setup stage: /var/adm/update/NHDKit/nhd_install.
7.7.5 Postinstall Stage
Command | Where Run | Run Level |
clu_upgrade postinstall | lead member | multiuser mode |
The postinstall stage verifies that the lead member has completed an update installation, a patch, or an NHD installation. If an update installation was performed, clu_upgrade postinstall verifies that the lead member has rolled to the new version of the base operating system.
7.7.6 Roll Stage
Command | Where Run | Run Level |
clu_upgrade roll | member being rolled | single-user mode |
The lead member was upgraded in the install stage. The remaining members are upgraded in the roll stage.
In many cluster configurations, you can roll multiple members in parallel and shorten the time required to upgrade the cluster. The number of members rolled in parallel is limited only by the requirement that the members not being rolled (plus the quorum disk, if one is configured) have sufficient votes to maintain quorum. Parallel rolls can be performed only after the lead member is rolled.
The clu_upgrade roll command performs the following tasks:
Verifies that the member is not the lead member, that the member has not already been rolled, and that the member is in single-user mode. Verifies that rolling the member will not result in a loss of quorum.
Backs up the member's member-specific files.
Sets up the it(8) scripts that will be run during the reboot.
Reboots the member. During this boot, the it scripts roll the member, build a customized kernel, and reboot with the customized kernel.
Note
If you need to add a member to the cluster during a rolling upgrade, you must add the member from a member that has completed its roll.
If a member goes down (and cannot be repaired and rebooted) before all members have rolled, you must delete the member to complete the roll of the cluster. However, if you have rolled all members but one, and this member goes down before it has rebooted in the roll stage, you must delete this member and then reboot any other member of the cluster. (The clu_upgrade command runs during reboot and tracks the number of members rolled versus the number of members currently in the cluster; clu_upgrade marks the roll stage as completed when the two values are equal. That is why, in the case where you have rolled all members except one, deleting the unrolled member and rebooting another member completes the roll stage and lets you continue the rolling upgrade.)
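For example, a minimal sketch of deleting a down member whose member ID is 3 (a hypothetical ID) with the clu_delete_member command:
# clu_delete_member -m 3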
7.7.7 Switch Stage
Command | Where Run | Run Level |
clu_upgrade switch | any member | multiuser mode. All members must be up and running. [Footnote 13] |
The switch stage sets the active version of the software to the new version, which results in turning on any new features that had been deliberately disabled during the rolling upgrade. (See Section 7.9 for a description of active version and new version.)
The clu_upgrade switch command performs the following tasks:
Verifies that all members have rolled, that all members are running the same versions of the base operating system and cluster software, and that no members are running on tagged files.
Sets the new version ID in each member's sysconfigtab file and running kernel.
Sets the active version to the new version for all cluster members.
Note
After the switch stage completes, you must reboot each member of the cluster, one at a time.
7.7.8 Clean Stage
Command | Where Run | Run Level |
clu_upgrade clean | any member | multiuser mode |
The clean stage removes the tagged (.Old..) files from the cluster and completes the upgrade. The clu_upgrade clean command performs the following tasks:
Verifies that the switch stage has completed, that all members are running the same versions of the base operating system and cluster software, and that no members are running on tagged files.
Removes all .Old.. files.
Removes any on-disk backup archives that clu_upgrade created.
Recursively deletes the /var/adm/update/TruClusterKit, /var/adm/update/OSKit, and /var/adm/update/NHDKit directories, if they exist.
If an update installation was performed, gives you the option of running the Update Administration Utility (updadmin) to manage the files that were saved during the update installation.
Creates an archive directory for this upgrade, /cluster/admin/clu_upgrade/history/release_version, and moves the clu_upgrade.log file to the archive directory.
7.8 Tagged Files
A rolling upgrade updates the software on one cluster member at a time. To support two versions of software within the cluster during a roll, clu_upgrade creates a set of tagged files in the setup stage.
A tagged file is a copy of a current file with .Old.. prepended to the copy's filename, and an AdvFS property (DEC_VERSION_TAG) set on the copy. For example, the tagged file for the vdump command is named /sbin/.Old..vdump. Because tagged files are created in the same file system as the original files, you must have adequate free disk space before beginning a rolling upgrade.
Whether a member is running on tagged files is controlled by that member's sysconfigtab rolls_ver_lookup variable. The upgrade commands set the value to 1 when a member must run on tagged files, and to 0 when a member must not run on tagged files.
If a member's sysconfigtab rolls_ver_lookup attribute is set to 1, pathname resolution includes determining whether a specified filename has a .Old..filename copy and whether the copy has the DEC_VERSION_TAG property set on it. If both conditions are met, the requested file operation is transparently diverted to use the .Old..filename version of the file. Therefore, if the vdump command is issued on a member that has not rolled, the /sbin/.Old..vdump file is executed; if the command is issued on a member that has rolled, the /sbin/vdump file is executed. The only member that never runs on tagged files is the lead member (the first member to roll).
Note
File system operations on directories are not bound by this tagged file restraint. For example, an ls of a directory on any cluster member during a rolling upgrade lists both versions of a file. However, the output of an ls -ail command on a member that has not rolled is different from the output on a member that has rolled. In the following examples the ls -ail command is run first on a member that has not rolled and then on a member that has rolled. (The awk utility is used to print only the inode, size, month and day timestamp, and name of each file.)
The following output from the ls command is taken from a cluster member running with tags before it has rolled. The tagged files are the same as their untagged counterparts (same inode, size, and timestamp). When this member runs the hostname command, it runs the tagged version (inode 3643).
# cd /sbin
# ls -ail hostname .Old..hostname ls .Old..ls init .Old..init |\
awk '{printf("%d\t%d\t%s %s\t%s\n",$1,$6,$7,$8,$10)}'
3643 16416 Aug 24 .Old..hostname
3648 395600 Aug 24 .Old..init
3756 624320 Aug 24 .Old..ls
3643 16416 Aug 24 hostname
3648 395600 Aug 24 init
3756 624320 Aug 24 ls
The following output from the ls command is taken from a cluster member running without tags after it has rolled. The tagged files now differ from their untagged counterparts (different inode, size, and timestamp). When this member runs the hostname command, it runs the non-tagged version (inode 1370).
# cd /sbin
# ls -ail hostname .Old..hostname ls .Old..ls init .Old..init |\
awk '{printf("%d\t%d\t%s %s\t%s\n",$1,$6,$7,$8,$10)}'
3643 16416 Aug 24 .Old..hostname
3648 395600 Aug 24 .Old..init
3756 624320 Aug 24 .Old..ls
1187 16528 Mar 12 hostname
1370 429280 Mar 12 init
1273 792640 Mar 12 ls
After you create tagged files in the setup stage, we recommend that you run any administrative command, such as tar, from a member that has rolled. You can always run commands on the lead member because it never runs on tagged files.
The following rules determine which files have tagged files automatically created for them in the setup stage:
Tagged files are created for inventory files for the following product codes: base operating system (OSF), TruCluster Server (TCR), and Worldwide Language Support (IOS). (The subsets for each product use that product's three-letter product code as a prefix for each subset name. For example, TruCluster Server subset names start with the TruCluster Server three-letter product code: TCRBASE540, TCRMAN540, and TCRMIGRATE540.)
By default, files that are associated with other layered products do not have tagged files created for them. Tagged files are created only for layered products that have been modified to support tagged files during a rolling upgrade.
Caution
Unless a layered product's documentation specifically states that you can install a newer version of the product on the first rolled member, and that the layered product knows what actions to take in a mixed-version cluster, we strongly recommend that you do not install either a new layered product or a new version of a currently installed layered product during a rolling upgrade.
The clu_upgrade command provides several tagged command options to manipulate tagged files: check, add, remove, enable, and disable. When dealing with tagged files, take the following into consideration:
During a normal rolling upgrade you do not have to manually add or remove tagged files. The clu_upgrade command calls the tagged commands as needed to control the creation and removal of tagged files.
If you run a clu_upgrade tagged command, run the check, add, and remove commands on a member that is not running on tagged files; for example, the lead member. You can run the disable and enable commands on any member.
The target for a check, add, or remove tagged file operation is a product code that represents an entire product. The clu_upgrade tagged commands operate on all inventory files for the specified product or products. For example, the following command verifies the correctness of all the tagged files created for the TCR kernel layered product (the TruCluster Server subsets):
# clu_upgrade tagged check TCR
If you inadvertently remove a .Old.. copy of a file, you must create tagged files for the entire layered product to re-create that one file. For example, the vdump command is in the OSFADVFSxxx subset, which is part of the OSF product. If you mistakenly remove /sbin/.Old..vdump, run the following command to re-create tagged files for the entire layered product:
# clu_upgrade tagged add OSF
The enable and disable commands enable or disable the use of tagged files by a cluster member. You do not have to use enable or disable during a normal rolling upgrade.
The disable command is useful if you have to undo the setup stage. Because no members can be running with tagged files when undoing the setup stage, you can use the disable command to disable tagged files on any cluster member that is currently running on tagged files. For example, to disable tagged files for a member whose ID is 3:
# clu_upgrade tagged disable 3
The enable command is provided in case you make a mistake with the disable command.
7.9 Version Switches
A version switch manages the transition of the active version to the new version of an operating system. The active version is the one that is currently in use. The purpose of a version switch in a cluster is to prevent the introduction of potentially incompatible new features until all members have been updated. For example, if a new version introduces a change to a kernel structure that is incompatible with the current structure, you do not want cluster members to use the new structure until all members have updated to the version that supports it.
At the start of a rolling upgrade, each member's active version is the same as its new version. When a member rolls, its new version is updated. After all members have rolled, the switch stage sets the active version to the new version on all members. At the completion of the upgrade, all members' active versions are again the same as their new versions.
The following simple example uses an active version of 1 and a new version of 2 to illustrate the version transitions during a rolling upgrade:
All members at start of roll:    active (1) = new (1)
Each member after its roll:      active (1) != new (2)
All members after switch stage:  active (2) = new (2)
The clu_upgrade command uses the versw command, which is described in versw(8), to manage the version switch; clu_upgrade manages all the version switch activity when rolling individual members. In the switch stage, after all members have rolled, the following command completes the transition to the new software:
# clu_upgrade switch
7.10 Rolling Upgrade and Layered Products
This section discusses the interaction of layered products and rolling upgrades:
General guidelines (Section 7.10.1)
Blocking layered products (Section 7.10.2)
7.10.1 General Guidelines
The clu_upgrade setup command prepares a cluster for a rolling upgrade of the operating system. Do not use the setld command to load software onto the cluster between performing the clu_upgrade setup command and rolling the first cluster member to the new version. If you install software between performing the clu_upgrade setup command and rolling a cluster member to the new version, the new files will not have been processed by clu_upgrade setup. As a result, when you roll the first cluster member, these new files will be overwritten.
If you must load software:
Wait until at least one member has rolled.
Install the software on a member that has rolled.
7.10.2 Blocking Layered Products
A blocking layered product is a product that prevents the installupdate command from completing. Blocking layered products must be removed from the cluster before starting a rolling upgrade that will include running the installupdate command. You do not have to remove blocking layered products when performing a rolling upgrade solely to patch the cluster or install an NHD kit. Table 7-6 lists blocking layered products for this release.
Table 7-6: Blocking Layered Products
Product Code | Description |
3X0 | Open3D |
4DT | Open3D |
ATM | Atom Advanced Developers Kit |
DCE | Distributed Computing Environment |
DNA | DECnet |
DTA | Developer's Toolkit (Program Analysis Tools) |
DTC | Developer's Toolkit (C compiler) |
MME | Multimedia Services |
O3D | Open 3D |
PRX | PanoramiX Advanced Developers Kit |
Notes
The three-letter product codes are the first three letters of subset names. For example, a subset named ATMBASExxx is part of the ATM product (Atom Advanced Developers Kit), which is a blocking layered product. However, a subset named OSFATMBINxxx contains the letters ATM, but the subset is not part of a blocking layered product; it is a subset in the OSF product (the base operating system).
When a blocking layered product is removed as part of the rolling upgrade, it is removed for all members. Any services that rely on the blocking product will not be available until the roll completes and the blocking layered product is reinstalled.
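For example, the following sketch lists any installed subsets whose names begin with a blocking product code (it assumes the status field is the second column of setld -i output):
# setld -i | awk '$2 == "installed"' | grep -E '^(3X0|4DT|ATM|DCE|DNA|DTA|DTC|MME|O3D|PRX)'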
7.11 Rolling Upgrade and RIS
When performing the install stage of a rolling upgrade, you can load the base operating system subsets from a CD-ROM or from a Remote Installation Services (RIS) server.
Note
You can use RIS only to load the base operating system subsets.
To use RIS, you must register both the lead member and the default cluster alias with the RIS server. When registering for operating system software, you must provide a hardware address for each host name. Therefore, you must create a hardware address for the default cluster alias in order to register the alias with the RIS server. (RIS will reject an address that is already in either of the RIS server's /etc/bootptab or /var/adm/ris/clients/risdb files.)
If your cluster uses the cluster alias virtual MAC (vMAC) feature, register that virtual hardware address with the RIS server as the default cluster alias's hardware address. If your cluster does not use the vMAC feature, you can still use the algorithm that is described in the vMAC section of the Cluster Administration manual to manually create a hardware address for the default cluster alias.
A vMAC address consists of a prefix (the default is AA:01) followed by the IP address of the alias in hexadecimal format. For example, the default vMAC address for the default cluster alias deli, whose IP address is 16.140.112.209, is AA:01:10:8C:70:D1. The address is derived in the following manner:
Default vMAC prefix:        AA:01
Cluster alias IP address:   16.140.112.209
IP address in hex format:   10.8C.70.D1
vMAC for this alias:        AA:01:10:8C:70:D1
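The conversion is simple enough to script. The following sketch (POSIX shell, with the alias IP address hard-coded for illustration) prints the vMAC for the example alias:
# IP=16.140.112.209
# printf "AA:01:%02X:%02X:%02X:%02X\n" $(echo $IP | tr '.' ' ')
AA:01:10:8C:70:D1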
Another method for creating a hardware address is to append an arbitrary string of eight hexadecimal digits to the default vMAC prefix, AA:01; for example, AA:01:00:00:00:00. Make sure that the address is unique within the area served by the RIS server. If you have more than one cluster, remember to increment the arbitrary hexadecimal string when adding the next alias. (The vMAC algorithm is useful because it creates an address that has a high probability of being unique within your network.)