LSM helps you protect the availability and reliability of data but does not prevent I/O failure. LSM is simply another layer added to the I/O subsystem. LSM depends on the underlying disk device drivers and system files to decide on the availability of individual disks and to manage and report any failures.
This chapter describes how to troubleshoot common LSM problems, describes tools that you can use to learn about problems, and offers possible solutions.
The hot-spare feature provides the best protection for volumes that
use mirror plexes or a RAID5 plex.
When enabled, the hot-spare
feature allows LSM to automatically relocate data from a failed disk in a
volume that uses either a RAID5 plex or mirrored plexes.
LSM writes the data to a designated hot-spare disk, or to free disk space,
and sends you mail about the relocation.
For more information about enabling
the hot-spare feature, see
Section 3.5.1.
6.1 Troubleshooting LSM Objects
You can use LSM commands to monitor the status of LSM objects.
By doing
so, you can understand how LSM works under normal conditions and watch for
indications that an LSM object might need attention before a problem arises.
6.1.1 Monitoring LSM Events
By default, LSM uses Event Manager (EVM) software to log events.
The
events that LSM logs are defined in the EVM template called
/usr/share/evm/templates/sys/lsm.volnotify.evt.
You can select, filter, sort, format, and display LSM events using EVM commands or the graphical event viewer, which is integrated with the SysMan Menu and SysMan Station.
To display a list of logged LSM events:
# evmget -f "[name *.volnotify]" | evmshow -t "@timestamp @@"
To display LSM events in real time:
# evmwatch -f "[name *.volnotify]" | \
  evmshow -t "@timestamp @@"
For more information, see
EVM(5).
You can also display events directly with the
volnotify
command.
For more information, see
volnotify(8).
6.1.2 Monitoring Read and Write Statistics
You can use the
volstat
command to view:
The number of successful or failed read and write operations
The number of blocks read and written
The average time spent on read and write operations. This time reflects the total time it took to complete a read or write operation, including the time spent waiting in a queue on a busy device.
Note
In TruCluster environments, the volstat command reports statistics only for the system where the command is entered. It does not provide aggregate statistics for the entire TruCluster environment.
Table 6-1
describes some of the options you
can use with the
volstat
command.
Table 6-1: Common volstat Command Options
| Option | Displays |
| -v | Volume statistics |
| -p | Plex statistics |
| -s | Subdisk statistics |
| -d | LSM disk statistics |
| -i seconds | The specified statistics repeatedly, at the specified interval (in seconds). |
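For example, to display volume statistics every 5 seconds for a disk group named dg1 (the disk group name and interval here are illustrative):
# volstat -g dg1 -v -i 5
The display repeats at the specified interval until you interrupt the command.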
You can also reset statistics for a specific LSM object (such as a volume or a disk) or for all LSM objects.
For information on all the
volstat
options, see
volstat(8).
6.1.2.1 Displaying Read and Write Statistics
To display read and write statistics for LSM objects:
# volstat [-g disk_group] -vpsd [-i number_of_seconds]
OPERATIONS BLOCKS AVG TIME(ms)
TYP NAME READ WRITE READ WRITE READ WRITE
dm dsk6 3 82 40 62561 8.9 51.2
dm dsk7 0 725 0 176464 0.0 16.3
dm dsk9 688 37 175872 592 3.9 9.2
dm dsk10 29962 0 7670016 0 4.0 0.0
dm dsk12 0 29962 0 7670016 0.0 17.8
vol v1 3 72 40 62541 8.9 56.5
pl v1-01 3 72 40 62541 8.9 56.5
sd dsk6-01 3 72 40 62541 8.9 56.5
vol v2 0 37 0 592 0.0 10.5
pl v2-01 0 37 0 592 0.0 8.0
sd dsk7-01 0 37 0 592 0.0 8.0
sd dsk12-01 0 0 0 0 0.0 0.0
pl v2-02 0 37 0 592 0.0 9.2
sd dsk9-01 0 37 0 592 0.0 9.2
sd dsk10-01 0 0 0 0 0.0 0.0
pl v2-03 0 6 0 12 0.0 13.3
sd dsk6-02 0 6 0 12 0.0 13.3
6.1.2.2 Displaying Failed Read and Write Statistics
To display failed I/O statistics:
# volstat [-g disk_group] -f cf LSM_object
For example:
# volstat -f cf testvol
CORRECTED FAILED
TYP NAME READS WRITES READS WRITES
vol testvol 1 0 0 0
LSM corrects read failures for volumes that use mirror plexes or a RAID5
plex, because these plexes provide data redundancy.
6.1.3 Monitoring LSM Object States
The kernel and LSM monitor the state of LSM objects.
To display the state of LSM volumes, plexes, and subdisks:
# volprint -vps
To display the state of LSM volumes:
# volprint -vt
Disk group: rootdg

V  NAME          USETYPE   KSTATE   STATE    LENGTH    READPOL   PREFPLEX
v  ka1           fsgen     ENABLED  ACTIVE   2097152   SELECT    -
v  ka2           fsgen     ENABLED  ACTIVE   2097152   SELECT    -
v  ka3           fsgen     ENABLED  ACTIVE   2097152   SELECT    -
v  ka4           fsgen     ENABLED  ACTIVE   2097152   SELECT    -
v  rootvol       root      ENABLED  ACTIVE   524288    ROUND     -
v  swapvol       swap      ENABLED  ACTIVE   520192    ROUND     -
v  vol-dsk25g    fsgen     ENABLED  ACTIVE   2296428   SELECT    -
v  vol-dsk25h    fsgen     ENABLED  ACTIVE   765476    SELECT    -

Disk group: dg1

V  NAME          USETYPE   KSTATE   STATE    LENGTH    READPOL   PREFPLEX
v  volS          fsgen     ENABLED  ACTIVE   204800    SELECT    -
The KSTATE column shows the kernel state of the LSM object.
The STATE
column shows the LSM state of the LSM object.
6.1.3.1 Overview of LSM Kernel States
The LSM kernel state indicates the accessibility of the LSM object as
viewed by the kernel.
Table 6-2
describes kernel states
for LSM objects.
Table 6-2: LSM Volume Kernel States (KSTATE)
| Kernel State | Means |
| ENABLED | The LSM object is accessible and read and write operations can be performed. |
| DISABLED | The LSM object is not accessible. |
| DETACHED | Read and write operations cannot be performed, but device operations are accepted. |
6.1.3.2 Overview of LSM Object States
LSM monitors the states of volumes, plexes, and subdisks.
Table 6-3 describes the LSM volume states. The meaning of some volume states differs depending on the kernel state (KSTATE).
Table 6-4 describes the plex states.
Table 6-5 describes the subdisk states.
Table 6-3: LSM Volume States (STATE)
| State | Kernel State | Means |
| EMPTY | DISABLED | The volume contents are not initialized. |
| CLEAN | DISABLED | The volume is not started. |
| ACTIVE | ENABLED | The volume was started or was in use when the system was restarted. |
| ACTIVE | DISABLED | RAID5 parity synchronization is not guaranteed or mirror plexes are not guaranteed to be consistent. |
| SYNC | ENABLED | The system is resynchronizing mirror plexes or RAID5 parity. |
| SYNC | DISABLED | The system was in the process of resynchronizing mirror plexes or RAID5 parity when the system restarted, and therefore the volume still needs to be synchronized. |
| NEEDSYNC | | The volume requires a resynchronization operation the next time it starts. |
| REPLAY | | A RAID5 volume is in a transient state as part of a log replay. A log replay occurs when it is necessary to reconstruct data using parity and data. |
Some plex states are transient; that is, the plex is in a particular
state temporarily, usually while being attached and synchronized with a volume.
Table 6-4: LSM Plex States
| State | Means |
| EMPTY | The plex is not initialized. This state is also set when the volume state is EMPTY. |
| CLEAN | The plex was running normally when the volume was stopped. The plex was enabled without requiring recovery when the volume was started. |
| ACTIVE | The plex is running normally on a started volume. |
| LOG | The plex is a DRL or RAID5 log plex for the volume. |
| STALE | The plex was detached, either by the volplex det command or by an I/O failure. STALE plexes are reattached automatically by volplex att when a volume starts. |
| OFFLINE | The plex was disabled explicitly by the volmend off operation. |
| IOFAIL | The vold daemon places an ACTIVE plex in the IOFAIL state when it detects an error. The plex is disqualified from the recovery selection process at volume start time, ensuring that LSM uses only valid plexes for recovery. A plex marked IOFAIL is recovered if possible during a resynchronization. |
| SNAPATT | This is a snapshot plex being attached by the volassist snapstart command. When the attach is complete, the state for the plex changes to SNAPDONE. If the system fails before the attach completes, the plex and all of its subdisks are removed. |
| SNAPDONE | This is a fully attached snapshot plex created by the volassist snapstart command. You can turn a plex in this state into a snapshot volume with the volassist snapshot command. If the system fails before the attach completes, the plex and all of its subdisks are removed. |
| SNAPTMP | This is a snapshot plex being attached by the volplex snapstart command. When the attach is complete, the state for the plex changes to SNAPDIS. If the system fails before the attach completes, the plex is dissociated from the volume. |
| SNAPDIS | This is a fully attached snapshot plex created by the volplex snapstart command. You can turn a plex in this state into a snapshot volume with the volplex snapshot command. If the system fails before the attach completes, the plex is dissociated from the volume. |
| TEMP | The plex is being associated and attached to a volume with the volplex att command. If the system fails before the attach completes, the plex is dissociated from the volume. |
| TEMPRM | The plex is being associated and attached to a volume with the volplex att command. If the system fails before the attach completes, the plex is dissociated from the volume and removed. Any subdisks in the plex are kept. |
| TEMPRMSD | The plex is being associated and attached to a volume with the volplex att command. If the system fails before the attach completes, the plex and its subdisks are dissociated from the volume and removed. |
Table 6-5: LSM Subdisk States
| State | Means |
| REMOVED | The subdisk (which might encompass the entire LSM disk) was removed from the volume, disk group, or from LSM control. |
| RECOVER | The subdisk is stale and must be recovered. Use the volrecover command. |
| RELOCATE | The subdisk has failed in a redundant (mirrored or RAID5) volume. The volwatch daemon checks for this state to identify which subdisks need to be relocated to available hot-spare disks. After the data is relocated, the subdisk state is cleared. |
6.2 Troubleshooting a Missing or Altered sysconfigtab File
During the boot disk encapsulation procedure, LSM adds the following
entries to the
/etc/sysconfigtab
file to enable the system
to boot from the LSM root volume:
lsm:
        lsm_rootdev_is_volume=1
If this file is deleted or the LSM-specific entries are deleted, the system will not boot. If this happens, do the following:
Boot the system interactively:
>>> boot -fl i
.
.
.
Enter kernel_name option_1... option_n: vmunix
Replace the LSM entries in the
/etc/sysconfigtab
file with the following:
lsm:
        lsm_rootdev_is_volume=1
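As a quick check that the entry is present after editing the file (a minimal sketch; the grep pattern shown is illustrative):
# grep lsm_rootdev_is_volume /etc/sysconfigtab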
6.3 Troubleshooting LSM Startup and Command Problems
LSM requires that the
vold
and
voliod
daemons be running.
These daemons are normally started automatically when
the system boots.
If these daemons are not running, the most obvious problems
you might notice are that LSM commands fail to complete or do not respond
as expected, which is an indication that LSM did not correctly start up.
The following sections describe how to determine whether the daemons
are running and how to correct problems.
6.3.1 Checking the Volume Configuration Daemon (vold)
To determine the state of the volume configuration daemon (vold):
# voldctl mode
Table 6-6
lists the possible output of the
voldctl mode
command, what the output means, and the commands to
enter if the
vold
daemon is disabled or not running.
Table 6-6: vold Status and Solutions
| Command Output | Status | Enter |
| Mode: enabled | Running and enabled | |
| Mode: disabled | Running but disabled | voldctl enable |
| Mode: not-running | Not running | vold |
For more information, see
vold(8).
6.3.2 Restarting the Volume Configuration Daemon (vold)
If the volume configuration daemon (vold) stops running
on a system or cluster member, you might see the message "Configuration
daemon is not accessible" if you try to run an LSM command.
You can
attempt to restart the daemon.
The
vold
daemon might stop
running for varied and unpredictable reasons, such as errors that cause it
to dump core.
The
vold
daemon manages changes you make to the LSM
configuration, such as adding LSM disks, creating disk groups and volumes,
adding or removing logs, and so on, but does not have any effect on the accessibility
of LSM volumes.
File systems and other applications that use LSM volumes for
their underlying storage should not be affected by the temporary failure of
the
vold
daemon.
However, until the
vold
daemon is restarted, you cannot make configuration changes or use informational
commands, such as
volprint
and
volstat,
which make use of the daemon to get and display the current configuration
data.
Each member of a cluster runs its own
vold
daemon.
Check the daemon status on all other running members (Section 6.3.1)
and perform the following procedure on each member as necessary.
To restart the
vold
daemon on a standalone system
or a cluster:
Reset the
vold
daemon and start it in disabled
mode:
# vold -k -r reset -m disable
This command stops any
vold
process that is currently
running (or hung) and starts a new
vold
in disabled mode.
If any volumes are in use, the -r reset option fails. In this case, identify the open volumes and stop them temporarily, then retry the command.
Restart the daemon:
# vold -k
Restart any volumes you had to stop.
If this procedure does not restart
vold
or if you
subsequently try to run commands such as
volprint
and get
different error messages, please contact your customer service representative.
6.3.3 Checking the Volume Extended I/O Daemon (voliod)
The correct number of
voliod
daemons automatically
start when LSM starts.
Typically several
voliod
daemons
are running at all times.
The default is at least one
voliod
daemon for each processor on the system or a minimum of two.
To display the number of the
voliod
daemons running:
# voliod
2 volume I/O daemons running
This is the only method for displaying
voliod
daemons,
because the
voliod
processes are kernel threads and are
not listed in the output of the
ps
command.
If no
voliod
daemons are running or if you want to
change the number of daemons, enter the following command where
n
is the number of I/O daemons to start:
# voliod set n
Set the number of LSM I/O daemons to two or the number of central processing units (CPUs) on the system, whichever is greater.
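For example, on a system with four CPUs (the CPU count here is an assumption for illustration), set four I/O daemons and then confirm the change:
# voliod set 4
# voliod
4 volume I/O daemons running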
For more information, see
voliod(8).
6.4 Troubleshooting LSM Disks
The following sections describe troubleshooting procedures for failing
and failed disks, including the boot disk.
6.4.1 Checking Disk Status
Disks can experience transient errors for a variety of reasons, such
as when a power supply suffers a surge or a cable is accidentally unplugged.
You can verify the status of disks through the output of the
volprint
and
voldisk
commands.
To see the status of an LSM disk:
# voldisk list
To verify the usability of an LSM disk:
# voldisk check disk
For example:
# voldisk check dsk5
dsk5: Okay
The
voldisk
command validates the usability of the
specified disks by testing whether LSM can read and write the
disk header
information.
A disk is considered usable if LSM can write
and read back at least one of the disk headers that are stored on the disk.
If a disk in a disk group is found to be unusable, it is detached from its
disk group and all subdisks stored on the disk become invalid until you replace
the physical disk or reassign the disk media names to a different physical
disk.
Caution
Because an LSM nopriv disk does not contain a disk header, a failed nopriv disk might incorrectly be reported as usable.
6.4.2 Recovering Stale Subdisks
Stale subdisks have a state of
RECOVER
(Table 6-5).
LSM usually recovers stale subdisks when the volume starts.
However, it is
possible that:
The recovery process might get killed.
The volume might be started with an option to defer subdisk recovery.
The disk supporting the subdisk might have been replaced without any recovery operations being performed.
To recover a stale subdisk in a volume:
# volrecover [-sb] volume
To recover all stale subdisks on an LSM disk:
# volrecover [-sb] disk
6.4.3 Recovering From Temporary Disk Failures
If a disk had a temporary failure but is not damaged (for example, the disk was removed by accident, a power cable was disconnected, or some other recoverable problem occurred that did not involve restarting the system), you can recover the volumes on that disk.
To recover from a temporary disk failure:
Make sure the disk is back on line and accessible; for example:
Confirm that the disk is firmly snapped into the bay.
Reconnect any loose cables.
Perform any other checks appropriate to your system.
Scan for all known disks to ensure the disk is available:
# voldctl enable
Recover the volumes on the disk:
# volrecover -sb disk
6.4.4 Moving LSM Volumes Off a Failing Disk
Often a disk has recoverable (soft) errors before it fails completely. If a disk is experiencing an unusual number of soft errors, move the volume from the disk to a different disk in the disk group and replace the failing disk.
Note
To replace a failed boot disk, see Section 6.4.6.
To move a volume off a failing disk:
Identify the size of the volume:
# volprint [-g disk_group] -ht [volume]
Ensure there is an equal amount of free space in the disk group:
# voldg [-g disk_group] free
If there is not enough space, add a new disk. For more information, see Section 5.2.2.
Move the volume to a disk other than the failing disk, as
specified by the
!
operand.
Use the appropriate shell quoting
convention to correctly interpret the
!.
You do not need
to specify a target disk.
# volassist [-g disk_group] move volume !disk
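For example, assuming a volume named vol01 on the failing disk dsk5 in the dg1 disk group (the names here are illustrative), escape the ! so that the shell does not interpret it:
# volassist -g dg1 move vol01 \!dsk5
In shells that do not treat ! specially, the backslash is unnecessary but harmless.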
For more information on replacing a failed disk, see Section 6.4.5.
6.4.5 Replacing a Failed Disk
When an LSM disk fails completely, its state becomes detached. For best results, replace the failed disk with a disk of the same size.
If hot-sparing was enabled at the time of the disk failure and successfully relocated redundant data, you do not need to follow the procedure below. Instead, you can initialize a new disk for LSM and optionally move the data from the hot-spare disk to the new disk (Section 5.1.6), or you can configure the new disk as a hot-spare disk (Section 3.5.1).
To replace a failed boot disk, see Section 6.4.6.
Note
If the failed disk is part of a hardware stripeset or RAIDset that you set up with a particular chunk size, recreate the same attributes on the replacement disk. For more information on LSM disks comprised of hardware sets, see the Best Practice entitled Aligning LSM Disks and Volumes to Hardware RAID Devices at the following URL:
http://www.tru64unix.compaq.com/docs/best_practices/sys_bps.html
To replace a failed disk:
Identify the disk media name of the failed disk using one of the following commands:
To display all disk, disk group, and volume information and the status of any volumes that are affected by the failed disk:
# volprint -Aht
To display only disk information:
# volprint -Adt
Remove the failed disk media name from its disk group, using the -k option to retain the disk media name:
# voldg [-g disk_group] -k rmdisk disk_media_name
Remove the failed disk access name from LSM control:
# voldisk rm disk_access_name
You must completely remove the device from LSM before running any non-LSM
commands to remove and replace the failed disk, such as
hwmgr redirect.
Physically remove the failed disk and replace it with a new disk.
Scan for the new disk:
# hwmgr scan scsi
The
hwmgr
command returns the prompt before it completes
the scan.
Confirm that the system has discovered the new disk before continuing,
for example, by entering the
hwmgr show scsi
command until
you see the new device.
Label and initialize the new disk, using one of the following procedures.
If you saved a copy of the disk label information for the failed disk when you initialized the disk originally (see Section 4.1.5 for more information), and you want to apply that disk label to the new disk, do the following:
Apply the backup disk label to the new disk:
# disklabel -R disk_access_name file
Initialize the disk for LSM as the same LSM disk type (simple, sliced, or nopriv) as the failed disk, with the same public or private region offset (if applicable) as the failed disk.
Specify the disk access name in the form of either
dskn
(for the entire disk) or
dsknp
(for a partition of the disk):
For a
nopriv
disk:
# voldisk -f init disk_access_name
For a
sliced
disk:
# voldisk -f init disk_access_name \
[puboffset=16]
Make sure the
puboffset
you specify is the same as
it was on the failed disk.
For a
simple
disk on a partition that begins
at block 0 of the disk (for example, the
a
or
c
partition):
# voldisk -f init disk_access_name \
privoffset=16
Make sure the
privoffset
you specify is the same
as it was on the failed disk.
For a
simple
disk on a partition that does
not begin at block 0 of the disk (for example, the
b
or
d
partition):
# voldisk -f init disk_access_name
If you do not have a backup disk label from the failed disk or if you want to initialize the new disk with default values, do the following:
If the new disk is part of a hardware stripeset or raidset that you recreated, clear the old disk label:
# disklabel -z disk_access_name
Apply a default disk label to the new disk:
# disklabel -rwn disk_access_name
Initialize the new disk for LSM as a default sliced disk:
# voldisksetup -i disk
Optionally (but recommended), create a backup copy of the new disk's disk label information:
# disklabel disk_access_name > file
Add the new disk to the applicable disk group, assigning the old disk media name to the new disk:
# voldg [-g disk_group] -k adddisk \
disk_media_name=disk_access_name
For example, if the failed disk media name was
dsk10
and the new disk access name is
dsk82, and the disk group
is
dg03:
# voldg -g dg03 -k adddisk dsk10=dsk82
Start recovery on all applicable LSM volumes:
# volrecover [-sb]
This command initiates plex attach operations, RAID5 subdisk recovery, and resynchronization for all volumes requiring recovery, and resolves most of the problems resulting from a failed disk.
If this does not recover the volumes affected by the disk failure (for example, nonredundant volumes or volumes that had multiple disk failures), see Section 6.5.2 for information on recovering volumes, and Section 5.4.3 for information on restoring volumes from backup.
6.4.6 Replacing Failed Boot Disks
When the boot disk on a standalone system is encapsulated to an LSM volume with mirror plexes, failures occurring on the original boot disk are transparent to all users. However, during a failure, the system might:
Write a message to the console indicating there was an error reading or writing to one plex
Experience slow performance (depending on the problem encountered with the disk containing one of the plexes in the root or swap volumes)
If necessary, before you replace the original boot disk, you can restart
the system from any disk that contains a valid plex for the
rootvol
volume.
If all plexes in the
rootvol
volume
are corrupted and you cannot boot the system, you must reinstall the operating
system.
The following procedure requires an encapsulated boot disk on a standalone system and mirror plexes for the boot disk volumes. The final step in this procedure creates the new (replacement) mirror plexes on the new disk.
To replace a failed boot disk under LSM control with a new disk:
Display detail information about the root and swap volumes to ensure you use the name of the failed disk and failed plex in the following steps:
# volprint -vht
In the output, identify the name of the failed plex or plexes and the disk media name of the failed LSM disk or disks.
When you encapsulate the boot disk, LSM assigns special disk media names
to the in-use partitions on the boot disk.
In the following output from the
voldisk list
command, the original root disk is
dsk14.
The disk used to mirror the root and swap volumes is
dsk15.
DEVICE       TYPE    DISK              GROUP    STATUS
dsk14a [1]   nopriv  root01 [2]        rootdg   online
dsk14b [3]   nopriv  swap01 [4]        rootdg   online
dsk14f [5]   simple  dsk14f            rootdg   online
dsk14g [6]   nopriv  dsk14g-AdvFS [7]  rootdg   online
dsk14h [8]   nopriv  dsk14h-AdvFS [9]  rootdg   online
dsk15a       nopriv  root02            rootdg   online
dsk15b       nopriv  swap02            rootdg   online
dsk15f       simple  dsk15f            rootdg   online
dsk15g       nopriv  dsk15g-AdvFS      rootdg   online
dsk15h       nopriv  dsk15h-AdvFS      rootdg   online
The following list describes the callouts in the output:
[1] Disk access name for the root (/) partition.
[2] Disk media name for the root (/) partition.
[3] Disk access name for the primary swap partition.
[4] Disk media name for the primary swap partition.
[5] Disk access name for the LSM private region for the boot disk (same as its disk media name).
[6] Disk access name for the /usr partition.
[7] Disk media name for the /usr partition.
[8] Disk access name for the /var partition.
[9] Disk media name for the /var partition.
The same naming conventions apply to the disk used to mirror the root,
swap,
/usr, and
/var
partition volumes.
Dissociate the plexes on the failed disk from the root, swap,
and user volumes, if
/usr
or
/var
were
encapsulated on the boot disk.
For example:
# volplex -o rm dis rootvol-02 swapvol-02 vol-dsk0g-02
The
/usr, and if separate,
/var
volume names are derived from the partition letter of the boot disk (for example,
vol-dsk0g).
Remove the failed LSM disks for the boot disk:
Remove the disks from the
rootdg
disk group:
# voldg rmdisk dskna dsknb dskng...
Remove the LSM disks configured on the boot disk from LSM control:
# voldisk rm dskna dsknb dskng...
Physically remove and replace the failed disk.
You must completely remove the device from LSM before running any non-LSM
commands to remove and replace the failed disk, such as
hwmgr redirect.
Scan for the new disk:
# hwmgr scan scsi
The
hwmgr
command returns the prompt before it completes
the scan.
You need to confirm that the system has discovered the new disk
before continuing, such as by entering the
hwmgr show scsi
command until you see the new device.
Modify the device special files, reassigning the old disk name to the new disk. Make sure you list the new disk first.
# dsfmgr -e new_name old_name
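For example, if the replacement disk was discovered as dsk16 and the failed boot disk was dsk14 (both device names here are illustrative):
# dsfmgr -e dsk16 dsk14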
Label the new disk, setting all partitions to unused:
# disklabel -rw new_disk
Mirror the root volumes to the new disk:
# volrootmir -a new_disk
6.5 Troubleshooting LSM Volumes
The following sections describe how to solve common LSM volume problems.
Alert icons and the Alert Monitor window might provide information when an
LSM volume recovery is needed.
(For more information about the Alert Monitor,
see the
System Administration
manual.)
6.5.1 Recovering LSM Volumes After a System Failure
LSM can usually recover volumes automatically after a system crash. Using a DRL plex on all mirrored volumes (except those used as swap space) and using a RAID5 log plex on all RAID5 volumes speeds volume recovery.
Fast Plex Attach logs also resume operation after a system restart,
with the exception of an active FPA log on the
rootvol
or
cluster_rootvol
volumes.
In this case, LSM disables
FPA logging on both the primary and secondary volumes.
A full resynchronization
occurs if you return the migrant plex to the primary volume.
6.5.2 Recovering LSM Volumes After Disk Failure
Use the following volume recovery procedures (Table 6-7) as applicable to recover volume data after replacing a failed disk or disks (Section 6.4.5).
Note
If the disk failures resulted in the loss of all active copies of the LSM configuration database in a disk group, see Section 5.3.2 before recovering the volumes.
Table 6-7: Volume Recovery Procedures
| To Recover: | See: |
| Mirrored and RAID 5 volumes and logs after a single disk failure | Section 6.5.2.1 |
| Nonredundant volumes | Section 6.5.2.2 |
| Mirrored volumes with no valid (or known) plexes | Section 6.5.2.3 |
| Mirrored volumes with one known valid plex | Section 6.5.2.4 |
| RAID 5 plex after multiple disk failures | Section 6.5.2.5 |
| RAID 5 log plex | Section 6.5.2.6 |
6.5.2.1 Recovering Mirrored and RAID 5 LSM Volumes
You can recover an LSM volume that has become disabled. Recovering a disabled LSM volume starts the volume and, if applicable, resynchronizes mirror plexes or RAID5 parity.
If a redundant (mirrored or RAID5) volume experienced
a single disk failure in a data plex or a log plex, you can recover the volume
with the
volrecover
command, which takes one of the following
actions, appropriate to the situation:
Starts plex resynchronization in a mirrored volume
Initiates data and parity regeneration in a RAID5 volume
Reattaches a detached DRL or RAID5 log plex
If hot-sparing is enabled, you might not need to do anything if suitable disk space was available for relocation at the time of the disk failure. For a RAID5 log plex, relocation occurs only if the log plex is mirrored. If hot-sparing was disabled at the time of a failure, you might need to initiate the recovery.
To recover an LSM volume, enter the following command, specifying either a volume or a disk, if several volumes use the same disk:
# volrecover [-g disk_group] -sb volume|disk
The
-s
option immediately starts the volume but delays
recovery, and the
-b
option runs the command in the background.
(For more information on these and other options, see
volrecover(8).)
For example, to recover an LSM volume named
vol01
in the
rootdg
disk group:
# volrecover -sb vol01
To recover all LSM objects (subdisks, plexes, or volumes) that use disk
dsk5:
# volrecover -sb dsk5
Optionally, verify the volume is recovered (or that recovery is underway):
# volprint volume
6.5.2.2 Recovering a Nonredundant Volume
Nonredundant volumes are those with a single concatenated or striped plex. If a disk in the plex fails, the volume will be unstartable, and you must restore the volume's data from backup. Volumes can also become nonredundant if disk failures happen in multiple plexes or in multiple columns of a RAID5 plex.
You can display the volume's condition:
# volinfo -p
vol   tst          fsgen        Unstartable
plex  tst-01       NODEVICE
Note
In the following procedure, assume the disk is still usable or has been replaced (Section 6.4.5).
To recover the volume:
Set the plex state to stale:
# volmend fix stale plex
LSM has internal state restrictions that require a plex to change states in a specific order. A plex state must be stale before it can be set to clean.
Set the plex state to clean:
# volmend fix clean plex
Start the volume:
# volume start volume
The volume is now running and usable but contains invalid data.
Do one of the following:
If the volume was used by a file system, recreate the file system on the volume, and mount the file system. For more information on configuring a volume for a file system, see Section 4.5.
If you have a backup of the data, restore the volume using the backup. For more information on restoring a volume from backup, see Section 5.4.3.
If you have no backup and the volume was used by an application such as a database, see that application's documentation for information on restoring or recreating the data.
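For example, to bring the tst volume from the earlier volinfo output back online (a sketch that assumes the failed disk was already replaced; you must still restore or recreate its data afterward):
# volmend fix stale tst-01
# volmend fix clean tst-01
# volume start tst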
6.5.2.3 Recovering a Mirrored Volume with No Valid Plexes
If disks fail in all data plexes in a mirrored volume, the volume will be unstartable. If disks fail in several, but not all, data plexes, the volume's data might be corrupt or suspect. To recover a mirrored volume after multiple disk failures, you must restore the data from backup.
To recover a volume with no valid plexes:
Set the plex state for all the plexes in the volume to clean:
# volmend fix clean plex plex...
Start the volume:
# volume start volume
Depending on what was using the volume, do one of the following:
If the volume was used by a file system, recreate the file system on the volume, and mount the file system. For more information on configuring a volume for a file system, see Section 4.5.
If you have a backup of the data, restore the volume using the backup. For more information on restoring a volume from backup, see Section 5.4.3.
If you have no backup and the volume was used by an application such as a database, see that application's documentation for information on restoring or recreating the data.
6.5.2.4 Recovering a Mirrored Volume with One Valid Plex
If one plex in a volume contains known valid data, you can use that plex to restore the others.
To recover a volume with one valid data plex:
Set the valid data plex's state to clean:
# volmend fix clean valid_plex
Set the state of all the other data plexes to stale:
# volmend fix stale plex plex...
Start the volume and initiate the resynchronization process (optionally in the background):
# volrecover -s [-b] volume
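For example, assuming a volume named vol01 whose plex vol01-01 is known to hold valid data and whose plex vol01-02 is suspect (the names here are illustrative):
# volmend fix clean vol01-01
# volmend fix stale vol01-02
# volrecover -s -b vol01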
6.5.2.5 Recovering a RAID 5 Plex from Multiple Disk Failures
Volumes that use a RAID 5 plex remain available when one disk fails. However, if disks in two or more columns of a RAID5 data plex fail, LSM cannot use the remaining data and parity to reconstruct the missing data. You must replace the failed disks, restart the volume, and then restore the data.
Note
In the following procedure, assume the failed disks have been replaced (Section 6.4.5).
To recover the volume:
Stop the volume:
# volume stop volume
Set the volume state to empty to force parity recalculation when the volume starts:
# volmend -f fix empty volume
Start the volume, optionally running the operation in the background to return the system prompt immediately:
# volume [-o bg] start volume
The volume is usable while parity recalculation proceeds. If an I/O request falls in a region that has not been recalculated, LSM recalculates and writes the parity for the entire stripe before honoring the I/O request.
Restore the volume data from backup (Section 5.4.3).
If you have no backup and the volume was used by an application such as a database, see that application's documentation for information on restoring or recreating the data.
6.5.2.6 Recovering a RAID 5 Log Plex
A disk failure in a RAID5 log plex has no direct effect on the operation of the volume; however, the loss of all RAID5 logs on the volume makes the volume vulnerable to data loss in the event of a system failure.
The following output from the
volprint
command shows
a failure within a RAID5 log plex.
The plex state is
BADLOG, and the RAID5 log plex
vol5-02
has failed.
(In some cases, RAID5 log plexes might
have a state of
DETACHED
due to disk failures.)
Disk group: rootdg

V  NAME          USETYPE      KSTATE    STATE    LENGTH    READPOL    PREFPLEX
PL NAME          VOLUME       KSTATE    STATE    LENGTH    LAYOUT     NCOL/WID  MODE
SD NAME          PLEX         DISK      DISKOFFS LENGTH    [COL/]OFF  DEVICE    MODE

v  vol5          raid5        ENABLED   ACTIVE   409696    RAID       -
pl vol5-01       vol5         ENABLED   ACTIVE   409696    RAID       8/32      RW
sd dsk3-01       vol5-01      dsk3      0        58528     0/0        dsk3      ENA
sd dsk4-01       vol5-01      dsk4      0        58528     1/0        dsk4      ENA
sd dsk5-01       vol5-01      dsk5      0        58528     2/0        dsk5      ENA
sd dsk6-01       vol5-01      dsk6      0        58528     3/0        dsk6      ENA
sd dsk7-01       vol5-01      dsk7      0        58528     4/0        dsk7      ENA
sd dsk8-01       vol5-01      dsk8      0        58528     5/0        dsk8      ENA
sd dsk9-01       vol5-01      dsk9      0        58528     6/0        dsk9      ENA
sd dsk10-01      vol5-01      dsk10     0        58528     7/0        dsk10     ENA
pl vol5-02       vol5         DISABLED  BADLOG   2560      CONCAT     -         RW
sd dsk11-01      vol5-02      dsk11     0        2560      0          -         RMOV
If the disk has failed, replace it (Section 6.4.5).
To recover a RAID 5 log plex, reattach the log plex to the volume:
# volplex att volume log_plex
For example:
# volplex att vol5 vol5-02
6.5.3 Starting Disabled LSM Volumes
If you cannot mount a file system that uses an LSM volume, or if an application cannot open an LSM volume, the LSM volume might not be started.
To determine whether or not the LSM volume is started:
# volinfo [-g disk_group] volume
The following output shows the condition of several volumes:
vol  bigvol     fsgen   Startable
vol  vol2       fsgen   Started
vol  datavol    gen     Unstartable
LSM volumes can have the following conditions:
Started
- The volume is enabled and
running normally.
Startable
- The volume is not enabled,
and at least one plex has a state of
ACTIVE
or
CLEAN, indicating that the volume can be restarted.
Note
Normally, volumes will not be in this state unless you manually created a volume (not using the volassist command, which starts the new volume automatically) or you did something that disabled the volume, such as removing a plex. All startable volumes are started when the system restarts.
Unstartable
- The volume is not enabled
and has a problem (such as a disk failure) that you must resolve before you
can start the volume.
To replace a failed disk, see Section 6.4.5.
To start a startable volume:
# volume [-g disk_group] start volume
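For example, to start the bigvol volume shown as Startable in the previous output (assuming it resides in the default rootdg disk group, so the -g option can be omitted):
# volume start bigvol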
6.5.4 Checking the Status of Volume Resynchronization
If the system fails and restarts, LSM automatically recovers all volumes that were running normally at the time of the failure.
For volumes that use mirror plexes and have a DRL plex, this involves resynchronizing all the dirty regions.
For volumes that use a RAID 5 plex and have a RAID 5 log plex, this involves replaying the log plex to complete any outstanding writes.
Using redundant volumes with log plexes is the recommended method to speed the recovery of volumes after a system failure. Under normal circumstances, the recovery happens so quickly that there is no noticeable effect (such as performance lag) after the system is running again. However, if a volume has no log, the resynchronization can take a long time (minutes to hours or longer) depending on the size of the volume.
You can display the status of the volume resynchronization in progress
to determine how long it will take.
(You cannot check the status of plex resynchronization,
which occurs when you replace a failed disk or add a new plex to a volume;
the
volprint
command does not have access to that information.
However, in these cases, the volume is usable while the resynchronization
occurs.)
To calculate the time remaining for a volume resynchronization in progress:
Display the read/write flags for the volume to see the current recovery offset value:
# volprint -vl volume | grep flags
The following information is displayed:
flags: open rwback (offset=121488) writeback
Display the flags again after some time has passed (120 seconds is ample) to see how far the recovery has progressed:
# sleep 120 ; volprint -vl volume | grep flags
The following information is displayed:
flags: open rwback (offset=2579088) writeback
Calculate the rate of progress by dividing the difference between the offsets by the time that passed between the two displays. For example, in 120 seconds the resynchronization had completed 2457600 sectors. Each second, approximately 20480 sectors (10 MB) were resynchronized.
Divide the size of the volume in sectors by the resynchronization rate. This indicates the approximate amount of time a complete resynchronization will take. For example, at a rate of 20480 sectors per second, a 200 GB volume (419430400 sectors) will take about five and a half hours to resynchronize.
The actual time required can vary, depending on other I/O loads on the system and whether the volume or the system experiences additional problems or failures.
6.5.4.1 Changing the Rate of Future Volume Resynchronizations
Although you cannot change the rate of (or stop) a volume resynchronization after it has begun, you can change the setting for the rate of future resynchronizations, if your volumes are large enough that the resynchronization has a noticeable impact on system performance during recovery.
Caution
Use this procedure only if you are a knowledgeable system administrator and you have evaluated the effect of volume resynchronization on system performance and determined it to be unacceptable. You should be familiar with editing system files and scripts.
To change the rate of volume resynchronization for future recoveries,
use your preferred editor to modify the indicated line in the
/sbin/lsm-startup
script.
Example 6-1
shows the relevant section of the script,
which has been edited for brevity and formatting.
Example 6-1: Volume Recovery Section of /sbin/lsm-startup Script
#!/sbin/sh
.
.
.
volrecover_iosize=64k
.
.
.
if [ "X`/sbin/voldctl mode 2> /dev/null`" = "Xmode: enabled" ]; then /sbin/volrecover -b -o iosize=$volrecover_iosize -s [1] if [ $is_cluster -eq 1 -a $vold_locked -eq 1 ]
.
.
.
fi
Change the indicated line to one of the following:
To slow the rate of recovery, add
-o slow:
/sbin/volrecover -b -o iosize=$volrecover_iosize -o slow -s
The
-o slow
option inserts a delay of 250ms between
each recovery operation.
This can considerably reduce the performance impact
on the system, depending on the size of the volume and the number of plexes.
To defer resynchronization until you start it manually, add -o delayrecover:
/sbin/volrecover -b -o iosize=$volrecover_iosize -o delayrecover -s
The -o delayrecover option requires that you manually begin a resynchronization at your discretion, such as when the system is not under peak demand. Until then, the volume remains in read-writeback mode, which means that every time a region of the volume is read, the data is written to all plexes in the volume. When you eventually initiate the resynchronization, all regions marked dirty are resynchronized, perhaps unnecessarily.
This option incurs performance overhead by writing all reads back to all plexes, which might be less than the impact of permitting the resynchronization to complete during periods of high system demand.
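For example, when demand is low, you could start the deferred resynchronization for a volume with the volrecover command described in Section 6.5.2.1 (the volume name here is illustrative):
# volrecover -sb vol01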
You can change the
/sbin/lsm-startup
script back
to its original state at any time.
6.5.5 Calculating Sufficient Space to Create LSM Volumes
When you use the
volassist
command to create a volume
with a striped plex, you might receive an error message indicating there is
insufficient space for the volume even though you know there is enough space
available.
The volassist command rounds up the length you specify on the command line so that each column is a whole multiple of the data unit size of 64K bytes by default, or the stripe width you specified, and then divides the total by the number of disks available to make the columns. The free space on the smallest disk in the disk group limits the column size.
For example, you have two disks with differing free space in the
dg1
disk group:
# voldg -g dg1 free
GROUP    DISK    DEVICE    TAG     OFFSET    LENGTH    FLAGS
dg1      dsk1    dsk1      dsk1    0         2049820   -
dg1      dsk2    dsk2      dsk2    0         2047772   -
The total free space on these two disks is 4097592. Suppose you tried to create a volume with a striped plex with a length of 4095544 blocks (about 2 GB), which is less than the total space available:
# volassist -g dg1 make NewVol 4095544 layout=stripe
volassist: adjusting length 4095544 to conform to a layout of 2 stripes 128 blocks wide
volassist: adjusting length up to 4095744 blks
volassist: insufficient space for a 4095744 block long volume in stripe, contiguous layout
The command returned an error message indicating insufficient space, because volassist rounded up the length you specified to 4095744 blocks, so that each of the two columns is an even multiple of the data unit size of 64K bytes (128 blocks), and divided that number by the number of disks (2). The result was larger than the free space available on the smaller disk: 4095744 ÷ 2 = 2047872, which exceeds 2047772.
If your volume does not need to be precisely the size you specified, you can retry the command with a length that works with the data unit size and the number of disks. For example, round the free space on the smaller disk down to a multiple of the data unit size (2047772 rounds down to 2047744 blocks) and multiply that by the number of disks: 2047744 × 2 = 4095488. Use this value in the command line:
# volassist -g dg1 make NewVol 4095488 layout=stripe
If the volume you require is larger than the total free space in the disk group, or if the volume must be exactly the size you specify, add more (or larger) disks to that disk group. For more information on adding disks to a disk group, see Section 5.2.2.
To determine whether the disk group has enough space on enough disks
to create the volume you want, use the
volassist maxsize
command, specifying all the properties of the volume except the size.
For example, to determine whether you can create a volume with three
mirrored, striped plexes and a stripe width of 128K bytes in the
dg1
disk group:
# volassist -g dg1 maxsize layout=stripe nmirror=3 \
stwid=128k
Maximum volume size: 16424960 (8020Mb)
You can additionally specify other properties, such as the number of stripe columns; for example:
# volassist -g dg1 maxsize layout=stripe nmirror=3 \
stwid=128k ncolumn=3
lsm:volassist: ERROR: No volume can be created within the given constraints
If you receive a message similar to the previous output, you can try
the same command in a different disk group.
For example, to determine whether
you can create a volume with the same properties in the
rootdg
disk group:
# volassist maxsize layout=stripe nmirror=3 \
stwid=128k ncolumn=3
Maximum volume size: 35348480 (17260Mb)
6.5.6 Clearing Locks on LSM Volumes
When LSM makes changes to an object's configuration, LSM locks the object until the change is written. If a configuration change terminated abnormally, there might still be a lock on the object.
To determine whether an object is locked:
# volprint [-g disk_group] -vh
In the information displayed, the lock appears in the TUTIL0 column.
To clear the lock:
# volmend [-g disk_group] clear tutil0 object...
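For example, to clear a leftover lock on a volume named vol01 in the dg1 disk group (the names here are illustrative):
# volmend -g dg1 clear tutil0 vol01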
You might need to restart the volume (Section 5.4.4).
6.6 Troubleshooting Disk Groups
If you receive an error message or the command fails when trying to import a disk group, possible causes are:
One or more of the disks contains the host ID of another system (Section 6.6.1).
One or more of the disks might be inaccessible (Section 6.6.2).
The disk group contains multiple
nopriv
disks (Section 7.3.3).
6.6.1 Resolving Mismatched Host IDs
If a disk group was not deported cleanly from its original system, the disks in the disk group might contain a record of the original host ID. This can happen if the system crashed, you were unable to restart it, and you decided to move its storage to another system or if you disconnected the storage without first deporting the disk group. To import the disk group on the new system, clear the previous host ID.
To determine whether one or more disks contains the host ID of another system:
# voldisk list disk_access_name | grep hostid
hostid:  potamus.zk3.dec.com
If the host ID of the disk does not match that of the system where you are trying to import the disk group, clear the previous host ID:
# voldisk clearimport disk_access_name
After you resolve the mismatched host IDs for all the disks in the disk
group, you can import the disk group.
If the disk group has the same name
as an existing disk group on the new host system, rename the disk group as
you import it.
For more information, see
voldg(8).
6.6.2 Importing Disk Groups with Failed Disks (Forced Import)
To forcibly import a disk group:
# voldg -f import disk_group
After the disk group is imported, you can identify and solve the problem.
If you cannot import the disk group even with the force (-f) option, it might be because LSM cannot find a copy of the configuration database for that disk group. This is unlikely but can happen in the following situations:
All the disks with active copies for that disk group failed.
Normally, when LSM detects the failure of a disk with an active copy,
it enables a copy on another disk in the disk group.
For all disks with active
copies to fail simultaneously is rare but possible.
This is more likely to
happen if you configured a disk group to have fewer than the default number
of copies or if the disk group contains many
nopriv
disks
and few
sliced
or
simple
disks.
Only
sliced
or
simple
disks can store copies of the
configuration database; if these disks fail, LSM has nowhere to place an active
copy.
All the disks with active copies for that disk group are on the same bus (this is not the default) and the bus failed, or they are in the same RAID array (for example, the same HSG80) and that array failed or is inaccessible.
Both situations can also result in problems with volumes; for example, plex detachments, loss of DRL or RAID5 logs, or total volume loss for nonmirrored volumes.
To restore the configuration database for a disk group, see Section 5.3.2.