This chapter describes how you configure and generate system crash dumps and how you save and store crash dumps and their associated data using either the Graphical User Interface or manually. Crash dumps are a snapshot of the running kernel, taken automatically when the system shuts down unexpectedly. Crash dumps are referenced most often when you contact your technical support representatives to analyze and correct problems that result in a system crash. However, if you are an experienced system administrator or developer you may be familiar with techniques of crash dump analysis and you may want to take and analyze your own dump files.
The following topics are discussed in this chapter:
An overview of crash dumps (Section 14.1)
A discussion of two new graphical user interfaces for configuring crash dumps and creating a crash dump snapshot (Section 14.2)
A discussion on how to create a crash dump (Section 14.3)
Information on choosing the content and method of a crash dump (Section 14.4)
Instructions on how to take a crash dump manually (Section 14.5)
Information on how to store and archive crash dumps (Section 14.6)
When a system shuts down unexpectedly, it writes all
or part of the data in physical memory either a) to swap space on disk (the
virtual memory space) or b) to memory.
Such shutdown events are referred to
as system crashes or panics.
The stored data and status information is called
a crash dump.
Crash dumps differ from the core dumps produced by an application,
after which the system usually keeps running.
After a crash dump, the system is shut down to the console prompt (>>>) and may or may not need to be rebooted, depending on the setting of the auto_action option (Boot, Halt, or Restart).
During the reboot process, the system moves the crash dump into a file and copies the kernel executable image to another file. Together, these files are the crash dump files and are often required for analysis when a system crashes or during the development of custom kernels (debugging). You may need to supply a crash dump file to your technical support organization to analyze system problems.
To administer dumps, you must understand how crash dump files are created.
Also, you must reserve space on disks for the crash dump and crash dump files.
The amount of space you reserve depends on your system configuration and
the type of crash dump you want the system to perform.
14.1.1 Related Documentation and Utilities
Crash dumps make use of the virtual memory swap space provided on disk. Administering the swap space is described in Chapter 3. System event management is described in Chapter 12, which describes the binlogd and syslogd event management channels.
Additional information on crash dumps and related topics is available in manuals and reference pages.
14.1.1.1 Manuals
The following lists manuals that provide useful information for crash dumps and related topics.
The Kernel Debugging manual provides information on analyzing crash dumps. You may need to install software development subsets and appropriate licenses to use this feature.
The Installation Guide manual provides information on the initial swap space and dump settings configured during installation.
14.1.1.2 Reference Pages
The reference pages listed here provide further information regarding associated utilities.
savecore(8)
The program that copies dump data from swap partitions or from memory to a file.
expand_dump(8)
Decompresses a kernel crash dump file.
dumpsys(8)
Copies a snapshot of memory to a dump file without halting the system. This is known as a continuable dump and is useful for estimating crash dump size during dump configuration planning.
sysconfig(8), sysconfigdb(8)
Maintain the kernel subsystem configuration and are used to set kernel crash dump attributes that control crash behavior. You can use the Kernel Tuner graphical user interface (/usr/bin/X11/dxkerneltuner) to modify kernel attributes. See dxkerneltuner(8).
swapon(8)
Specifies additional files for paging and swapping. Use this command if you need to add temporary or permanent swap space to produce full dumps.
dbx(1)
The source level debugger.
14.1.1.3 SysMan Menu Applications
Applications for configuring and creating crash dumps are available from the SysMan Menu:
Configure System Dump
Use this application to configure the generic system configuration variables associated with the savecore command.
Create Dump Snapshot
Use this application to configure the dumpsys command, which dumps a snapshot of memory manually.
See Section 14.2 for more information.
14.1.2 Files Used During Crash Dumps
By default, the savecore command copies a crash dump file into the /var/adm/crash directory, although you can redirect crash dumps to any file system that you designate and also to a remote host. In common with many other system directories, the /var/adm/crash directory is a context-dependent symbolic link (CDSL), which facilitates joining systems into clusters. The CDSL for this directory is /var/cluster/members/member0/adm/crash.
Within this directory, the following files are created or used:
/var/adm/crash/bounds
A text file specifying the incremental number of the next dump (the n in vmzcore.n)
/var/adm/crash/minfree
A file that specifies the minimum number of kilobytes to be left after crash dump files are written
/var/adm/crash/vmzcore.n
The crash dump file, named vmcore.n if the file is not compressed (no z)
/var/adm/crash/vmunix.n
A copy of the kernel that was running at the time of the crash, typically of /vmunix
/etc/syslog.conf, /etc/binlog.conf, and /etc/evmdaemon.conf
The logging configuration files
There are two applications that simplify the processes of configuring crash dumps and creating crash dump files manually. These applications are available from the Support and Services branch of the SysMan Menu.
The first application is Configure System Dump. Its purpose is to configure the parameters of the system dump so that you have the appropriate information for your needs should a crash dump occur in the future.
The second application is Create Dump Snapshot. It allows you to set various options and to take a snapshot of memory, which is stored in a file for examination when you cannot halt the system to generate a crash dump.
14.2.1 Using the Configure System Dump Application
The Configure System Dump application lets you tailor the crash dump data according to your needs. This application allows you to set various options that influence the crash dump file should a crash dump occur in the future.
You can access this application from the SysMan Menu by selecting Support and Services, then selecting Configure Dump. Figure 14-1 shows the main window of this application.
Figure 14-1: Configure System Dump application
After you invoke this application from the SysMan Menu, you can provide the following information:
The first selection, Enable Dumps, requires that you choose one of the following:
Disables the mechanism to generate a crash dump.
Enables the mechanism so that one set of crash dump files (the crash dump file and a copy of the kernel) is written, should a crash dump occur.
Also enables the mechanism so that a set of crash dump files is written, should a crash dump occur. This option also provides for a subsequent set of crash dump files if an additional system fault occurs while the crash dump files are written.
In the second selection, you can choose a Full or Partial dump.
A full dump saves the crash dump header information and all physical memory.
A partial dump saves the crash dump header and a copy of the part of physical memory believed to contain significant information at the time of the system crash.
You may choose to compress the crash dump file with the Enable Compression check box. You should always enable compression unless some reason dictates otherwise.
The next selection, Dump Location, specifies how the crash dump data is stored:
Saves the crash dump file to disk. If this fails, a partial compressed memory dump is attempted.
Saves the crash dump file to the memory space.
Saves the crash dump file to disk; no partial compressed memory dump is attempted on failure.
In the final selections, you can specify whether or not the crash dump file should be dumped to exempted memory. If so, select the Use Exempted Memory check box to enable the following two fields:
Specify the starting memory address where the dump should be saved.
Specify the size of the memory region.
Note
These fields accept decimal and hexadecimal entries. Be sure to precede all hexadecimal entries with 0x.
The Configure System Dump application offers online help, which provides
more information.
14.2.2 Using the Create Dump Snapshot Application
The Create Dump Snapshot application allows you to save a snapshot of system memory to a dump file.
You can access this application from the SysMan Menu by selecting Support and Services, then selecting Create Dump Snapshot. Figure 14-2 shows the main window of this application.
Figure 14-2: Create Dump Snapshot application
After you invoke this application from the SysMan Menu, you can provide the following information:
Designate a full or partial dump.
Specify whether or not you want the data compressed. If so, use the Compression Ratio % slide bar to specify the compression ratio; a lower value increases the compression, if possible.
Indicate whether the utility should suppress contiguous zeroes with the Disable Zero Suppression check box. This suppression is not recommended.
Select the Ignore insufficient space warning check box unless you want the application to warn you if there is not enough space to save the crash dump data.
Enter the full pathname of the directory where you would like the crash dump file to be written in the Dump Directory field. The number of megabytes available in that directory is displayed in the Megabytes Available in field. Select Update MB to update that display field.
The Create Dump Snapshot application offers online help, which provides more information.
14.3 Crash Dump Creation
After a system crash, you normally reboot your system by issuing the boot command at the console prompt. During a system reboot, the savecore command moves crash dump information from the swap partitions or memory into a file and copies the kernel that was running at the time of the crash into another file. You can analyze these files to help you determine the cause of a crash. The savecore command also logs the crash in system log files.
You can invoke the savecore command from the command line. See savecore(8) for more information.
14.3.1 Setting Dump Kernel Attributes in the Generic Subsystem
You can control the way that a crash dump is taken by setting kernel attributes defined in the generic subsystem, as follows:
dump_savecnt
Limits the number of successful crash dumps that are generated for a single crash and reboot sequence or disables dumping. See Section 14.3.2.
dump_to_memory
Specifies whether primary system core dumps are written to memory or to disk. See Section 14.3.2.
dump_sp_threshold
Controls the partitions to which the crash dump is written. The default value causes the primary swap partition to be used exclusively for crash dumps that are small enough to fit the partition. See Section 14.3.4.
dump_user_pte_pages
Specifies whether or not you want to include user page tables in partial crash dumps. This attribute is off by default. See Section 14.4.2.
expected_dump_compression
Specifies the level of compression that you typically expect the system to achieve. The setting is 500 by default, but can be an integer from 0 to 1000. See Section 14.4.4.
partial_dump
Specifies whether a partial crash dump or a full crash dump is preserved. This attribute is on by default. See Section 14.4.3.
compressed_dump
Specifies whether a dump is compressed to save space. This attribute is on by default. Even if set to off, the value of other dump attributes may cause it to be automatically set to on. See Section 14.4.5 and also Section 14.4.6.
dump_kernel_text
Enables or disables the inclusion of kernel text pages in the dump. Including them creates a larger dump file. This attribute only applies when partial dumps are enabled. See Section 14.4.3.
live_dump_dir_name
Specifies the full path to the directory where continuable dumps are written. See Section 14.5.1.
live_dump_zero_suppress
Enables or disables zero compression of continuable dumps. Dump files take slightly longer to create but occupy less space. See Section 14.5.1.
If available, dumping to exempt memory is controlled by the following attributes:
dump_exmem_addr
Identifies the starting address (virtual or physical) for a region of exempt memory used for writing primary dumps.
dump_exmem_size
Specifies the size (in bytes) of the exempt memory region to which dumps are written.
dump_exmem_include
Specifies whether or not exempt memory pages are included in the dump.
See Section 14.4.6 for a description of this feature.
The following command displays typical dump attribute settings:
# sysconfig -q generic | grep dump
compressed_dump = 1
dump_exmem_addr = 0
dump_exmem_size = 0
dump_exmem_include = 0
dump_kernel_text = 0
dump_savecnt = 1
dump_sp_threshold = 4096
dump_to_memory = 0
dump_user_pte_pages = 0
expected_dump_compression = 500
live_dump_zero_suppress = 1
live_dump_dir_name = /var/adm/crash
partial_dump = 1
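As a minimal sketch of how a script might pull a single value out of a listing like the one above, the following POSIX shell fragment defines a hypothetical helper, get_attr, and runs it against a few sample lines instead of a live system. The helper name and the sample variable are illustrative assumptions and are not part of the operating system.

```shell
#!/bin/sh
# get_attr NAME: read "name = value" lines on stdin and print the value
# that belongs to NAME. Hypothetical helper, for illustration only.
get_attr() {
    awk -v name="$1" '$1 == name && $2 == "=" { print $3 }'
}

# Sample lines copied from the listing above. On a live system you would
# pipe the output of "sysconfig -q generic" into get_attr instead.
sample='compressed_dump = 1
dump_savecnt = 1
dump_sp_threshold = 4096
partial_dump = 1'

printf '%s\n' "$sample" | get_attr dump_sp_threshold
```

Matching on the exact attribute name avoids the partial matches that a plain grep for "dump" would return.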
See sys_attrs_generic(5), sysconfig(8), and sysconfigdb(8) for more information.
14.3.2 Crash Dump File Creation
When the savecore command begins running during the reboot process, it determines whether a crash dump occurred and whether the file system contains enough space to save it. (The system saves no crash dump if you shut it down and reboot it; that is, the system saves a crash dump only when it crashes.)
The value of the dump_savecnt attribute controls the number of dumps. Possible values are:
0 (zero)
Never generate a crash dump.
1
Generate a primary crash dump (the default).
2
Generate a secondary crash dump.
The value of the dump_to_memory attribute controls the location of dumps and interacts with the value of the dump_savecnt attribute as follows:
-1
Writing dumps to memory is disabled. This value also disables writing a secondary dump when the value of the dump_savecnt attribute is 2.
0 (zero)
Dumps are written to disk except in the event of disk failure, in which case they are written to memory. This is the default behavior.
1
Dumps are written only to memory when sufficient memory is available. A special case is if secondary dumps are enabled (dump_savecnt=2). See sys_attrs_generic(5).
Under certain circumstances, dumps in memory may be overwritten. To prevent an overwrite from happening, you can write dumps to a protected region of memory called exempt memory. See Section 14.4.6 for more information.
If a crash dump exists and the file system contains enough space to save the crash dump files, the savecore command moves the crash dump and a copy of the kernel into files in the default crash directory, /var/adm/crash. (You can modify the location of the crash directory.)
You can choose to:
Write all crash files to a remote host using a network connection as described in Section 14.4.7.
Write continuable dump files to an alternate directory as described in Section 14.5.1.
The savecore command stores the kernel image in the vmunix.n file, and by default it stores the (compressed) contents of physical memory in the vmzcore.n file. The n variable specifies the number of the crash, which is recorded in the bounds file in the crash directory. After the first crash, the savecore command creates the bounds file and stores the number 1 in it. The command increments that value for each succeeding crash.
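The numbering scheme can be sketched with a short shell function. This is an illustration under stated assumptions and not the savecore implementation: next_dump_files is a hypothetical helper, and it assumes that a missing bounds file means the next dump is number 0, which is consistent with the bounds file holding 1 after the first crash.

```shell
#!/bin/sh
# next_dump_files DIR: print the file names expected for the next crash,
# based on the number stored in DIR/bounds. Hypothetical helper.
next_dump_files() {
    dir=$1
    n=0
    # Assumption: no bounds file means no crash has been saved yet.
    if [ -r "$dir/bounds" ]; then
        read -r n < "$dir/bounds"
    fi
    echo "$dir/vmunix.$n"
    echo "$dir/vmzcore.$n"
}

# Demonstration against a scratch directory rather than /var/adm/crash.
demo=$(mktemp -d)
echo 1 > "$demo/bounds"
next_dump_files "$demo"
rm -rf "$demo"
```

Running the function against a scratch directory keeps the example from touching real crash data.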
The savecore command runs early in the reboot process so that little or no system swapping occurs before the command runs. This practice helps ensure that crash dumps are not corrupted by swapping.
14.3.3 Crash Dump Logging
After the savecore command writes the crash dump files, it performs the following steps to log the crash in system log files:
Writes a reboot message to the /var/adm/syslog/auth.log file.
If the system crashed because of a panic condition, the panic string is included in the log entry.
You can cause the savecore command to write the reboot message to another file by modifying the auth facility entry in the syslog.conf file. If you remove the auth entry from the syslog.conf file, the savecore command does not save the reboot message.
Attempts to save the kernel message buffer from the crash dump.
The kernel message buffer contains messages created by the kernel that crashed. These messages may help you determine the cause of the crash.
The savecore command saves the kernel message buffer in the /var/adm/crash/msgbuf.savecore file, by default. You can change the location to which savecore writes the kernel message buffer by modifying the msgbuf.err entry in the /etc/syslog.conf file. If you remove the msgbuf.err entry from the /etc/syslog.conf file, savecore does not save the kernel message buffer.
Later in the reboot process, the syslogd daemon starts up, reads the contents of the msgbuf.err file, and moves those contents into the /var/adm/syslog/kern.log file, as specified in the /etc/syslog.conf file. The syslogd daemon then deletes the msgbuf.err file. See syslogd(8).
Attempts to save the binary event buffer from the crash dump.
The binary event buffer contains messages that can help you identify the problem that caused the crash, particularly if the crash resulted from a hardware error.
The savecore command saves the binary event buffer in the /usr/adm/crash/binlogdumpfile file by default. You can change the location to which savecore writes the binary event buffer by modifying the dumpfile entry in the /etc/binlog.conf file. If you remove the dumpfile entry from the /etc/binlog.conf file, savecore does not save the binary event buffer.
Later in the reboot process, the binlogd daemon starts up, reads the contents of the /usr/adm/crash/binlogdumpfile file, and moves those contents into the /usr/adm/binary.errlog file, as specified in the /etc/binlog.conf file. The binlogd daemon then deletes the binlogdumpfile file. See binlogd(8).
The system may crash before all kernel events are handled and posted. In such cases, the savecore program recovers such events and stores them for later processing. This recovery happens only if any such events are available and if the savecore program is able to extract and save the events successfully. By default, the events are stored in the /var/adm/crash/evm.buf file. See savecore(8) and EVM(5).
When the system creates a crash dump to disk, it writes the dump to the swap partitions. The system uses the swap partitions because the information stored in those partitions has meaning only for a running system. After the system crashes, the information is useless and can be overwritten safely.
Before the system writes a crash dump, it determines how the dump fits into the swap partitions, which are defined in the /etc/sysconfigtab file. For example, the following fragment of the /etc/sysconfigtab file shows three swap partitions available:
vm:
    swapdevice=/dev/disk/dsk0b, /dev/disk/dsk3h, /dev/disk/dsk13g
    vm-swap-eager=1
The following list describes how the system determines where to write the crash dump:
If the crash dump fits in the primary swap partition, it is dumped to the first partition listed under swapdevice in the /etc/sysconfigtab file. The system writes the dump as far toward the end of the partition as possible, leaving the beginning of the partition available for boot-time swapping.
If the crash dump is too large for the primary swap partition, but fits the secondary or tertiary swap space, the system writes the crash dump to the other swap partitions, /dev/disk/dsk3h and /dev/disk/dsk13g.
If the crash dump is too large for any of the available swap partitions, the system writes the crash dump spanning the secondary and tertiary swap partitions until those partitions are full. If it requires more space, it then writes the remaining crash dump information starting from the end of the primary swap partition (possibly filling the primary swap partition also).
If the aggregate size of all the swap partitions is too small to contain the crash dump, the system creates no crash dump.
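The decision list above can be sketched as a small shell function. This is a simplified model under stated assumptions, not the kernel's algorithm: dump_placement is a hypothetical helper, all sizes use the same unit, the secondary and tertiary partitions are treated as a single combined size, and the space taken by the dump header is ignored.

```shell
#!/bin/sh
# dump_placement DUMP PRIMARY SECONDARY
# DUMP      size of the crash dump
# PRIMARY   space usable for dump data in the primary swap partition
# SECONDARY combined space in the other swap partitions (an assumption
#           made to keep the sketch short)
dump_placement() {
    dump=$1
    primary=$2
    secondary=$3
    if [ "$dump" -le "$primary" ]; then
        echo "primary"
    elif [ "$dump" -le "$secondary" ]; then
        echo "secondary"
    elif [ "$dump" -le $((primary + secondary)) ]; then
        echo "secondary, then end of primary"
    else
        echo "no dump"
    fi
}

dump_placement 30 32 100
dump_placement 40 32 100
dump_placement 120 32 100
dump_placement 200 32 100
```

The four calls exercise the four cases in the list, in order.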
Each crash dump contains a header, which the system always writes to the end of the primary swap partition. The header contains information about the size of the dump and where the dump is stored. This information allows savecore to find and save the dump at system reboot time. In most cases, compressed dumps fit on the primary swap partition.
The next section describes dump_sp_threshold, which is relevant in understanding how a crash dump is created. The remaining kernel attributes control the content of the dump. These attributes are described in Section 14.4.
Controlling the Use of Swap Partitions
You can configure the system so that it fills the secondary swap partitions with dump information before writing any information (except the dump header) to the primary swap partition. The attribute that you use to configure where crash dumps are written first is the dump_sp_threshold attribute.
The value in the dump_sp_threshold attribute indicates the amount of space you normally want available for swapping as the system reboots. By default, this attribute is set to 16,384 512-byte blocks, meaning that the system attempts to leave 8 MB of disk space open in the primary swap partition after the dump is written.
Figure 14-3 shows the default setting of the dump_sp_threshold attribute for a 40 MB swap partition. (40 MB is not typical of a swap partition size on most systems; the example uses small numbers for the sake of simplicity.)
Figure 14-3: Default dump_sp_threshold Attribute Setting
The system can write 32 MB of dump information to the primary swap partition shown in Figure 14-3. Therefore, a 30 MB dump fits on the primary swap partition and is written to that partition. However, a 40 MB dump is too large; the system writes the crash dump header to the end of the primary swap partition and writes the rest of the crash dump to secondary swap partitions, if available.
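The arithmetic behind the 32 MB figure can be checked with a short shell sketch. The function name primary_dump_capacity_mb is a hypothetical helper, and the calculation assumes that the threshold is expressed in 512-byte blocks and that whole megabytes are precise enough for planning.

```shell
#!/bin/sh
# primary_dump_capacity_mb SIZE_MB THRESHOLD_BLOCKS
# Print the space (in MB) left for dump data in a primary swap partition
# of SIZE_MB once THRESHOLD_BLOCKS 512-byte blocks are held back for
# swapping. Integer arithmetic, so fractions of a megabyte are dropped.
primary_dump_capacity_mb() {
    threshold_mb=$(( $2 * 512 / 1024 / 1024 ))
    capacity=$(( $1 - threshold_mb ))
    # A threshold larger than the partition leaves no room for dump data.
    if [ "$capacity" -lt 0 ]; then
        capacity=0
    fi
    echo "$capacity"
}

primary_dump_capacity_mb 40 16384
```

With a 40 MB partition and the threshold of 16,384 blocks, the function prints 32, which matches the figure.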
Setting the dump_sp_threshold attribute to a high value causes the system to fill the secondary swap partitions before it writes dump information to the primary swap partition. For example, if you set the dump_sp_threshold attribute to a value that is equal to the size of the primary swap partition, the system fills the secondary swap partitions first. (Setting the dump_sp_threshold attribute is described in Section 14.4.1.) Figure 14-4 shows how a crash dump is written to secondary swap partitions on multiple devices.
Figure 14-4: Crash Dump Written to Multiple Devices
If a noncompressed crash dump fills partition e in Figure 14-4, the system writes the remaining crash dump information to the end of the primary swap partition. The system fills as much of the primary swap partition as is necessary to store the entire dump. The dump is written to the end of the primary swap partition to attempt to protect it from system swapping. However, the dump can fill the entire primary swap partition and may be corrupted by swapping that occurs as the system reboots.
Estimating Crash Dump Size Using dumpsys
To estimate the size of crash dumps, you can use the dumpsys command, which produces a run-time or continuable dump. See Section 14.5.1 for information on using the dumpsys command. You may need to temporarily create file system space to hold the experimental dumps. You can produce both full and partial dumps using this method. Crash dumps are compressed by default unless you specify the dumpsys -u command option. You use the expand_dump command to produce a noncompressed dump from the compressed output of the dumpsys command.
Because the crash dumps written to swap are about the same size as their resulting saved crash dump files, you can easily determine how large a crash dump was by examining the size of the resulting crash dump file. For example, to determine the size of the first crash dump file created by your system, enter the following command:
# ls -s /var/adm/crash/vmzcore.0
20480 vmzcore.0
This command displays the number of 512-byte blocks occupied by the crash dump file. In this case, the file occupies 20,480 blocks, so you know that a crash dump written to the swap partitions also occupies about 20,480 blocks.
In some cases, a system contains so much active memory that it cannot store a crash dump on a single disk. For example, suppose your system contains 2 GB of memory but only has several 4 GB disks, most of which are dedicated to storing data. Crash dumps for this system may be too large to fit on a single swap partition on a single device. To cause crash dumps to spread across multiple disks, create secondary (and perhaps tertiary) swap partitions on several disks. The system automatically writes dumps that are too large to fit in the specified portion of the primary swap partition to other available swap partitions.
14.3.5 Planning Crash Dump Space
Because crash dumps are written to the swap partitions on your system, you allow space for crash dumps by adjusting the size of your swap partitions, thereby creating temporary or permanent swap space. See swapon(8).
Note
Be sure to list all permanent swap partitions in the /etc/sysconfigtab file. The savecore command, which copies the crash dump from swap partitions to a file, uses the information in the /etc/sysconfigtab file to find the swap partitions. If you omit a swap partition from the /etc/sysconfigtab file, the savecore command may be unable to find the omitted partition.
Space requirements can vary from system to system. During the installation procedure, the following algorithm is used to calculate the required space in the /var file system:
3 * memsize / 24MB + 3 * 15MB
Where memsize is the amount of physical memory in megabytes and 15MB is the approximate size of a custom kernel. This algorithm allows for the preservation of three dumps. The following sections give you guidelines for estimating the amount of space required for partial and full crash dumps on your system. In addition, setting the dump_sp_threshold attribute is described.
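As a worked example of the installation formula, the following shell sketch evaluates it for a given memory size. It assumes one particular reading of the formula, namely three times one twenty-fourth of physical memory plus three kernel copies of 15 MB each, with every quantity in megabytes. The function name var_space_mb is a hypothetical helper.

```shell
#!/bin/sh
# var_space_mb MEMSIZE_MB: estimate the /var space (in MB) needed to keep
# three dumps, reading the formula as 3 * (memsize / 24) + 3 * 15.
# Integer arithmetic, so the result is rounded down.
var_space_mb() {
    echo $(( 3 * $1 / 24 + 3 * 15 ))
}

# A system with 96 MB of physical memory.
var_space_mb 96
```

For 96 MB of memory this prints 57, which is 12 MB of dump data plus 45 MB for three kernel images.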
14.3.6 Planning and Allocating File System Space for Crash Dump Files
Using the information on typical crash dump sizes for your system, you can plan and allocate the file system space that you need for the /var/adm/crash directory.
For example, suppose you save partial crash dumps. Your system has 96 MB of memory and you have reserved 85 MB of disk space for crash dumps and swapping. In this case, you should reserve 20 MB of space in the file system for storing crash dump files. You need to reserve considerably more space if you want to save files from more than one crash dump. If you want to save files from multiple crash dumps, consider archiving older crash dump files. See Section 14.6 for information about archiving crash dump files.
By default, the savecore command writes crash dump files to the /var/adm/crash directory. To reserve space for crash dump files in the default directory, you must mount the /var/adm/crash directory on a file system that has a sufficient amount of disk space. (For information about mounting file systems, see Chapter 6 and mount(8).)
If your system cannot save crash dump files because of insufficient disk space, the system returns to single-user mode. This return to single-user mode prevents system swapping from corrupting the crash dump. When in single-user mode, you can make space available in the crash directory or change the crash directory. One possibility in this situation is to issue the savecore command at the single-user mode prompt. On the command line, specify the name of a directory that contains a sufficient amount of file space to save the crash dump files. For example, the following savecore command writes crash dump files to the /usr/adm/crash2 directory:
# savecore /usr/adm/crash2
After the savecore command has saved the crash dump files, you can bring up your system in multiuser mode, or bring up the network to dump remotely to another host using the ftp command.
Specifying a directory on the savecore command line changes the crash directory only for the duration of that command. If the system crashes later and the system startup script invokes the savecore command, the command copies the crash dump to files in the default /var/adm/crash directory.
You can control the default location of the crash directory by setting the SAVECORE_DIR variable with the rcmgr command. For example, to save crash dump files in the /usr/adm/crash2 directory by default (at each system startup), issue the following command:
# /usr/sbin/rcmgr set SAVECORE_DIR /usr/adm/crash2
If you want the system to return to multiuser mode, regardless of whether it saved a crash dump, issue the following command:
# /usr/sbin/rcmgr set SAVECORE_FLAGS M
14.4 Choosing the Content and Method of Crash Dumps
Crash dumps are compressed and partial by default, but can be full, noncompressed, or both. Normally, partial crash dumps provide the information that you need to determine the cause of a crash. However, you may want the system to generate full crash dumps if you have a recurring crash problem and partial crash dumps are not helpful in finding the cause of the crash.
A partial crash dump contains the following:
The crash dump header
A copy of part of physical memory
The system writes the part of physical memory believed to contain significant information at the time of the system crash, basically kernel node code and data. By default, the system omits user page table entries.
A full crash dump contains the following:
The crash dump header
A copy of the entire contents of physical memory at the time of the crash
You can modify how crash dumps are taken:
By adjusting the crash dump threshold
By overriding the default so that the system writes user page table entries to partial crash dumps
By selecting partial or full crash dumps
By revising the expected dump compression
By selecting compressed or noncompressed crash dumps
These options are explained in the following sections.
14.4.1 Adjusting the Primary Swap Partition's Crash Dump Threshold
To configure your system so that it writes even small crash dumps to secondary swap partitions before the primary swap partition, use a large value for the dump_sp_threshold attribute. The value you assign to this attribute indicates the amount of space that you normally want available for system swapping after a system crash, as described in Section 14.3.
To adjust the dump_sp_threshold attribute, issue the sysconfig command. For example, suppose your primary swap partition is 40 MB. To raise the value so that the system writes crash dumps to secondary partitions, issue the following command:
# sysconfig -r generic dump_sp_threshold=81920
In the preceding example, the dump_sp_threshold attribute, which is in the generic subsystem, is set to 81,920 512-byte blocks (40 MB). In this example, the system attempts to leave the entire primary swap partition open for system swapping. The system automatically writes the crash dump to secondary swap partitions and the crash dump header to the end of the primary swap partition.
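To derive a threshold value for a different partition size, you can convert megabytes to 512-byte blocks before building the command. The following sketch uses a hypothetical helper, mb_to_blocks, and only prints the resulting sysconfig command so that you can review it before running it.

```shell
#!/bin/sh
# mb_to_blocks MB: convert a size in megabytes to 512-byte blocks.
mb_to_blocks() {
    echo $(( $1 * 1024 * 1024 / 512 ))
}

# Print (rather than run) the command for a 40 MB primary swap partition.
echo "sysconfig -r generic dump_sp_threshold=$(mb_to_blocks 40)"
```

For a 40 MB partition the printed value is 81920, the same number used in the preceding example.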
The sysconfig command changes the value of system attributes for the currently running kernel. To store the new value of the dump_sp_threshold attribute in the sysconfigtab database, modify that database by using the sysconfigdb command. For information about the sysconfigtab database and the sysconfigdb command, see sysconfigdb(8).
Note
After the
savecore
program has copied the crash dump to a file, all swap devices are immediately available for mounting and swapping. The sharing of swap space only occurs for a short time during boot, and usually on systems with a small amount of physical memory.
14.4.2 Including User Page Tables in Partial Crash Dumps
By default, the system omits user page tables from partial crash dumps. These tables do not normally help you determine the cause of a crash and omitting them reduces the size of crash dumps and crash dump files. However, your technical support person may instruct you to include user page tables for crash dump analysis.
To include user page tables in partial crash
dumps, set the value of the
dump_user_pte_pages
attribute
to 1.
The
dump_user_pte_pages
attribute is in the
generic
subsystem.
The following example shows the command you issue
to set this attribute:
# sysconfig -r generic dump_user_pte_pages=1
The
sysconfig
command changes the value of system
attributes for the currently running kernel.
To store the new value of the
dump_user_pte_pages
attribute in the
sysconfigtab
database, modify that database by using the
sysconfigdb
command or use the Kernel Tuner GUI (dxkerneltuner
).
To return to the system default of not writing user page tables to partial
crash dumps, set the value of the
dump_user_pte_pages
attribute
to 0 (zero).
14.4.3 Selecting Partial or Full Crash Dumps
By default, the system generates partial crash dumps.
If you want
the system to generate full crash dumps, you can modify the default behavior
by setting the kernel's
partial_dump
variable to 0 (zero)
as follows:
# sysconfig -r generic partial_dump=0
partial_dump: reconfigured
# sysconfig -q generic partial_dump
generic:
partial_dump = 0
You
can use the Kernel Tuner GUI or the
sysconfigdb
command
to modify kernel entries and preserve the modifications across reboots.
To
return to partial crash dumps, reset the
partial_dump
variable
to 1.
When partial dumps are enabled, you can enable the
dump_kernel_text
attribute to include kernel text pages.
14.4.4 Expected Dump Compression
The expected_dump_compression variable tells the system how much compression you typically expect to achieve in a dump. By default, its value is 500, the midpoint between the minimum allowed value of 0 (zero) and the maximum value of 1000.
The following steps describe how you calculate the appropriate
expected_dump_compression
variable for your system:
Create a compressed dump, using the
dumpsys
command, as described in
Section 14.5.1.
Using the
ls -s
command, record the size of this dump as value
a
.
Use the
expand_dump
command to produce
a noncompressed version of the dump.
Using the ls -s command, record the size of this dump as value b.
Divide
a
by
b
to produce
the approximate compression ratio.
Repeat the previous steps several times and choose the largest value of the compression ratio. Multiply the compression ratio by 1000 to produce an expected dump value.
Add 10 percent of the expected dump value to create a value
for the
expected_dump_compression
variable.
Set the kernel's
expected_dump_compression
variable to the required value using the
sysconfig
command
as follows:
# sysconfig -r generic expected_dump_compression=750
expected_dump_compression: reconfigured
# sysconfig -q generic expected_dump_compression
generic:
expected_dump_compression = 750
You can also use the
Kernel Tuner GUI or the
sysconfigdb
command to modify kernel
entries and preserve the modifications across reboots.
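The calculation in the preceding steps can be sketched as a small shell function. This is an illustration under stated assumptions, not a system utility: the name expected_compression is hypothetical, the two arguments are the sizes a and b reported by ls -s in the same units, the arithmetic is integer only, and clamping the result to 1000 is an assumption based on the documented maximum.

```shell
# Hypothetical helper: derive an expected_dump_compression value from
# a = size of the compressed dump and b = size of the expanded dump.
expected_compression() {
    a=$1
    b=$2
    ratio=$(( a * 1000 / b ))          # compression ratio scaled to 0..1000
    value=$(( ratio + ratio / 10 ))    # add 10 percent of the ratio
    # Assumption: clamp to the documented maximum of 1000
    if [ "$value" -gt 1000 ]; then
        value=1000
    fi
    echo "$value"
}

expected_compression 682 1000
```

With a compressed size of 682 and an expanded size of 1000 the function prints 750. Run it once for each pair of measurements and use the largest result.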
14.4.5 Selecting and Using Noncompressed Crash Dumps
By default, crash dumps are compressed to save disk space, allowing
you to dump a larger crash dump file to a smaller partition.
This can offer
significant advantages on systems with a large amount of physical memory,
particularly if you want to tune the system to discourage swapping for realtime
operations.
On reboot after a crash, the
savecore
command
runs automatically and detects that the dump is compressed, using information
in the crash dump header in the swap partition.
It then copies the crash dump
file from the swap partition to the
/var/adm/crash
directory.
The compressed crash dump files are identified by the letter
z
in the file name, to distinguish them from noncompressed crash dump files.
For example:
vmzcore.1
.
You can use this type of compressed crash dump file with some debugging
tools such as
dbx
, which is not true of the type of compression
produced by tools such as
compress
or
gzip
.
If you need to use a tool that does not support compressed crash dump files,
you can convert it to a conventional noncompressed format with the
expand_dump
utility.
The following example shows how you use the
expand_dump
utility:
# expand_dump vmzcore.2 vmcore.2
You may want to disable
compressed dumps if you always use tools or scripts that do not work with
the compressed format, and it is not convenient to use the
expand_dump
command.
To disable compressed dumps, use the following
sysconfig
command:
# sysconfig -r generic compressed_dump=0
The preceding command temporarily changes the mode of dumping to noncompressed
and the mode reverts to compressed dumps on the next reboot.
To make the change
persistent, use the
sysconfigdb
command to update the value
of the
compressed_dump
attribute in the
/etc/sysconfigtab
file or use the Kernel Tuner GUI to modify the value in the
generic
subsystem.
Note
Memory dumps are compressed. If the
compressed_dump
system attribute is not set, the system automatically enables compression before attempting to write a memory dump.
See savecore(8), expand_dump(8), and sysconfig(8).
14.4.6 Dumping to Exempt Memory
Exempt memory is a region of physical memory that is set aside for a
specific purpose.
You can create an exempt region of memory by specifying
it in the
/etc/sysconfigtab
file.
This causes the exempt
region to be created when the system boots.
For example:
cma_dd:
    CMA_Option = Size-0x3000000, Alignment - 0, \
    Addrlimit - 0x4000000, Type - 0x96, Flag-0
The
preceding
/etc/sysconfigtab
file entry reserves a region
of exempt memory that is 48MB in size.
Its Type is specified as M_EXEMPT by the value 0x96. The value of Addrlimit sets the starting position of the exempt region, which at 0x4000000 is 64MB into physical memory.
Each time the machine boots, it attempts to reserve this same area
of physical memory, making it unavailable for any other use.
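The sizes quoted for the example entry can be checked by converting the hexadecimal values to megabytes. The following is a minimal sketch that relies on the shell's arithmetic expansion accepting 0x-prefixed constants; the function name bytes_to_mb is hypothetical.

```shell
# Hypothetical helper: convert a byte count (decimal or 0x-prefixed hex)
# to whole megabytes.
bytes_to_mb() {
    echo $(( $1 / 1048576 ))
}

bytes_to_mb 0x3000000    # Size value from the example entry
bytes_to_mb 0x4000000    # Addrlimit value from the example entry
```

The two calls print 48 and 64, matching the 48MB region size and the 64MB starting position described above.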
Another way of creating exempt regions of memory is by using the
contig_malloc()
function call with the type
M_EXEMPT
in a pseudodevice driver.
See the
malloc.h
file for information
on the
M_EXEMPT
type.
See contig_malloc(9r).
You can use the
vmstat
command with the
-M
option to examine exempt memory regions.
To dump to exempt memory, the
dump_to_memory
attribute
must be enabled as described in
Section 14.3.2.
You also configure
the following attributes as required:
dump_exmem_size
Specifies the
size (in bytes) of the exempt memory region to which dumps are written.
By
default, the value is
0
(zero), which disables writing
a dump to an exempt memory region.
dump_exmem_addr
Identifies the starting address (virtual or physical) for a region of exempt memory used for writing primary dumps.
dump_exmem_include
Specifies
whether or not exempt memory pages are included in the dump.
By default,
the value is
0
(zero) and exempt memory pages are excluded.
The setting of the
dump_exmem_addr
attribute has
no effect unless you also configure the
dump_exmem_size
attribute.
Ensure that you keep a record of any run-time settings for the
attributes so that you are able to find the crash dump after recovery from
a system failure.
The following example shows how you reconfigure these attributes:
# sysconfig -q generic dump_to_memory
generic:
dump_to_memory = 0
# sysconfig -r generic dump_to_memory=1
dump_to_memory: reconfigured
# sysconfig -q generic dump_to_memory
generic:
dump_to_memory = 1
Memory dumps are compressed by default. If the compressed_dump system attribute is not already enabled, the system enables it automatically.
The
savecore
command uses the
vmzcore
character special device file to recover the compressed dumps.
See savecore(8) and vmzcore(7).
14.4.7 Dumping to a Remote Host
Use the
savecore
command with the
-r
option to write crash dump files from a client host to a remote host using
an ftp connection.
You can specify either of the following definitions for
a remote destination:
The name of the remote host and a valid account and password
The path to a configuration file containing the ftp connection and login information
For example, the following command specifies a login to the remote host in verbose mode, which enables you to debug the ftp connection.
# savecore -v -r 'soserv:jeffdump:Cr$hDeBuG'
When
it connects to the target host, the
savecore
utility directs
the remote
ftpd
server daemon to create a directory named
after the client host name.
The crash dump files (bounds
,
msgbuf.savecore
,
evm.buf
,
vmunix.N
, and
vmcore.N
or
vmzcore.N
)
are written to the directory.
You must ensure that you have adequate space
for the crash dump on the remote device.
See savecore(8) and ftpd(8).
14.5 Generating a Crash Dump Manually
The following sections describe how you can create a crash dump file manually under two conditions:
Use the
dumpsys
command to copy a snapshot of the running memory to a dump file without halting
the system.
(That is, the system continues to run.)
Use the
crash
console
command to cause a crash dump file to be created on a system that is not responding
(that is, hung).
It is assumed that you have planned adequate space for the crash dump
file and set any kernel parameters as described in the preceding sections.
14.5.1 Continuable Dumps on a Running System
When you cannot halt the system and take a normal crash dump, use the
dumpsys
command to dump a snapshot of memory.
Because the system
is running while the
dumpsys
command takes a snapshot,
memory may change as its content is copied.
Analysis of the resulting dump can reveal incomplete linked lists and partially zeroed pages; these are not problems, but reflect the transitory state of memory.
For this reason,
some system problems cannot be detected by using the
dumpsys
command and you may need to halt the system and force a crash dump as described
in
Section 14.5.2.
By default, the
dumpsys
command writes the crash dump in the
/var/adm/crash
directory.
The
/var/adm/crash/minfree
text file specifies
the minimum number of kilobytes that must be left on the file system after
the
dumpsys
command copies the dump.
By default, this file
does not exist, indicating that no minimum is set.
To specify a minimum, create
the file and store the number of kilobytes you want reserved in it.
You can
override the setting in the
minfree
file by using the
-i
option.
The
-s
option displays the approximate
number of disk blocks that full and partial dumps require.
The exact size cannot be determined ahead of time for the following reasons:
For noncompressed dumps only, the actual dump optimizes disk space by default, suppressing the writing of contiguous zeroes.
System use of kernel dynamic memory (malloc
/free
) changes on the running system.
The number of indirect disk blocks required to store the dump is unknown.
The following examples show a dump from a system with 512 MB of physical memory. The examples show a noncompressed crash dump. Dumps are usually compressed by default:
# dumpsys -s
Approximate full dump size = 1048544 disk blocks, if compressed, expect about 524272 disk blocks.
Approximate partial dump size = 94592 disk blocks, if compressed, expect about 47296 disk blocks.
# dumpsys -i /userfiles
Saving 536797184 bytes of image in /userfiles/vmzcore.0
# ls /userfiles
bounds vmzcore.0 vmunix.0
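Because the dumpsys -s estimate is given in disk blocks while the minfree file is in kilobytes, it can help to convert before comparing. The following sketch assumes that the disk blocks reported are 512 bytes each, which is consistent with the byte count in the example but is not stated explicitly. The function name dump_fits is hypothetical, and you supply the free space figure yourself.

```shell
# Hypothetical helper: decide whether a dump fits while honoring minfree.
# $1 = free space on the file system, in kilobytes
# $2 = dump size in disk blocks (assumed to be 512 bytes each)
# $3 = minfree value, in kilobytes
dump_fits() {
    dump_kb=$(( ($2 + 1) / 2 ))       # two 512-byte blocks per kilobyte, rounded up
    [ $(( $1 - dump_kb )) -ge "$3" ]
}

# Partial dump of 94592 blocks, 100000 KB free, 10000 KB to be left free
if dump_fits 100000 94592 10000; then
    echo "fits"
else
    echo "does not fit"
fi
```

Because the estimate is approximate, treat the result as a rough guide rather than a guarantee.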
Two attributes in the
generic
kernel subsystem enable
you to control continuable dumps:
live_dump_dir_name
Specifies
a path to the directory where the continuable dump files are written.
The
default value is the
/var/adm/crash
directory.
live_dump_zero_suppress
Enables or disables zero compression of continuable dumps. Using this option produces files that take longer to create but occupy less space.
See dumpsys(8) and sys_attrs_generic(5).
14.5.2 Forcing Crash Dumps on a Hung System
You can force the system to create a crash dump when the system hangs. On most hardware platforms, you force a crash dump by following these steps:
If your system has a switch for enabling and disabling the Halt button, set that switch to the Enable position.
Press the Halt button.
At the console prompt, enter the
crash
command.
Some systems have no Halt button. In this case, follow these steps to force a crash dump on a hung system:
Press Ctrl/p at the console.
At the console prompt, enter the
crash
command.
If your system hangs and you force a crash dump, the panic string recorded in the crash dump is the following:
hardware restart
This panic string is always the one recorded when
system operation is interrupted by pressing the Halt button or by typing Ctrl/p.
14.6 Storing and Archiving Crash Dump Files
If you are working entirely with compressed (vmzcore.n) crash dump files, they are already compressed enough for efficient archiving.
The following sections discuss certain special cases.
Section 14.6.1 describes how to compress files for storage or transmission if:
You are working with uncompressed (vmcore.n
) crash dump files.
You need the maximum amount of compression possible; for example, if you need to transmit a crash dump file over a slow transmission line.
Section 14.6.2
describes how to uncompress
partial crash dump files that are compressed from
vmcore.n
files.
14.6.1 Compressing a Crash Dump File
To compress a
vmcore.n
crash dump file, use a utility such as
gzip
,
compress
, or
dxarchiver
.
For example, the following
command creates a compressed file named
vmcore.3.gz
:
# gzip vmcore.3
A
vmzcore.n
crash dump file uses a special
compression method that makes it readable by debuggers and crash analysis
tools without requiring decompression.
A
vmzcore.n
file is compressed substantially compared to the
equivalent
vmcore.n
file,
but not as much as if the
vmcore.n
file is compressed using a standard UNIX compression utility, such as
gzip
.
Standard compression applied to a vmcore.n file makes the resulting file about 40 percent smaller than the equivalent vmzcore.n file.
If you need to apply the maximum compression possible to a
vmzcore.n
file, follow these steps:
Uncompress the vmzcore.n file by using the expand_dump command; see expand_dump(8). The following example creates a file named vmcore.3 from the vmzcore.3 file:
# expand_dump vmzcore.3 vmcore.3
Compress the resulting
vmcore.n
file using a standard UNIX utility.
The following example uses the
gzip
command to create a compressed file named
vmcore.3.gz
:
# gzip vmcore.3
You can uncompress a
vmzcore.n
file only with the
expand_dump
command.
(Do not use gunzip, uncompress, or any other utility.)
After
you uncompress a
vmzcore.n
file into a
vmcore.n
file
by using the
expand_dump
command, you cannot compress it
back into a
vmzcore.n
file.
14.6.2 Uncompressing a Partial Crash Dump File
This section applies only if you are uncompressing a partial crash dump
file that was previously compressed from a
vmcore.n
file.
If you compress a
vmcore.n
dump file from a partial crash dump, you must use care when you uncompress
it.
Using the
gunzip
or
uncompress
command with no options results in a
vmcore.n
file that requires space equal to the size of memory.
In other words, the
uncompressed file requires the same amount of disk space as a
vmcore.n
file from a full crash dump.
This situation occurs because the original
vmcore.n
file contains UNIX File System (UFS) file holes.
(UFS files can contain regions, called holes, which have no associated data
blocks.) When a process, such as the
gunzip
or
uncompress
command, reads from a hole in a file, the file system
returns zero-valued data.
Thus, memory omitted from the partial dump is
added back into the uncompressed
vmcore.n
file as disk blocks containing all zeros.
To ensure that the uncompressed core file remains at its partial dump
size, you must pipe the output from the
gunzip
or
uncompress
command with the
-c
option to the
dd
command with the
conv=sparse
option.
For example,
to uncompress a file named
vmcore.0.Z
, issue the following
command:
# uncompress -c vmcore.0.Z | dd of=vmcore.0 conv=sparse
262144+0 records in
262144+0 records out
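The effect of file holes on disk usage can be observed with a small experiment. The following sketch assumes a file system that supports holes and that the mktemp, dd, wc, du, and awk utilities are available; it creates a scratch file and does not touch any crash dump file. The file consists of a single hole, so its apparent size and its allocated size differ.

```shell
# Create a 1 MB file that consists entirely of a hole: dd writes no data
# (count=0) but sets the file length by seeking 1024 one-kilobyte blocks.
f=$(mktemp)
dd if=/dev/zero of="$f" bs=1024 count=0 seek=1024 2>/dev/null

# Apparent size in kilobytes, from the byte count of the file
apparent_kb=$(( $(wc -c < "$f") / 1024 ))

# Allocated size in kilobytes, as reported by du
allocated_kb=$(du -k "$f" | awk '{ print $1 }')

echo "apparent: ${apparent_kb} KB, allocated: ${allocated_kb} KB"
rm -f "$f"
```

On a file system that supports holes, the allocated figure is typically much smaller than the apparent figure. Reading such a file returns zeros for the hole, which is why a plain uncompress of a partial dump expands to the full size of memory.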