2 Gathering System and Performance Information

You must gather a wide variety of performance information to identify performance problems or areas where performance is deficient.

Some symptoms or indications of performance problems are obvious. For example, applications complete slowly or messages appear on the console, indicating that the system is out of resources. Other problems or performance deficiencies are not obvious and can be detected only by monitoring system performance.

There are various commands and utilities that you can use to gather system performance information. It is important that you gather statistics under a variety of conditions. Comparing sets of data will help you diagnose performance problems.

For example, to determine how an application affects system performance, you can gather performance statistics without the application running, start the application, and then gather the same statistics. Comparing different sets of data will enable you to identify whether the application is consuming memory, CPU, or disk I/O resources.

In addition, you must gather information at different stages during the application processing to obtain accurate performance information. For example, an application may be I/O-intensive during one stage and CPU-intensive during another.

This chapter describes how to perform the following tasks:

Using a methodology approach to solve performance problems (Section 2.1)

Obtaining information about system events (Section 2.2)

Using the primary tools for gathering information (Section 2.3)

Using secondary tools to gather information (Section 2.4)

Continuously monitoring performance (Section 2.5)

After you identify a performance problem or an area in which performance is deficient, you can identify an appropriate solution. See Part 2 for information about tuning by application, and see Part 3 for information about tuning by component to improve system performance.

2.1 Methodology Approach to Solving Performance Problems

There are five recommended steps to diagnose a performance problem. Before you begin, you must become familiar with the terminology and concepts relating to performance and availability. See Chapter 1 for more information.

In addition, you must understand how your application utilizes system resources, because not all configurations and tuning guidelines are appropriate for all types of workloads. For example, you must determine if your applications are memory-intensive or CPU-intensive, or if they perform many disk or network operations. See Section 1.8 for information about identifying a resource model for your configuration.

To diagnose performance problems, follow these steps:

Before you begin, you must understand your system hardware configuration. To identify and manage your hardware components use thehwmgr utility (see Section 1.1 or Section 2.3.1 for more information).

Run the sys_check utility, but before you do so perform an analysis of the operating system parameters and kernel attributes that tune the performance of your system. This tool can be used to diagnose performance problems. See Section 2.3.3 for more information.

Verify and know your software configuration errors. You can use sys_check to diagnose performance problems. See Section 2.2 for more information about obtaining information for system events.

Determine what type of application you are using and categorize your application as an Oracle, Network File System, or internet server application. If you are tuning your system by applications, see the following chapters:
- Tuning Oracle (Chapter 4)
- Tuning Network File Systems (Chapter 5)
- Tuning Internet Servers (Chapter 6)

Find the bottleneck or the system resource that is causing a performance degradation. Determine the performance problem by plotting the following information:
- CPU — Idle system time and user time
- Memory — Sum of active or inactive pages that are being used by processes, UBC and wired memory.
- Disk I/O — Transactions per second and blocks per second
Use the collect command to gather performance data while the system is under load or manifesting the performance problem. After you gather the performance information, use the collgui graphical interface to plot the data. For information on how to use collgui, see Section 2.3.2.2. For more information about identifying a resource model for your workload, see Section 1.8.

2.2 Obtaining Information About System Events

Set up a routine to continuously monitor system events that will alert you when serious problems occur. Periodically examining event and log files allows you to correct a problem before it affects performance or availability, and helps you diagnose performance problems.

The system event logging facility and the binary event logging facility log system events. The system event logging facility uses the syslog function to log events in ASCII format. The syslogd daemon collects the messages logged from the various kernel, command, utility, and application programs. This daemon then writes the messages to a local file or forwards the messages to a remote system, as specified in the /etc/syslog.conf event logging configuration file. Periodically monitor these ASCII log files for performance information.

The binary event logging facility detects hardware and software events in the kernel and logs detailed information in binary format records. The binary event logging facility uses the binlogd daemon to collect various event log records. The daemon then writes these records to a local file or forwards the records to a remote system, as specified in the /etc/binlog.conf default configuration file.

You can examine the binary event log files by using the following methods:

The Event Manager (EVM) uses the binary log files to communicate event information to interested parties for immediate or later action. See Section 2.2.1 for more information about EVM.

DECevent is a rules-based translation and reporting utility that provides event translation for binary error log events. EVM uses DECevent's translation facility, dia, to translate binary error log events into human-readable form. Compaq Analyze performs a similar role on some EV6 series processors.
For more information about DECevent, see Section 2.2.2 or dia(8).
For more information on Compaq Analyze, see Section 2.2.3 or ca(8).

In addition, we recommend that you configure crash dump support into the system. Significant performance problems may cause the system to crash, and crash dump analysis tools can help you diagnose performance problems.

See the System Administration manual for more information about event logging and crash dumps.

2.2.1 Using Event Manager

Event Manager (EVM) allows you to obtain event information and communicate this information to interested parties for immediate or later action. Event Manager provides the following features:

Enables kernel-level and user-level processes and components to post events.

Enables event consumers, such as programs and users, to subscribe for notification when selected events occur.

Supports existing event channels such as the binary logger daemon.

Provides a graphical user interface (GUI) that enables users to review events.

Provides an application programming interface (API) library that enables programmers to write routines that post or subscribe to events.

Supports command-line utilities for administrators to configure and manage the EVM environment and for users to post or retrieve events.

See the System Administration manual for more information about EVM.

2.2.2 Using DECevent

The DECevent utility continuously monitors system events through the binary event logging facility, decodes events, and tracks the number and the severity of events logged by system devices. DECevent analyzes system events, attempts to isolate failing device components, and provides a notification mechanism (for example, mail) that can warn of potential problems.

You must register a license to use DECevent's analysis and notification features, or these features may also be available as part of your service agreement. A license is not needed to use DECevent to translate the binary log file to ASCII format.

See the DECevent Translation and Reporting Utility manual for more information.

2.2.3 Using Compaq Analyze

Compaq Analyze is a fault analysis utility designed to provide analysis for single error/fault events, and multiple event and complex analysis. Compaq Analyze provides system analysis that uses other error/fault data sources in addition to the traditional binary error log.

Compaq Analyze provides background automatic analysis by monitoring the active error log and processing events as they occur. The events in the error log file are checked against the analysis rules. If one or more of the events in the error log file meets the conditions specified in the rules, the analysis engine collects the error data and creates a problem report containing a description of the problem and any corrective actions required. Once the problem report is created, it is distributed in accordance with your notification preferences.

Note that recent Alpha EV6 processors are supported only by Compaq Analyze and not DECevent.

You can download the latest version of Compaq Analyze and other Web Based Enterprise Service Suite (WEBES) tools and documentation from the following location:

http://www.compaq.com/support/svctools/webes

Download the kit from the Web site, saving it to /var/tmp/webes. Unpack the kit using a command similar to the following:


# tar -xvf <tar file name>

Use the following command to install the Compaq Web Based Enterprise Service Suite:

# setld -l /var/temp/webes/kit

During the installation, you can safely select the default options. However, you might not want to install all the optional WEBES tools. Only Compaq Analyze is used by EVM. See the separate Compaq Analyze documentation and ca(8) for more information.

2.2.4 Using System Accounting and Disk Quotas

Set up system accounting, which allows you to obtain information about the resources consumed by each user. Accounting can track the amount of CPU usage and connect time, the number of processes spawned, memory and disk usage, the number of I/O operations, and the number of print operations.

You should establish Advanced File System (AdvFS) and UNIX file system (UFS) disk quotas to track and control disk usage. Disk quotas allow you to limit the disk space available to users and to monitor disk space usage.

See the System Administration manual for information about system accounting and UFS disk quotas. See the AdvFS Administration manual for information about AdvFS quotas.

2.3 Primary Tools for Gathering Information

The following utilities are the primary tools for gathering performance information:

hwmgr utility (Section 2.3.1)

collect utility (Section 2.3.2)

sys_check utility (Section 2.3.3)

2.3.1 Gathering Hardware Information Using the hwmgr Utility

The principal command that you use to manage hardware is the hwmgr command-line interface (CLI). Other interfaces, such as the SysMan tasks, provide a limited subset of the features provided by hwmgr. Using the hwmgr command enables you to connect to an unfamiliar system, obtain information about its component hierarchy and allows you to set attributes for specific components.

Use the view command to view the hierarchy of hardware within a system. This command enables you to find what adapters are controlling devices, and discover where adapters are installed on buses. The following example shows the hardware component hierarchy on a small system that is not part of a cluster:

# hwmgr view hierarchy
 
HWID: Hardware component hierarchy
----------------------------------------------
 1: platform AlphaServer 800 5/500
 2:   cpu CPU0
 4:   bus pci0
 5:     scsi_adapter isp0
 6:	scsi_bus scsi0
18:	  disk bus-0-targ-0-lun-0 dsk0
19:	  disk bus-0-targ-4-lun-0 cdrom0
20:     graphics_controller trio0
 8:     bus eisa0
 9:	serial_port tty00
10:	serial_port tty01
11:	parallel_port lp0
12:	keyboard PCXAL
13:	pointer PCXAS
14:	fdi_controller fdi0
15:	  disk fdi0-unit-0 floppy0
16:     network tu0
17:     network tu1
output truncated

Some components might appear as multiple entries in the hierarchy. For example, if a disk is on a SCSI bus that is shared between two adapters, the hierarchy shows two entries for the same device. You can obtain similar views of the system hardware hierarchy by using the SysMan Station GUI. See the System Administration manual for information on running the SysMan Menu. Section 13.2.2.6 describes how to use the graphical interface. See the online help for more information on valid data entries.

To view a specific component in the hierarchy, use the grep command. The following example shows output for the CPU hardware component:

# hwmgr view hierarchy | grep "cpu"
 2:       cpu qbb-0 CPU0
 3:       cpu qbb-0 CPU1
 4:       cpu qbb-0 CPU2
 5:       cpu qbb-0 CPU3
 7:       cpu qbb-1 CPU5
 8:       cpu qbb-1 CPU6
 9:       cpu qbb-1 CPU7
10:       cpu qbb-2 CPU8
11:       cpu qbb-2 CPU9
12:       cpu qbb-2 CPU10
13:       cpu qbb-2 CPU11

The display hierarchy command displays the currently registered hardware components which have been placed in the system hierarchy. Components that have a flagged status are identified in the command output with the following codes:

(!) warning

(X) critical

(-) inactive

See hwmgr(8) for an explanation of these codes.

To view all of the SCSI devices attached to the system (disks and tapes), use the following command:

# hwmgr show scsi

To view how many RAID array controllers can be seen from the host use the following command:


# hwmgr show scsi | grep scp
 
        SCSI                DEVICE   DEVICE  DRIVER NUM  DEVICE FIRST
 HWID:  DEVICEID HOSTNAME   TYPE     SUBTYPE OWNER  PATH FILE   VALID PATH
-------------------------------------------------------------------------
266:    30   		wf99  		disk  		none  		0  	20  	scp0  [2/0/7]  
274:    38   		wf99  		disk  		none  		0  	20  	scp1  [2/1/7]  
282:    46   		wf99  		disk  		none  		0  	20  	scp2  [2/2/7]  
290:    54   		wf99  		disk  		none  		0  	20  	scp3  [2/3/7]  
298:    62   		wf99  		disk  		none  		0  	20  	scp4  [2/4/7]  
306:    70   		wf99  		disk  		none  		0  	20  	scp5  [2/5/7]  
314:    78   		wf99  		disk  		none  		0  	20  	scp6  [2/6/7]  
322:    86   		wf99  		disk  		none  		0  	20  	scp7  [2/7/7]  
330:    94   		wf99  		disk  		none  		0  	20  	scp8  [2/8/7]  
338:    102  		wf99  		disk  		none  		0  	20  	scp9  [2/9/7]  
346:    110  		wf99  		disk  		none  		0  	20  	scp10 [2/10/7] 
354:    118  		wf99  		disk  		none  		0  	20  	scp11 [2/11/7]

The scp in the previous example represents the service control port and is the address that a RAID array (HSG) presents itself for administrative and diagnostic purposes.

For more information about the hwmgr command, see the Hardware Management manual or hwmgr(8).

2.3.2 Gathering System Information by Using the collect Utility

The collect utility is a system monitoring tool that records or displays specific operating system data. It also gathers the vital system performance information for specific subsystems, such as file systems, memory, disk, process data, CPU, network, message queue, LSM, and others. The collect utility creates minimal system overhead and is highly reliable. It also provides extensive and flexible switches to control data collection and playback. You can display data at the terminal, or store it in either a compressed or uncompressed data file. Data files can be read and manipulated from the command line.

To ensure that the collect utility delivers reliable statistics, it locks itself into memory using the page-locking function plock(), and by default cannot be swapped out by the system. It also raises its priority using the priority function nice(). However, these measures should not have any impact on a system under normal load, and they should have only a minimal impact on a system under extremely high load.

You can invoke the collect utility from the collgui graphical user interface or from the command line. If you are using the graphic user interface, run cfilt on the command line to filter collect's data used by collgui and user scripts. For more information see collect(8).

The following example shows how to run a full data collection and display the output at the terminal using the standard interval of 10 seconds:


# /usr/sbin/collect

This command is similar to the output monitoring commands such as vmstat(1), iostat(1), netstat(1), and volstat(8).

Use the -s option to select subsystems for inclusion in the data collection, or use the -e (exclude) option to exclude subsystems from the data collection.

The following output specifies only data from the file system subsystem:


# /usr/sbin/collect -sf
# FileSystem Statistics 
# FS    Filesystem         Capacity    Free
0       root_domain#root     128        30 
1       usr_domain#usr       700        147 
3       usr_domain#var       700        147

The option letters map to the following subsystems:

p — Specifies the process data

m — Specifies the memory data

d — Specifies the disk data

l — Specifies the LSM volume data

n — Specifies the network data

c — Specifies the CPU data

f — Specifies the file system data

tyy — Specifies the terminal data

When you are collecting process data, use the -S (sort) and -n X (number) options to sort data by percentage of CPU usage and to save only X processes. Target-specific processes using the Plist option, where list is a list of process identifiers, comma-separated with blanks.

If there are many (greater than 100) disks connected to the system being monitored, use the -D option to monitor a particular set of disks.

Use the collect utility with the -p option to read multiple binary data files and play them back as one stream, with monotonically increasing sample numbers. You can also combine multiple binary input files into one binary output file, using the -p option with the input files and the -f option with the output file.

The collect utility will combine input files in whatever order you specify on the command line. This means that the input files must be in strict chronological order if you want to do further processing of the combined output file. You can also combine binary input files from different systems, made at different times, with differing subsets of subsystems for which data has been collected. Filtering options such as -e, -s, -P, and -D can be used with this utility.

See collect(8) for more information.

2.3.2.1 Configuring collect to Automatically Start on System Reboot

You can configure collect to automatically start when the system reboots. This is particularly useful for continuous monitoring of subsystems and processes. It is essential for diagnosing problems and performance issues. On each system, use the rcmgr command with the set operation to configure the following values in the /etc/rc.config* file. For example:

% rcmgr set COLLECT_AUTORUN 1

A value of 1 sets collect to automatically start when the system reboots. A value of 0 (the default) causes collect to not start on reboot:

% rcmgr set COLLECT_ARGS " -ol -i 10:60 \ -f /var/adm/collect.dated/collect -H d0:5,1w "

A null value causes collect to start with the following default values:

-i60, 120 -f /var/adm/collect.dated -W 1h -M 10, 15

Direct output from collect should be written to a local file system, not a NFS-mounted file system, to prevent important diagnostic data from being lost during system or network problems and to prevent any consequent system problems arising from collect output being blocked by a nonresponsive file system.

See rcmgr(8) for more information.

2.3.2.2 Plotting collect Datafiles

Use either collgui (a graphical interface for the collect command) or cflit (a filter for the collect command) to export collect datafiles to Excel.

Note

To run collgui, you need Perl and Perl/TK. They are freely downloadable from the collect FTP site: ftp://ftp.digital.com/pub/DEC/collect

To plot information using the collgui graphical interface, follow these steps:

Run collgui in debug mode:
```
>> collgui -d "collect datafile"
```

Select the desired subsystem and click on Display.

Return to the shell that collgui was started in. You will see that collgui has created the /var/tmp directory. The file name is collgui.xxxx, where xxxx are integers. The data file (collgui.xxxx) is exportable to Excel. Copy it to a Windows system.

On your Windows system, start Excel and open collgui.xxxx. You might have to change files of Type: field to "All files (*.*)".

In Excel 2000, a text import wizard will pop up.

In Excel 2000, select data type Delimited, then select Next.

In Excel 2000, select Tabs and Spaces as Delimiters, then select Next.

In the Data Preview pane, select the columns that you want to import using the Shift key select Finish.

You should now see the columns displayed in your worksheet.

To plot information using the cflit collect filter, follow these steps:

Use cfilt to generate the data file. For example, if you choose to display the system time, physical memory used for the process data, and user+system time, wait time for the Single CPU field, enter the following command:
```
cfilt -f "collect datafile" 'sin:WAIT:USER+SYS' 'pro:Systim#:RSS#'
> /var/tmp/collgui.xxxx
```
Copy the collgui data file to your Windows system.

Follow steps 4-9 in the collgui previous procedure.

For more information, see the following Web site: http://www.tru64unix.compaq.com/collect/collect_faq.html

2.3.3 Checking the Configuration by Using the sys_check Utility

The sys_check utility performs an analysis of operating system parameters and kernel attributes that tune the performance of your system. The utility checks memory and CPU resources, provides performance data and lock statistics for SMP systems and for kernel profiles, and outputs any warnings and tuning guidelines.

The sys_check utility creates an HTML file that describes the system configuration, and can be used to diagnose problems. The report generated by sys_check provides warnings if it detects problems with any current settings. Use sys_check utility in conjunction with the event management and system monitoring tools to provide a complete overview and control of system status.

Consider applying the sys_check utility's configuration and tuning guidelines before applying any advanced tuning guidelines.

Note

You may experience impaired system performance while running the sys_check utility. Invoke the utility during offpeak hours to minimize the performance impact.

You can invoke the sys_check utility from the SysMan graphical user interface or from the command line. If you specify sys_check without any command-line options, it performs a basic system analysis and creates an HTML file with configuration and tuning guidelines. Options that you can specify at the command line include:

The -all option provides information about all subsystems, including security information and setld inventory verification.

The -perf option provides only performance data and excludes configuration data. This may take 5 to 10 minutes to complete.

The -escalate option creates escalation files required for reporting problems.

See sys_check(8) for more information.

2.4 Secondary Tools for Gathering Information

The following utilities are the secondary tools used to gather performance information:

Gathering system information:

lockinfo utility (Section 2.4.1)

sched_stat utility (Section 2.4.2)

Gathering network information:

nfsstat utility (Section 2.4.3)

tcpdump utility (Section 2.4.4)

netstat command (Section 2.4.5)

ps axlmp command (Section 2.4.6)

nfsiod daemon (Section 2.4.7)

nfswatch command (Section 2.4.8)

2.4.1 Gathering Locking Statistics by Using the lockinfo Utility

The lockinfo utility collects and displays locking statistics for the kernel SMP locks. It uses the /dev/lockdev pseudodriver to collect data. Locking statistics can be gathered when the lockmode attribute for the generic subsystem is set to 2 (the default), 3, or 4.

To gather statistics with lockinfo, follow these steps:

Start up a system workload and wait for it to get to a steady state.

Start lockinfo with sleep as the specified command and some number of seconds as the specified cmd_args. This causes lockinfo to gather statistics for the length of time it takes the sleep command to execute.

Based on the first set of results, use lockinfo again to request more specific information about any lock class that shows results, such as a large percentage of misses, which is likely to cause a system performance problem.

The following example shows how to gather locking statistics for each processor over a period of 60 seconds:

# lockinfo -percpu sleep 60
hostname:       sysname.node.corp.com 
lockmode:       4  (SMP DEBUG with  kernel mode preemption enabled) 
processors:     4 
start time:     Wed Jun  9 14:45:08 1999 
end time:       Wed Jun  9 14:46:08 1999 
command:        sleep 60 
 
          tries    reads    trmax    misses   percent  sleeps    waitmax  waitsum
                                               misses            seconds  seconds 
bsBuf.bufLock  (S) 
 0      1400786        0    45745     47030       3.4       0    0.00007  0.15526 
 1      1415828        0    45367     47538       3.4       0    0.00006  0.15732 
 2      1399462        0    33076     48507       3.5       0    0.00005  0.15907 
 3      1398336        0    31753     48867       3.5       0    0.00005  0.15934 
 
        -----------------------------------------------------------------------
 
ALL     5614412        0    45745    191942       3.4       0    0.00007  0.63099 
 
lock.l_lock  (S) 
 0      1360769        0    40985     18460       1.4       0    0.00005  0.04041 
 1      1375384        0    20720     18581       1.4       0    0.00005  0.04124 
 2      1375122        0    20657     18831       1.4       0    0.00009  0.04198 
 
         -----------------------------------------------------------------------
 
ALL     5483049        0    40985     74688       1.4       0    0.00009  0.16526 
...inifaddr_lock  (C) 
 0            0        0        1         0       0.0       0    0.00000  0.00000 
 1            1        1        1         0       0.0       0    0.00000  0.00000 
 2            0        0        1         0       0.0       0    0.00000  0.00000 
 3            0        0        1         0       0.0       0    0.00000  0.00000 
 
         -----------------------------------------------------------------------
 
ALL           1        1        1         0       0.0       0    0.00000  0.00000 
 
total simple_locks = 28100338                           percent unknown = 0.0 
total rws_locks = 1466          percent reads = 100.0 
total complex_locks = 2716146   percent reads = 33.2    percent unknown = 0.0

A locking problem is simply an indication that there is high contention for a certain type of resource. If contention exists for a lock related to I/O, and a particular application is spawning many processes that compete for the same files and directories, application or database storage design adjustments might be in order.

Applications that use System V semaphores can sometimes encounter locking contention if they create a very large number of semaphores in a single semaphore set because the kernel uses locks on each set of semaphores. In this case, performance improvements might be realized by changing the application to use more semaphore sets, each with a smaller number of semaphores.

See lockinfo(8) for more information.

2.4.2 Gathering CPU Usage and Process Statistics by Using the sched_stat Utility

The sched_stat utility helps determine how well the system load is distributed among CPUs, what kinds of jobs are getting (or not getting) enough cycles on each CPU, and how well cache affinity is being maintained for these jobs. The sched_stat displays CPU usage and process-scheduling for SMP and NUMA platforms.

To gather statistics with sched_stat, follow these steps:

Start up a system workload and wait for it to get to a steady state.

Start sched_stat with sleep as the specified command and some number of seconds as the specified cmd_arg. This causes sched_stat to gather statistics for the length of time it takes the sleep command to execute.

For example, the following command causes sched_stat to collect statistics for 60 seconds and then print a report:


# /usr/sbin/sched_stat sleep 60

If you include options on the command line, only statistics for the specified options are reported. If you specify the command without any options, all options except for -R are assumed. See sched_stat(8) for more information.

2.4.3 Displaying Network and NFS Statistics by Using the nfsstat Utility

To display or reinitialize NFS and remote procedure call (RPC) statistics for clients and servers, including the number of packets that had to be retransmitted (retrans) and the number of times a reply transaction ID did not match the request transaction ID (badxid), enter:


# /usr/ucb/nfsstat

Information similar to the following is displayed:

Server rpc:
calls     badcalls  nullrecv   badlen   xdrcall
38903     0         0          0        0
 
Server nfs:
calls     badcalls
38903     0
 
Server nfs V2:
null      getattr   setattr    root     lookup     readlink   read
5  0%     3345  8%  61  0%     0  0%    5902 15%   250  0%    1497  3%
wrcache   write     create     remove   rename     link       symlink
0  0%     1400  3%  549  1%    1049  2% 352  0%    250  0%    250  0%
mkdir     rmdir     readdir    statfs
171  0%   172  0%   689  1%    1751  4%
 
Server nfs V3:
null      getattr   setattr    lookup    access    readlink   read
0  0%     1333  3%  1019  2%   5196 13%  238  0%   400  1%    2816  7%
write     create    mkdir      symlink   mknod     remove     rmdir
2560  6%  752  1%   140  0%    400  1%   0  0%     1352  3%   140  0%
rename    link      readdir    readdir+  fsstat    fsinfo     pathconf
200  0%   200  0%   936  2%    0  0%     3504  9%  3  0%      0  0%
commit
21  0%
 
Client rpc:
calls     badcalls  retrans    badxid    timeout   wait       newcred
27989     1         0          0         1         0          0
badverfs  timers
0         4
 
Client nfs:
calls     badcalls  nclget     nclsleep
27988     0         27988      0
 
Client nfs V2:
null      getattr   setattr    root      lookup    readlink   read
0  0%     3414 12%  61  0%     0  0%     5973 21%  257  0%    1503  5%
wrcache   write     create     remove    rename    link       symlink
0  0%     1400  5%  549  1%    1049  3%  352  1%   250  0%    250  0%
mkdir     rmdir     readdir    statfs
171  0%   171  0%   713  2%    1756  6%
 
Client nfs V3:
null      getattr   setattr    lookup    access    readlink   read
0  0%     666  2%   9  0%      2598  9%  137  0%   200  0%    1408  5%
write     create    mkdir      symlink   mknod     remove     rmdir
1280  4%  376  1%   70  0%     200  0%   0  0%     676  2%    70  0%
rename    link      readdir    readdir+  fsstat    fsinfo     pathconf
100  0%   100  0%   468  1%    0  0%     1750  6%  1  0%      0  0%
commit
10  0%

The ratio of timeouts to calls (which should not exceed 1 percent) is the most important thing to look for in the NFS statistics. A timeout-to-call ratio greater than 1 percent can have a significant negative impact on performance. See Chapter 10 for information on how to tune your system to avoid timeouts.

To display NFS and RPC information in intervals (seconds), enter:


# /usr/ucb/nfsstat -s -i number

The following example displays NFS and RPC information in 10-second intervals:

# /usr/ucb/nfsstat -s -i 10

If you are monitoring an experimental situation with nfsstat, reset the NFS counters to 0 before you begin the experiment. To reset counters to 0, enter:

# /usr/ucb/nfsstat -z

See nfsstat(8) for more information about command options and output.

2.4.4 Gathering Information by Using the tcpdump Utility

The tcpdump utility monitors and displays packet headers on a network interface. You can specify the interface on which to listen, the direction of the packet transfer, or the type of protocol traffic to display.

The tcpdump command allows you to monitor the network traffic associated with a particular network service and to identify the source of a packet. It lets you determine whether requests are being received or acknowledged, or in the case of slow network performance, to determine the source of network requests

Your kernel must be configured with the packetfilter option to use the command. For example:

# pfconfig +p +c tu0

The netstat -ni command displays the configured network interfaces. For example:

# netstat -ni
Name  Mtu  Network         Address          Ipkts Ierrs  Opkts Oerrs  Coll
tu0  1500  <Link>        00:00f8:22:f8:05     486404139    010939748   632583736
tu0  1500  16.140.48/24  16.140.48.156        486404139    010939748   632583736
tu0  1500  DLI           none                 486404139    010939748   632583736
sl0*  296  <Link>                               0   0       0  0  0
lo0  4096  <Link>                            1001631086 0 1001631086     0  0
lo0  4096  127/8         127.0.0.1           1001631086 0 1001631086     0  0

Use the netstat command output to determine which interface to use with the tcpdump command. For example:


# tcpdump -mvi tu0 -Nts1500
tcpdump: listening on tu0
Using kernel BPF filter
 
k-1.fc77a110 > foo.pmap-v2: 56 call getport prog "nfs"  V3  prot UDP  port 0 \
(ttl 30, id 20054)
foo.fc77a110 > k-1.pmap-v2: 28 reply getport 2049 (ttl 30, id 36169)
k-1.fd77a110 > foo.pmap-v2: 56 call getport prog "mount"  V3  prot UDP  port 0 \
(ttl 30, id 20057)
foo.fd77a110 > k-1.pmap-v2: 28 reply getport 1030 (ttl 30, id 36170)
k-1.fe77a110 > foo.mount-v3: 112 call mount "/pns2" (ttl 30, id 20062)
foo.fe77a110 > k-1.mount-v3: 68 reply mount  OSF/1 fh 19,17/1.4154.1027680688/4154.\
1027680688 (DF) (ttl 30, id 36171)
k-1.b81097eb > fubar.nfs-v3: 136 call fsinfo OSF/1 fh 19,17/1.4154.1027680688/4154.\
1027680688 (ttl 30, id 20067)

The -s snaplen option displays snaplen bytes of data from each packet rather than the default of 68. The default is adequate for IP, ICP, TCP, and UDP, but 500-1500 bytes is recommended for NFS and RPC adequate results.

See tcpdump(8) and packetfilter(7) for more information.

2.4.5 Monitoring Network Statistics by Using the netstat Command

To check network statistics, use the netstat command. Some problems to look for are:

If the netstat -i command shows excessive amounts of input errors (Ierrs), output errors (Oerrs), or collisions (Coll), this may indicate a network problem; for example, cables are not connected properly or the Ethernet is saturated (see Section 2.4.5.1).

Use the netstat -is command to check for network device driver errors (see Section 2.4.5.2).

Use the netstat -m command to determine if the network is using an excessive amount of memory in proportion to the total amount of memory installed in the system.
If the netstat -m command shows several requests for memory delayed or denied, this means that either physical memory was temporarily depleted or the kernel malloc free lists were empty (see Section 2.4.5.3).

Each socket results in a network connection. If the system allocates an excessive number of sockets, use the netstat -an command to determine the state of your existing network connections (see Section 2.4.5.4).
For Internet servers, the majority of connections usually are in a TIME_WAIT state.

Use the netstat -p ip command to check for bad checksums, length problems, excessive redirects, and packets lost because of resource problems (see Section 2.4.5.5).

Use the netstat -p tcp command to check for retransmissions, out of order packets, and bad checksums (see Section 2.4.5.6).

Use the netstat -p udp command to check for bad checksums and full sockets (see Section 2.4.5.6).

Use the netstat -rs command to obtain routing statistics (see Section 2.4.5.7).

Use the netstat -s command to obtain display statistics related to the IP, ICMP, IGMP, TCP, and UDP protocol layers (see Section 2.4.5.8).

Most of the information provided by netstat is used to diagnose network hardware or software failures, not to identify tuning opportunities. See the Network Administration: Connections manual for more information on how to diagnose failures.

See netstat(1) for more information about the output produced by the various command options.

2.4.5.1 Input and Output Errors and Collisions

Network collisions are a normal part of network operations. A collision can occur when two or more Ethernet stations attempt to transmit simultaneously on the network. If a station is unable to access the network because another one is already using it, the station will stop trying to access the network for a short period of time, before attempting to access the network again. A collision occurs each time a station fails to access the network. Most network interface cards (NICs) will attempt to transmit a maximum of 15 times, after which they will drop the output packet and issue an excessive collisions error.

Use the output of the netstat -i command to check for input errors (Ierrs), output errors (Oerrs), and collisions (Coll). Compare the values in these fields with the total number of packets sent. High values may indicate a network problem. For example, cables may not be connected properly or the Ethernet may be saturated. A collision rate of up to 10 percent may not indicate a problem on a busy Ethernet. However, a collision rate of more than 20 percent could indicate a problem. For example:

# netstat -i
Name  Mtu  Network  Address             Ipkts Ierrs    Opkts  Oerrs  Coll
tu0   1500  Link    00:00:aa:11:0a:c1       0     0    43427  43427     0 
tu0   1500  DLI     none                    0     0    43427  43427     0 
tu1   1500  Link    bb:00:03:01:6c:4d  963447   138   902543  1118  80006 
tu1   1500  DLI     none               963447   138   902543  1118  80006 
tu1   1500  o-net   plume              963447   138   902543  1118  80006  
. 
. 
.

2.4.5.2 Device Driver Errors

Use the netstat -is command to check for network device driver errors. For example:


# netstat -is 
tu0 Ethernet counters at Tue Aug  3 13:57:35 2002
          191 seconds since last zeroed
     14624204 bytes received
      4749029 bytes sent
        34784 data blocks received
        11017 data blocks sent
      2197154 multicast bytes received
        17086 multicast blocks received
         1894 multicast bytes sent
           17 multicast blocks sent
          932 blocks sent, initially deferred
          347 blocks sent, single collision
          666 blocks sent, multiple collisions
            0 send failures
            0 collision detect check failure
            1 receive failures, reasons include: Frame too long
            0 unrecognized frame destination
            0 data overruns
            0 system buffer unavailable
            0 user buffer unavailable

The previous example shows that the system sent 11,017 blocks. Of those blocks, 1,013 (347 + 666) blocks had collisions, which represents approximately 10 percent of the blocks sent. A collision rate of up to 10 percent may not indicate a problem on a busy Ethernet. However, a collision rate of more then 20 percent could indicate a problem.

In addition, the following fields should be 0 or a low single-digit number:

send failures

receive failures

data overruns

system buffer unavailable

user buffer unavailable

2.4.5.3 Memory Usage

The netstat -m command shows statistics for network-related memory structures, including the memory that is being used for mbuf clusters. Use this command to determine if the network is using an excessive amount of memory in proportion to the total amount of memory installed in the system. If the netstat -m command shows several requests for memory (mbuf) clusters delayed or denied, this means that your system was temporarily short of physical memory. The following example is from a firewall server with 128 MB memory that does not have mbuf cluster compression enabled:


# netstat -m
 2521 Kbytes for small data mbufs (peak usage 9462 Kbytes)
78262 Kbytes for mbuf clusters (peak usage 97924 Kbytes)
 8730 Kbytes for sockets (peak usage 14120 Kbytes)
 9202 Kbytes for protocol control blocks (peak usage 14551
    2 Kbytes for routing table (peak usage 2 Kbytes)
    2 Kbytes for socket names (peak usage 4 Kbytes)
    4 Kbytes for packet headers (peak usage 32 Kbytes) 
39773 requests for mbufs denied
    0 calls to protocol drain routines
98727 Kbytes allocated to network

The previous example shows that 39,773 requests for memory were denied. This indicates a problem because this value should be 0. The example also shows that 78 MB of memory has been assigned to mbuf clusters, and that 98 MB of memory is being consumed by the network subsystem.

If you increase the value of the socket subsystem attribute sbcompress_threshold to 600, the memory allocated to the network subsystem immediately decreases to 18 MB, because compression at the kernel socket buffer interface results in a more efficient use of memory. See Section 6.2.3.3 for more information on the sbcompress_threshold attribute.

2.4.5.4 Socket Connections

Each socket results in a network connection. If the system allocates an excessive number of sockets, use the netstat -an command to determine the state of your existing network connections. The following example shows the contents of the protocol control block table and the number of TCP connections currently in each state:


# netstat -an | grep tcp | awk '{print $6}' | sort | uniq -c
    1 CLOSE_WAIT
   58 ESTABLISHED
   12 FIN_WAIT_1
    8 FIN_WAIT_2
   17 LISTEN
    1 SYN_RCVD
15749 TIME_WAIT
#

For Internet servers, the majority of connections usually are in a TIME_WAIT state. If the number of entries in the FIN_WAIT_1 and FIN_WAIT_2 fields represent a large percentage of the total connections (add together all of the fields), you may want to enable keepalive (see Section 6.3.2.5).

If the number of entries in the SYN_RCVD field represents a large percentage of the total connections, the server may be overloaded or experiencing TCP SYN attacks.

Note that in this example there are almost 16,000 sockets being used, which requires 16 MB of memory. See Section 6.1.2 for more information on configuring memory and swap space.

2.4.5.5 Dropped or Lost Packets

Use the netstat -p ip command to check for bad checksums, length problems, excessive redirects, and packets lost because of resource problems for the IP protocol. Check the output for a nonzero number in the lost packets due to resource problems field. For example:


# netstat -p ip
ip:
         259201001 total packets received
         0 bad header checksums
         0 with size smaller than minimum
         0 with data size < data length
         0 with header length < data size
         0 with data length < header length
         25794050 fragments received
         0 fragments dropped (duplicate or out of space)
         802 fragments dropped after timeout
         0 packets forwarded
         67381376 packets not forwardable
                 67381376 link-level broadcasts
         0 packets denied access
         0 redirects sent
         0 packets with unknown or unsupported protocol
         170988694 packets consumed here
         160039654 total packets generated here
         0 lost packets due to resource problems
         4964271 total packets reassembled ok
         2678389 output packets fragmented ok
         14229303 output fragments created
         0 packets with special flags set

Use the netstat -id command to monitor dropped output packets. Examine the output for a nonzero value in the Drop column. If a nonzero value appears in the Drop column for an interface, you may want to increase the value of the ifqmaxlen kernel variable to prevent dropped packets. See Section 6.3.2.9 for more information on this attribute.

The following example shows 4,221 dropped output packets on the tu1 network interface:

# netstat -id
Name Mtu  Network Address             Ipkts Ierrs  Opkts Oerrs  Coll  Drop
tu0 1500  Link    00:00:f8:06:0a:b1       0     0  98129 98129     0     0
tu0 1500  DLI     none                    0     0  98129 98129     0     0
tu1 1500  Link    aa:00:04:00:6a:4e  892390   785 814280 68031 93848  4221
tu1 1500  DLI     none               892390   785 814280 68031 93848  4221
tu1 1500  orange  flume              892390   785 814280 68031 93848  4221
  .
  .
  .

The output of the previous command shows that the Opkts and Oerrs fields have the same values for the tu0 interface, which indicates that the Ethernet cable is not connected. In addition, the value of the Oerrs field for the tu1 interface is 68,031, which is a high error rate. Use the netstat -is command to obtain detailed error information.

2.4.5.6 Retransmissions, Out-of-Order Packets, and Bad Checksums

Use the netstat -p tcp command to check for retransmissions, out-of-order packets, and bad checksums for the TCP protocol. Use the netstat -p udp command to look for bad checksums and full sockets for the UDP protocol. You can use the output of these commands to identify network performance problems by comparing the values in some fields to the total number of packets sent or received.

For example, an acceptable percentage of retransmitted packets or duplicate acknowledgments is 2 percent or less. An acceptable percentage of bad checksums is 1 percent or less.

In addition, a large number of entries in the embryonic connections dropped field may indicate that the listen queue is too small or server performance is slow and clients have canceled requests. Other important fields to examine include the completely duplicate packets, out-of-order packets, and discarded fields. For example:

# netstat -p tcp
tcp:
         66776579 packets sent
                 58018945 data packets (1773864027 bytes)
                 54447 data packets (132256902 bytes) retransmitted
                 5079202 ack-only packets (3354381 delayed)
                 29 URG only packets
                 7266 window probe packets
                 2322828 window update packets
                 1294022 control packets
         40166265 packets received
                 29455895 acks (for 1767211650 bytes)
                 719524 duplicate acks
                 0 acks for unsent data
                 19788741 packets (2952573297 bytes) received in-sequence
                 123726 completely duplicate packets (9224858 bytes)
                 2181 packets with some dup. data (67344 bytes duped)
                 472000 out-of-order packets (85613803 bytes)
                 1478 packets (926739 bytes) of data after window
                 43 window probes
                 201331 window update packets
                 1373 packets received after close
                 118 discarded for bad checksums
                 0 discarded for bad header offset fields
                 0 discarded because packet too short
         448388 connection requests
         431873 connection accepts
         765040 connections established (including accepts)
         896693 connections closed (including 14570 drops)
         86298 embryonic connections dropped
         25467050 segments updated rtt (of 25608120 attempts)
         106020 retransmit timeouts
                 145 connections dropped by rexmit timeout
         6329 persist timeouts
         37653 keepalive timeouts
                 15536 keepalive probes sent
                 16874 connections dropped by keepalive

The output of the previous command shows that, out of the 58,018,945 data packets that were sent, 54,447 packets were retransmitted, which is a percentage that is within the acceptable limit of 2 percent.

In addition, the command output shows that, out of the 29,455,895 acknowledgments, 719,524 were duplicates, which is a percentage that is slightly larger than the acceptable limit of 2 percent.

Important fields for the netstat -p udp command include the incomplete headers, bad data length fields, bad checksums, and full sockets fields, which should have low values. The no port field specifies the number of packets that arrived destined for a nonexistent port (for example, rwhod or routed broadcast packets) and were subsequently discarded. A large value for this field is normal and does not indicate a performance problem. For example:


# netstat -p udp
udp:
         144965408 packets sent
         217573986 packets received
         0 incomplete headers
         0 bad data length fields
         0 bad checksums
         5359 full sockets
         28001087 for no port (27996512 broadcasts, 0 multicasts)
         0 input packets missed pcb cache

The previous example shows a value of 5,359 in the full sockets field, which indicates that the UDP socket buffer may be too small.

2.4.5.7 Routing Statistics

Use the netstat -rs command to obtain routing statistics. The value of the bad routing redirects field should be small. A large value may indicate a serious network problem. For example:

# netstat -rs
routing:
         0 bad routing redirects
         0 dynamically created routes
         0 new gateways due to redirects
         1082 destinations found unreachable
         0 uses of a wildcard route

2.4.5.8 Protocol Statistics

Use the netstat -s command to simultaneously display statistics related to the IP, ICMP, IGMP, TCP, and UDP protocol layers. For example:

# netstat -s 
ip:
         377583120 total packets received
         0 bad header checksums
         7 with size smaller than minimum
         0 with data size < data length
         0 with header length < data size
         0 with data length < header length
         12975385 fragments received
         0 fragments dropped (dup or out of space)
         3997 fragments dropped after timeout
         523667 packets forwarded
         108432573 packets not forwardable
         0 packets denied access
         0 redirects sent
         0 packets with unknown or unsupported protocol
         259208056 packets consumed here
         213176626 total packets generated here
         581 lost packets due to resource problems
         3556589 total packets reassembled ok
         4484231 output packets fragmented ok
         18923658 output fragments created
         0 packets with special flags set 
icmp:
         4575 calls to icmp_error
         0 errors not generated because old ip message was too short
         0 errors not generated because old message was icmp
         Output histogram:
                 echo reply: 586585
                 destination unreachable: 4575
                 time stamp reply: 1
         0 messages with bad code fields
         0 messages < minimum length
         0 bad checksums
         0 messages with bad length
         Input histogram:
                 echo reply: 612979
                 destination unreachable: 147286
                 source quench: 10
                 echo: 586585
                 router advertisement: 91
                 time exceeded: 231
                 time stamp: 1
                 time stamp reply: 1
                 address mask request: 7
         586586 message responses generated 
igmp:
         0 messages received
         0 messages received with too few bytes
         0 messages received with bad checksum
         0 membership queries received
         0 membership queries received with invalid field(s)
         0 membership reports received
         0 membership reports received with invalid field(s)
         0 membership reports received for groups to which we belong
         0 membership reports sent 
tcp:
         66818923 packets sent
                 58058082 data packets (1804507309 bytes)
                 54448 data packets (132259102 bytes) retransmitted
                 5081656 ack-only packets (3356297 delayed)
                 29 URG only packets
                 7271 window probe packets
                 2323163 window update packets
                 1294434 control packets
         40195436 packets received
                 29477231 acks (for 1797854515 bytes)
                 719829 duplicate acks
                 0 acks for unsent data
                 19803825 packets (2954660057 bytes) received in-sequence
                 123763 completely duplicate packets (9225546 bytes)
                 2181 packets with some dup. data (67344 bytes duped)
                 472188 out-of-order packets (85660891 bytes)
                 1479 packets (926739 bytes) of data after window
                 43 window probes
                 201512 window update packets
                 1377 packets received after close
                 118 discarded for bad checksums
                 0 discarded for bad header offset fields
                 0 discarded because packet too short
         448558 connection requests
         431981 connection accepts
         765275 connections established (including accepts)
         896982 connections closed (including 14571 drops)
         86330 embryonic connections dropped
         25482179 segments updated rtt (of 25623298 attempts)
         106040 retransmit timeouts
                 145 connections dropped by rexmit timeout
         6329 persist timeouts
         37659 keepalive timeouts
                 15537 keepalive probes sent
                 16876 connections dropped by keepalive 
udp:
         145045792 packets sent
         217665429 packets received
         0 incomplete headers
         0 bad data length fields
         0 bad checksums
         5359 full sockets
         28004209 for no port (27999634 broadcasts, 0 multicasts)
         0 input packets missed pcb cache

2.4.6 Gathering NFS Server Side Information Using ps axlmp

On an NFS server system, the nfsd daemon spawns several I/O threads to service I/O requests from clients. A sufficient number of threads must be configured to handle the number of concurrent requests typical for the server. The default configuration of eight UDP and eight TCP threads is enough for a workstation exporting a small number of directories to a handful of clients. For a heavily used NFS server, up to 128 server threads can be configured and distributed over TCP and UDP. Monitor the NFS server threads on your server to determine if more threads are required to service the NFS load.

To display idle I/O threads on a server system, enter:


#  /usr/ucb/ps axlmp 0 | grep -v grep | grep -c nfs_udp

#  /usr/ucb/ps axlmp 0 | grep -v grep | grep -c nfs_tcp

This will display a count of the number of sleeping and idle UDP and TCP threads, respectively. If the number of sleeping and idle threads is zero or lower, you might improve NFS performance by increasing the number of threads. See Section 5.4.1 or nfsd(8) for more information.

2.4.7 Gathering NFS Client Side Information Using nfsiod

On an NFS client system, the nfsiod daemon spawns several I/O threads to service asynchronous I/O requests to the server. The I/O threads improve the performance of both NFS read and writes. The optimum number of I/O threads depends on many variables, such as how quickly the client will be writing, how many files will be accessed simultaneously, and the characteristics of the NFS server. For small clients, the default of seven threads is sufficient. For larger servers with heavy NFS client activity, more client threads may be necessary. Monitor the NFS client threads on your server to determine if more threads are required to service the NFS load.

To display idle I/O threads on a client system, enter:


# ps axlm | grep -v grep | grep -c nfsiod

This will display a count of the number of sleeping and idle I/O threads. If the number of sleeping and idle threads is often zero or lower, you might improve NFS performance by increasing the number of threads. See Section 5.4.2 or nfsiod(8) for more information.

2.4.8 Monitoring Incoming Network Traffic to an NFS Server by Using the nfswatch Command

The nfswatch program monitors all incoming network traffic to a NFS file server and divides it into several categories. The number and percentage of packets received in each category is displayed on the screen in a continuously updated display. The nfswatch command can usually be run without options and will produce useful results. For example:

#  /usr/sbin/nfswatch
Interval packets:            628 (network)          626 (to host)         0 (dropped)
Total packets:              6309 (network)         6307 (to host)         0 (dropped)
                              Monitoring packets from interface ee0
                      int   pct    total                      	   unt   pct    total
ND Read                0     0%       0     TCP Packets            0    0%        0
ND Write               0     0%       0     UDP Packets          139   10%      443
NFS Read               2     0%       2     ICMP Packets           1    0%        1
NFS Write              0     0%       0     Routing Control       81    6%      204
NFS Mount              2     0%       3     Address Resolution   109    8%      280
YP/NIS/NIS+            0     0%       0     Reverse Addr Resol    15    1%       43
RPC Authorization      7     1%      12     Ethernet/FDDI Bdcst  406   30%     1152
Other RPC Packets      4     0%      10     Other Packets       1087   80%     3352
 
											18 NFS Procedures [10 not displayed]              more->
 Procedure           int   pct   total  completed  avg(msec)  std dev   max resp
  CREATE               0    0%      0
  GETATTR              1   50%      1         1      3.90                3.90
  GETROOT              0    0%      0
  LINK                 0    0%      0
  LOOKUP               0    0%      0
  MKDIR                0    0%      0
  NULLPROC             0    0%      0
  READ                 0    0%      0

Note

The nfswatch command monitors and displays data for only file systems mounted with NFS Version 2.0.

Your kernel must be configured with the packetfilter option. After kernel configuration, any user can invoke nfswatch once the superuser has enabled promiscuous-mode operation using the pfconfig command. See packetfilter(7) for more information.

2.5 Additional Tools for Monitoring Performance

You may want to set up a routine to continuously monitor system performance. Some monitoring tools will alert you when serious problems occur (for example, mail). It is important that you choose a monitoring tool that has low overhead to obtain accurate performance information.

Table 2-1 describes the tools that you use to continuously monitor performance.

Table 2-1: Tools for Continuous Performance Monitoring

Name	Description
Performance Visualizer	Graphically displays the performance of all significant components of a parallel system. Using Performance Visualizer, you can monitor the performance of all the member systems in a cluster. It monitors the performance of several systems simultaneously, it allows you to see the impact of a parallel application on all the systems, and to ensure that the application is balanced across all systems. When problems are identified, you can change the application code and use Performance Visualizer to evaluate the impact of these changes. Performance Visualizer is a Tru64 UNIX layered product and requires a license. It also helps you identify overloaded systems, underutilized resources, active users, and busy processes. You can choose to look at all of the hosts in a parallel system or at individual hosts. See the Performance Visualizer documentation for more information.
`monitor`	Collects a variety of performance data on a running system and either displays the information or saves it to a binary file. The `monitor` utility is available on the Tru64 UNIX Freeware CD-ROM. See http://www.tru64unix.compaq.com/demos/ossc-v51a/html/monitor.htm for more information.
`top`	Provides continuous reports on the state of the system, including a list of the processes using the most CPU resources. The `top` command is available on the Tru64 UNIX Freeware CD-ROM. See ftp://eecs.nwu.edu/pub/top for more information.
`xload`	Displays the system load average in a histogram that is periodically updated. See `xload`(1X) for information.
`volstat`	Provides information about activity on volumes, plexes, subdisks, and disks under LSM control. The `volstat` utility reports statistics that reflect the activity levels of LSM objects since boot time or since you reset the statistics. See Section 9.3 for information.
`volwatch`	Monitors LSM for failures in disks, volumes, and plexes, and sends mail if a failure occurs. See Section 9.3 for information.

2.6 Gathering Profiling and Debugging Information

You can use profiling to identify sections of application code that consume large portions of execution time, and you can use these tools to profile and debug the kernel. To improve performance, concentrate on improving the coding efficiency of those time-intensive sections.

Table 2-2 describes the commands you can use to obtain information about applications. Detailed information about these tools is located in the Programmer's Guide and the Kernel Debugging manual.

In addition, prof_intro(1) provides an overview of application profilers, profiling, optimization, and performance analysis.

Table 2-2: Application Profiling and Debugging Tools

Name	Use	Description
`atom`	Profiles applications	Consists of a set of prepackaged tools (`third`, `hiprof`, or `pixie`) that can be used to instrument applications for profiling or debugging purposes. The `atom` toolkit also consists of a command interface and a collection of instrumentation routines that you can use to create custom tools for instrumenting applications. See the Programmer's Guide and `atom`(1) for more information.
`third`	Checks memory access and detects memory leaks in applications	Performs memory access checks and memory leak detection of C and C++ programs at run time, by using the `atom` tool to add code to executable and shared objects. The Third Degree tool instruments the entire program, including its referenced libraries. See `third`(1) for more information.
`hiprof`	Produces a profile of procedure execution times in an application	An `atom`-based program profiling tool that produces a flat profile, which shows the execution time spent in any given procedure, and a hierarchical profile, which shows the time spent in a given procedure and all of its descendents. The `hiprof` tool uses code instrumentation instead of program counter (PC) sampling to gather statistics. The `gprof` command is usually used to filter and merge output files and to format profile reports. See `hiprof`(1) for more information.
`pixie`	Profiles basic blocks in an application	Produces a profile showing the number of times each instruction was executed in a program. The information can be reported as tables or can be used to automatically direct later optimizations by using the `-feedback,` `-om,` or `-cord` options in the C compiler (see `cc`(1)). The `pixie` profiler reads an executable program, partitions it into basic blocks, and writes an equivalent program containing additional code that counts the execution of each basic block. The `pixie` utility also generates a file containing the address of each of the basic blocks. When you run this `pixie`-generated program, it generates a file containing the basic block counts. The `prof` and `pixstats` commands can analyze these files. See `pixie`(1) for more information.
`prof`	Analyzes profiling data	Analyzes profiling data and produces statistics showing which portions of code consume the most time and where the time is spent (for example, at the routine level, the basic block level, or the instruction level). The `prof` command uses as input one or more data files generated by the `kprofile`, `uprofile`, or `pixie` profiling tools. The `prof` command also accepts profiling data files generated by programs linked with the `-p` switch of compilers such as `cc`. The information produced by `prof` allows you to determine where to concentrate your efforts to optimize source code. See `prof`(1) for more information.
`gprof`	Analyzes profiling data and displays procedure call information and statistical program counter sampling in an application	Analyzes profiling data and allows you to determine which routines are called most frequently, and the source of the routine call, by gathering procedure call information and performing statistical program counter (PC) sampling. The `gprof` tool produces a flat profile of the routines' CPU usage. To produce a graphical execution profile of a program, the tool uses data from PC sampling profiles, which are produced by programs compiled with the `cc -pg` command, or from instrumented profiles, which are produced by programs modified by the `atom -tool hiprof` command. See `gprof`(1) for more information.
`uprofile`	Profiles user code in an application	Profiles user code using performance counters in the Alpha chip. The `uprofile` tool allows you to profile only the executable part of a program. The `uprofile` tool does not collect information on shared libraries. You process the performance data collected by the tool with the `prof` command. See the Kernel Debugging manual or `uprofile`(1) for more information.
`kprofile`	Produces a program counter profile of a running kernel	Profiles a running kernel using the performance counters on the Alpha chip. You analyze the performance data collected by the tool with the `prof` command. See `kprofile`(1) for more information.
Visual Threads	Identifies bottlenecks and performance problems in multithreaded applications	Enables you to analyze and refine your multithreaded applications. You can use Visual Threads to identify bottlenecks and performance problems, and to debug potential thread-related logic problems. Visual Threads uses rule-based analysis and statistics capabilities and visualization techniques. Visual Threads is licensed as part of the Developers' Toolkit for Tru64 UNIX.
`dbx`	Debugs running kernels, programs, and crash dumps, and examines and temporarily modifies kernel variables	Provides source-level debugging for C, Fortran, Pascal, assembly language, and machine code. The `dbx` debugger allows you to analyze crash dumps, trace problems in a program object at the source-code level or at the machine-code level, control program execution, trace program logic and flow of control, and monitor memory locations. Use `dbx` to debug kernels, debug stripped images, examine memory contents, debug multiple threads, analyze user code and applications, display the value and format of kernel data structures, and temporarily modify the values of some kernel variables. See `dbx`(8) for more information.
`kdbx`	Debugs running kernels and crash dumps	Allows you to examine a running kernel or a crash dump. The `kdbx` debugger, a frontend to the `dbx` debugger, is used specifically to debug kernel code and display kernel data in a readable format. The debugger is extensible and customizable, allowing you to create commands that are tailored to your kernel debugging needs. You can also use extensions to check resource usage (for example, CPU usage). See `kdbx`(8) for more information.
`ladebug`	Debugs kernels and applications	Debugs programs and the kernel and helps locate run-time programming errors. The `ladebug` symbolic debugger is an alternative to the `dbx` debugger and provides both command-line and graphical user interfaces and support for debugging multithreaded programs. See the Ladebug Debugger Manual and `ladebug`(1) for more information.
`lsof`	Displays open files	Displays information about files that are currently opened by the running processes. The `lsof` is available on the Tru64 UNIX Freeware CD-ROM.