You must gather a wide variety of performance information to identify performance problems or areas where performance is deficient.
Some symptoms or indications of performance problems are obvious. For example, applications complete slowly or messages appear on the console, indicating that the system is out of resources. Other problems or performance deficiencies are not obvious and can be detected only by monitoring system performance.
There are various commands and utilities that you can use to gather system performance information. It is important that you gather statistics under a variety of conditions. Comparing sets of data will help you diagnose performance problems.
For example, to determine how an application affects system performance, you can gather performance statistics without the application running, start the application, and then gather the same statistics. Comparing different sets of data will enable you to identify whether the application is consuming memory, CPU, or disk I/O resources.
In addition, you must gather information at different stages during the application processing to obtain accurate performance information. For example, an application may be I/O-intensive during one stage and CPU-intensive during another.
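A minimal sketch of this before-and-after approach, using vmstat with arbitrary intervals and hypothetical output file names (any interval-based statistics command can be substituted):
#vmstat 10 30 > /var/tmp/stats.baseline
(start the application and let it reach the stage you want to measure)
#vmstat 10 30 > /var/tmp/stats.underload
Comparing the two files shows how the application changes free memory, paging activity, and CPU idle time.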
This chapter describes how to perform the following tasks:
Using a methodology approach to solve performance problems (Section 2.1)
Obtaining information about system events (Section 2.2)
Using the primary tools for gathering information (Section 2.3)
Using secondary tools to gather information (Section 2.4)
Continuously monitoring performance (Section 2.5)
After you identify a performance problem or an area in which performance is deficient, you can identify an appropriate solution. See Part 2 for information about tuning by application, and see Part 3 for information about tuning by component to improve system performance.
2.1 Methodology Approach to Solving Performance Problems
There are five recommended steps to diagnose a performance problem. Before you begin, you must become familiar with the terminology and concepts relating to performance and availability. See Chapter 1 for more information.
In addition, you must understand how your application utilizes system resources, because not all configurations and tuning guidelines are appropriate for all types of workloads. For example, you must determine if your applications are memory-intensive or CPU-intensive, or if they perform many disk or network operations. See Section 1.8 for information about identifying a resource model for your configuration.
To diagnose performance problems, follow these steps:
Before you begin, you must understand your system hardware configuration. To identify and manage your hardware components, use the hwmgr utility (see Section 1.1 or Section 2.3.1 for more information).
Run the sys_check utility to perform an analysis of the operating system parameters and kernel attributes that tune the performance of your system. This tool can be used to diagnose performance problems. See Section 2.3.3 for more information.
Verify your software configuration and check for configuration errors. You can use sys_check to diagnose performance problems. See Section 2.2 for more information about obtaining information about system events.
Determine what type of application you are using and categorize your application as an Oracle, Network File System, or internet server application. If you are tuning your system by applications, see the following chapters:
Tuning Oracle (Chapter 4)
Tuning Network File Systems (Chapter 5)
Tuning Internet Servers (Chapter 6)
Find the bottleneck or the system resource that is causing a performance degradation. Determine the performance problem by plotting the following information:
CPU: idle time, system time, and user time
Memory: sum of active and inactive pages that are being used by processes, the UBC, and wired memory
Disk I/O: transactions per second and blocks per second
Use the collect command to gather performance data while the system is under load or manifesting the performance problem. After you gather the performance information, use the collgui graphical interface to plot the data. For information on how to use collgui, see Section 2.3.2.2. For more information about identifying a resource model for your workload, see Section 1.8.
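For example, a hedged sketch of this step, assuming a 10-second sampling interval and a hypothetical output file name:
#/usr/sbin/collect -i 10 -f /var/tmp/perfdata
#collgui <collect datafile>
Run the first command while the problem is occurring, stop it with Ctrl/C, and then point collgui at the resulting data file.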
2.2 Obtaining Information About System Events
Set up a routine to continuously monitor system events that will alert you when serious problems occur. Periodically examining event and log files allows you to correct a problem before it affects performance or availability, and helps you diagnose performance problems.
The system event logging facility and the binary event logging facility log system events. The system event logging facility uses the syslog function to log events in ASCII format. The syslogd daemon collects the messages logged from the various kernel, command, utility, and application programs. This daemon then writes the messages to a local file or forwards the messages to a remote system, as specified in the /etc/syslog.conf event logging configuration file. Periodically monitor these ASCII log files for performance information.
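For example, assuming the default syslogd destinations under /var/adm/syslog.dated (the log paths can vary with your /etc/syslog.conf), a quick scan of recent kernel messages might look like this:
#grep -i error /var/adm/syslog.dated/*/kern.log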
The binary event logging facility detects hardware and software events in the kernel and logs detailed information in binary format records. The binary event logging facility uses the binlogd daemon to collect various event log records. The daemon then writes these records to a local file or forwards the records to a remote system, as specified in the /etc/binlog.conf default configuration file.
You can examine the binary event log files by using the following methods:
The Event Manager (EVM) uses the binary log files to communicate event information to interested parties for immediate or later action. See Section 2.2.1 for more information about EVM.
DECevent is a rules-based translation and reporting utility that provides event translation for binary error log events. EVM uses DECevent's translation facility, dia, to translate binary error log events into human-readable form.
Compaq Analyze performs a similar role on some EV6 series processors.
For more information about DECevent, see Section 2.2.2 or dia(8). For more information about Compaq Analyze, see Section 2.2.3 or ca(8).
In addition, we recommend that you configure crash dump support into the system. Significant performance problems may cause the system to crash, and crash dump analysis tools can help you diagnose performance problems.
See the System Administration manual for more information about event logging and crash dumps.
2.2.1 Using Event Manager
Event Manager (EVM) allows you to obtain event information and communicate this information to interested parties for immediate or later action. Event Manager provides the following features:
Enables kernel-level and user-level processes and components to post events.
Enables event consumers, such as programs and users, to subscribe for notification when selected events occur.
Supports existing event channels such as the binary logger daemon.
Provides a graphical user interface (GUI) that enables users to review events.
Provides an application programming interface (API) library that enables programmers to write routines that post or subscribe to events.
Supports command-line utilities for administrators to configure and manage the EVM environment and for users to post or retrieve events.
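For example, the command-line utilities can be combined to review recent important events; a hedged sketch (the filter syntax and priority threshold are assumptions to adapt to your needs):
#evmget -f '[priority >= 300]' | evmshow -t '@timestamp @@'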
See the System Administration manual for more information about EVM.
2.2.2 Using DECevent
The DECevent utility continuously monitors system events through the binary event logging facility, decodes events, and tracks the number and the severity of events logged by system devices. DECevent analyzes system events, attempts to isolate failing device components, and provides a notification mechanism (for example, mail) that can warn of potential problems.
To use DECevent's analysis and notification features, you must register a license; alternatively, these features may be available as part of your service agreement. A license is not needed to use DECevent to translate the binary log file to ASCII format.
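For example, a minimal sketch of translating the binary error log to readable text, assuming dia reads the default binary error log when invoked with no arguments:
#dia | more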
See the DECevent Translation and Reporting Utility manual for more information.
2.2.3 Using Compaq Analyze
Compaq Analyze is a fault analysis utility that provides analysis of single error/fault events as well as multiple-event and complex analysis. Compaq Analyze provides system analysis that uses other error/fault data sources in addition to the traditional binary error log.
Compaq Analyze provides background automatic analysis by monitoring the active error log and processing events as they occur. The events in the error log file are checked against the analysis rules. If one or more of the events in the error log file meets the conditions specified in the rules, the analysis engine collects the error data and creates a problem report containing a description of the problem and any corrective actions required. Once the problem report is created, it is distributed in accordance with your notification preferences.
Note that recent Alpha EV6 processors are supported only by Compaq Analyze and not DECevent.
You can download the latest version of Compaq Analyze and other Web Based Enterprise Service Suite (WEBES) tools and documentation from the following location:
http://www.compaq.com/support/svctools/webes
Download the kit from the Web site, saving it to /var/tmp/webes. Unpack the kit using a command similar to the following:
#tar -xvf <tar file name>
Use the following command to install the Compaq Web Based Enterprise Service Suite:
#setld -l /var/tmp/webes/kit
During the installation, you can safely select the default options. However, you might not want to install all the optional WEBES tools; only Compaq Analyze is used by EVM. See the separate Compaq Analyze documentation and ca(8) for more information.
2.2.4 Using System Accounting and Disk Quotas
Set up system accounting, which allows you to obtain information about the resources consumed by each user. Accounting can track the amount of CPU usage and connect time, the number of processes spawned, memory and disk usage, the number of I/O operations, and the number of print operations.
You should establish Advanced File System (AdvFS) and UNIX file system (UFS) disk quotas to track and control disk usage. Disk quotas allow you to limit the disk space available to users and to monitor disk space usage.
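For example, hedged sketches of these tasks (the user and file system names are hypothetical):
#edquota user1
#repquota -v /usr
The first command edits a user's disk quota; the second summarizes quota usage for a file system.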
See the System Administration manual for information about system accounting and UFS disk quotas. See the AdvFS Administration manual for information about AdvFS quotas.
2.3 Primary Tools for Gathering Information
The following utilities are the primary tools for gathering performance information:
hwmgr utility (Section 2.3.1)
collect utility (Section 2.3.2)
sys_check utility (Section 2.3.3)
2.3.1 Gathering Hardware Information Using the hwmgr Utility
The principal command that you use to manage hardware is the hwmgr command-line interface (CLI). Other interfaces, such as the SysMan tasks, provide a limited subset of the features provided by hwmgr. Using the hwmgr command enables you to connect to an unfamiliar system, obtain information about its component hierarchy, and set attributes for specific components.
Use the hwmgr view hierarchy command to view the hierarchy of hardware within a system. This command enables you to find what adapters are controlling devices, and to discover where adapters are installed on buses. The following example shows the hardware component hierarchy on a small system that is not part of a cluster:
#hwmgr view hierarchy
HWID: Hardware component hierarchy
----------------------------------------------
   1: platform AlphaServer 800 5/500
   2: cpu CPU0
   4: bus pci0
   5: scsi_adapter isp0
   6: scsi_bus scsi0
  18: disk bus-0-targ-0-lun-0 dsk0
  19: disk bus-0-targ-4-lun-0 cdrom0
  20: graphics_controller trio0
   8: bus eisa0
   9: serial_port tty00
  10: serial_port tty01
  11: parallel_port lp0
  12: keyboard PCXAL
  13: pointer PCXAS
  14: fdi_controller fdi0
  15: disk fdi0-unit-0 floppy0
  16: network tu0
  17: network tu1
[output truncated]
Some components might appear as multiple entries in the hierarchy. For example, if a disk is on a SCSI bus that is shared between two adapters, the hierarchy shows two entries for the same device. You can obtain similar views of the system hardware hierarchy by using the SysMan Station GUI. See the System Administration manual for information on running the SysMan Menu. Section 13.2.2.6 describes how to use the graphical interface. See the online help for more information on valid data entries.
To view a specific component in the hierarchy, use the grep command. The following example shows output for the CPU hardware components:
#hwmgr view hierarchy | grep "cpu"
   2: cpu qbb-0 CPU0
   3: cpu qbb-0 CPU1
   4: cpu qbb-0 CPU2
   5: cpu qbb-0 CPU3
   7: cpu qbb-1 CPU5
   8: cpu qbb-1 CPU6
   9: cpu qbb-1 CPU7
  10: cpu qbb-2 CPU8
  11: cpu qbb-2 CPU9
  12: cpu qbb-2 CPU10
  13: cpu qbb-2 CPU11
The hierarchy display shows the currently registered hardware components that have been placed in the system hierarchy. Components that have a flagged status are identified in the command output with the following codes:
(!) warning
(X) critical
(-) inactive
See hwmgr(8) for more information.
To view all of the SCSI devices attached to the system (disks and tapes), use the following command:
#hwmgr show scsi
To view how many RAID array controllers can be seen from the host, use the following command:
#hwmgr show scsi | grep scp
        SCSI      DEVICE    DEVICE  DRIVER  NUM   DEVICE  FIRST
 HWID:  DEVICEID  HOSTNAME  TYPE    SUBTYPE OWNER PATH    FILE   VALID PATH
-------------------------------------------------------------------------
  266:  30        wf99      disk    none    0     20      scp0   [2/0/7]
  274:  38        wf99      disk    none    0     20      scp1   [2/1/7]
  282:  46        wf99      disk    none    0     20      scp2   [2/2/7]
  290:  54        wf99      disk    none    0     20      scp3   [2/3/7]
  298:  62        wf99      disk    none    0     20      scp4   [2/4/7]
  306:  70        wf99      disk    none    0     20      scp5   [2/5/7]
  314:  78        wf99      disk    none    0     20      scp6   [2/6/7]
  322:  86        wf99      disk    none    0     20      scp7   [2/7/7]
  330:  94        wf99      disk    none    0     20      scp8   [2/8/7]
  338:  102       wf99      disk    none    0     20      scp9   [2/9/7]
  346:  110       wf99      disk    none    0     20      scp10  [2/10/7]
  354:  118       wf99      disk    none    0     20      scp11  [2/11/7]
The scp in the previous example represents the service control port, which is the address at which a RAID array (HSG) presents itself for administrative and diagnostic purposes.
For more information about the hwmgr command, see the Hardware Management manual or hwmgr(8).
2.3.2 Gathering System Information by Using the collect Utility
The collect utility is a system monitoring tool that records or displays specific operating system data. It gathers vital system performance information for specific subsystems, such as file systems, memory, disk, process data, CPU, network, message queues, LSM, and others. The collect utility creates minimal system overhead and is highly reliable. It also provides extensive and flexible switches to control data collection and playback. You can display data at the terminal, or store it in either a compressed or uncompressed data file. Data files can be read and manipulated from the command line.
To ensure that the collect utility delivers reliable statistics, it locks itself into memory by using the page-locking function plock(), and by default cannot be swapped out by the system. It also raises its priority by using the priority function nice(). However, these measures should not have any impact on a system under normal load, and they should have only a minimal impact on a system under extremely high load.
You can invoke the collect utility from the collgui graphical user interface or from the command line. If you are using the graphical user interface, run cfilt on the command line to filter collect's data for use by collgui and user scripts. For more information, see collect(8).
The following example shows how to run a full data collection and display the output at the terminal using the standard interval of 10 seconds:
#/usr/sbin/collect
The output of this command is similar to that of monitoring commands such as vmstat(1), iostat(1), netstat(1), and volstat(8).
Use the -s option to select subsystems for inclusion in the data collection, or use the -e (exclude) option to exclude subsystems from the data collection.
The following output specifies only data from the file system subsystem:
#/usr/sbin/collect -sf
# FileSystem Statistics #
FS Filesystem        Capacity  Free
 0 root_domain#root       128    30
 1 usr_domain#usr         700   147
 3 usr_domain#var         700   147
The option letters map to the following subsystems:
p Specifies the process data
m Specifies the memory data
d Specifies the disk data
l Specifies the LSM volume data
n Specifies the network data
c Specifies the CPU data
f Specifies the file system data
t Specifies the terminal (tty) data
When you are collecting process data, use the -S (sort) and -n X (number) options to sort data by percentage of CPU usage and to save only X processes. Target specific processes by using the -P list option, where list is a comma-separated list of process identifiers.
If there are many (greater than 100) disks connected to the system being monitored, use the -D option to monitor a particular set of disks.
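For example, a hedged sketch combining these options (the process count and disk names are hypothetical, and option spacing can vary; see collect(8)):
#/usr/sbin/collect -sp -S -n 10
#/usr/sbin/collect -sd -D dsk0,dsk1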
Use the collect utility with the -p option to read multiple binary data files and play them back as one stream, with monotonically increasing sample numbers. You can also combine multiple binary input files into one binary output file, by using the -p option with the input files and the -f option with the output file. The collect utility combines input files in whatever order you specify on the command line, so the input files must be in strict chronological order if you want to do further processing of the combined output file. You can also combine binary input files from different systems, made at different times, with differing subsets of subsystems for which data has been collected. Filtering options such as -e, -s, -P, and -D can be used with this utility.
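For example, a hedged sketch of merging two data files into one (the file names are hypothetical):
#/usr/sbin/collect -p monday.cdat tuesday.cdat -f week.cdat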
See collect(8) for more information.
2.3.2.1 Configuring collect to Automatically Start on System Reboot
You can configure collect to automatically start when the system reboots. This is particularly useful for continuous monitoring of subsystems and processes, and it is essential for diagnosing problems and performance issues. On each system, use the rcmgr command with the set operation to configure the following values in the /etc/rc.config* file. For example:
%rcmgr set COLLECT_AUTORUN 1
A value of 1 sets collect to automatically start when the system reboots; a value of 0 (the default) causes collect to not start on reboot. Use the COLLECT_ARGS variable to specify the command-line arguments that collect starts with. For example:
%rcmgr set COLLECT_ARGS " -ol -i 10:60 \ -f /var/adm/collect.dated/collect -H d0:5,1w "
A null value causes collect to start with the following default values:
-i 60,120 -f /var/adm/collect.dated -W 1h -M 10,15
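To verify the values that are currently configured, you can use the rcmgr get operation; for example:
%rcmgr get COLLECT_AUTORUN
%rcmgr get COLLECT_ARGS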
Output from collect should be written to a local file system, not an NFS-mounted file system, to prevent important diagnostic data from being lost during system or network problems and to prevent any consequent system problems arising from collect output being blocked by a nonresponsive file system.
See rcmgr(8) for more information.
2.3.2.2 Plotting collect Datafiles
Use either collgui (a graphical interface for the collect command) or cfilt (a filter for the collect command) to export collect datafiles to Excel.
Note
To run collgui, you need Perl and Perl/Tk. They are freely downloadable from the collect FTP site: ftp://ftp.digital.com/pub/DEC/collect
To plot information using the collgui graphical interface, follow these steps:
Run collgui in debug mode:
>> collgui -d "collect datafile"
Select the desired subsystem and click on Display.
Return to the shell in which collgui was started. You will see that collgui has created a file in the /var/tmp directory. The file name is collgui.xxxx, where xxxx are integers.
The data file (collgui.xxxx) is exportable to Excel.
Copy it to a Windows system.
On your Windows system, start Excel and open
collgui.xxxx.
You might have to change the Files of type: field to "All files (*.*)".
In Excel 2000, a text import wizard will pop up.
In Excel 2000, select data type Delimited, then select Next.
In Excel 2000, select Tabs and Spaces as Delimiters, then select Next.
In the Data Preview pane, select the columns that you want to import by using the Shift key, then select Finish.
You should now see the columns displayed in your worksheet.
To plot information using the cfilt collect filter, follow these steps:
Use cfilt to generate the data file. For example, to display the wait time and user+system time for the single CPU field, and the system time and physical memory used for the process data, enter the following command:
cfilt -f "collect datafile" 'sin:WAIT:USER+SYS' 'pro:Systim#:RSS#' > /var/tmp/collgui.xxxx
Copy the collgui data file to your Windows system. Follow steps 4 through 9 in the previous collgui procedure.
For more information, see the following Web site:
http://www.tru64unix.compaq.com/collect/collect_faq.html
2.3.3 Checking the Configuration by Using the sys_check Utility
The sys_check utility performs an analysis of operating system parameters and kernel attributes that tune the performance of your system. The utility checks memory and CPU resources, provides performance data and lock statistics for SMP systems and for kernel profiles, and outputs any warnings and tuning guidelines. The sys_check utility creates an HTML file that describes the system configuration and can be used to diagnose problems. The report generated by sys_check provides warnings if it detects problems with any current settings.
Use the sys_check utility in conjunction with the event management and system monitoring tools to provide a complete overview and control of system status.
Consider applying the sys_check utility's configuration and tuning guidelines before applying any advanced tuning guidelines.
Note
You may experience impaired system performance while running the sys_check utility. Invoke the utility during off-peak hours to minimize the performance impact.
You can invoke the sys_check utility from the SysMan graphical user interface or from the command line. If you specify sys_check without any command-line options, it performs a basic system analysis and creates an HTML file with configuration and tuning guidelines. Options that you can specify at the command line include:
The -all option provides information about all subsystems, including security information and setld inventory verification.
The -perf option provides only performance data and excludes configuration data. This option may take 5 to 10 minutes to complete.
The -escalate option creates escalation files required for reporting problems.
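For example, because the report is HTML, a typical hedged invocation redirects the output to a file that you can open in a browser (the path is arbitrary):
#/usr/sbin/sys_check -perf > /var/tmp/sys_check_perf.html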
See sys_check(8) for more information.
2.4 Secondary Tools for Gathering Information
The following utilities are the secondary tools used to gather performance information:
Gathering system information:
lockinfo utility (Section 2.4.1)
sched_stat utility (Section 2.4.2)
Gathering network information:
nfsstat utility (Section 2.4.3)
tcpdump utility (Section 2.4.4)
netstat command (Section 2.4.5)
ps axlmp command (Section 2.4.6)
nfsiod daemon (Section 2.4.7)
nfswatch command (Section 2.4.8)
2.4.1 Gathering Locking Statistics by Using the lockinfo Utility
The lockinfo utility collects and displays locking statistics for the kernel SMP locks. It uses the /dev/lockdev pseudodriver to collect data. Locking statistics can be gathered when the lockmode attribute for the generic subsystem is set to 2 (the default), 3, or 4.
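You can confirm the current setting with the sysconfig command; for example:
#sysconfig -q generic lockmode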
To gather statistics with lockinfo, follow these steps:
Start up a system workload and wait for it to get to a steady state.
Start lockinfo with sleep as the specified command and some number of seconds as the specified cmd_args. This causes lockinfo to gather statistics for the length of time it takes the sleep command to execute.
Based on the first set of results, use lockinfo again to request more specific information about any lock class that shows results, such as a large percentage of misses, that are likely to cause a system performance problem.
The following example shows how to gather locking statistics for each processor over a period of 60 seconds:
#lockinfo -percpu sleep 60
hostname:    sysname.node.corp.com
lockmode:    4 (SMP DEBUG with kernel mode preemption enabled)
processors:  4
start time:  Wed Jun 9 14:45:08 1999
end time:    Wed Jun 9 14:46:08 1999
command:     sleep 60

          tries    reads  trmax  misses  percent  sleeps  waitmax  waitsum
                                          misses          seconds  seconds
bsBuf.bufLock (S)
   0    1400786        0  45745   47030      3.4       0  0.00007  0.15526
   1    1415828        0  45367   47538      3.4       0  0.00006  0.15732
   2    1399462        0  33076   48507      3.5       0  0.00005  0.15907
   3    1398336        0  31753   48867      3.5       0  0.00005  0.15934
-----------------------------------------------------------------------
 ALL    5614412        0  45745  191942      3.4       0  0.00007  0.63099

lock.l_lock (S)
   0    1360769        0  40985   18460      1.4       0  0.00005  0.04041
   1    1375384        0  20720   18581      1.4       0  0.00005  0.04124
   2    1375122        0  20657   18831      1.4       0  0.00009  0.04198
-----------------------------------------------------------------------
 ALL    5483049        0  40985   74688      1.4       0  0.00009  0.16526
...
inifaddr_lock (C)
   0          0        0      1       0      0.0       0  0.00000  0.00000
   1          1        1      1       0      0.0       0  0.00000  0.00000
   2          0        0      1       0      0.0       0  0.00000  0.00000
   3          0        0      1       0      0.0       0  0.00000  0.00000
-----------------------------------------------------------------------
 ALL          1        1      1       0      0.0       0  0.00000  0.00000

total simple_locks  = 28100338    percent unknown = 0.0
total rws_locks     = 1466        percent reads = 100.0
total complex_locks = 2716146     percent reads = 33.2    percent unknown = 0.0
A locking problem is simply an indication that there is high contention for a certain type of resource. If contention exists for a lock related to I/O, and a particular application is spawning many processes that compete for the same files and directories, application or database storage design adjustments might be in order.
Applications that use System V semaphores can sometimes encounter locking contention if they create a very large number of semaphores in a single semaphore set because the kernel uses locks on each set of semaphores. In this case, performance improvements might be realized by changing the application to use more semaphore sets, each with a smaller number of semaphores.
See lockinfo(8) for more information.
2.4.2 Gathering CPU Usage and Process Statistics by Using the sched_stat Utility
The sched_stat utility helps determine how well the system load is distributed among CPUs, what kinds of jobs are getting (or not getting) enough cycles on each CPU, and how well cache affinity is being maintained for these jobs. The sched_stat utility displays CPU usage and process-scheduling statistics for SMP and NUMA platforms.
To gather statistics with sched_stat, follow these steps:
Start up a system workload and wait for it to get to a steady state.
Start sched_stat with sleep as the specified command and some number of seconds as the specified cmd_arg. This causes sched_stat to gather statistics for the length of time it takes the sleep command to execute. For example, the following command causes sched_stat to collect statistics for 60 seconds and then print a report:
#/usr/sbin/sched_stat sleep 60
If you include options on the command line, only statistics for the specified options are reported. If you specify the command without any options, all options except for -R are assumed.
See sched_stat(8) for more information.
2.4.3 Displaying Network and NFS Statistics by Using the nfsstat Utility
To display or reinitialize NFS and remote procedure call (RPC) statistics for clients and servers, including the number of packets that had to be retransmitted (retrans) and the number of times a reply transaction ID did not match the request transaction ID (badxid), enter:
#/usr/ucb/nfsstat
Information similar to the following is displayed:
Server rpc:
calls      badcalls   nullrecv   badlen     xdrcall
38903      0          0          0          0

Server nfs:
calls      badcalls
38903      0

Server nfs V2:
null       getattr    setattr    root       lookup     readlink   read
5 0%       3345 8%    61 0%      0 0%       5902 15%   250 0%     1497 3%
wrcache    write      create     remove     rename     link       symlink
0 0%       1400 3%    549 1%     1049 2%    352 0%     250 0%     250 0%
mkdir      rmdir      readdir    statfs
171 0%     172 0%     689 1%     1751 4%

Server nfs V3:
null       getattr    setattr    lookup     access     readlink   read
0 0%       1333 3%    1019 2%    5196 13%   238 0%     400 1%     2816 7%
write      create     mkdir      symlink    mknod      remove     rmdir
2560 6%    752 1%     140 0%     400 1%     0 0%       1352 3%    140 0%
rename     link       readdir    readdir+   fsstat     fsinfo     pathconf
200 0%     200 0%     936 2%     0 0%       3504 9%    3 0%       0 0%
commit
21 0%

Client rpc:
calls      badcalls   retrans    badxid     timeout    wait       newcred
27989      1          0          0          1          0          0
badverfs   timers
0          4

Client nfs:
calls      badcalls   nclget     nclsleep
27988      0          27988      0

Client nfs V2:
null       getattr    setattr    root       lookup     readlink   read
0 0%       3414 12%   61 0%      0 0%       5973 21%   257 0%     1503 5%
wrcache    write      create     remove     rename     link       symlink
0 0%       1400 5%    549 1%     1049 3%    352 1%     250 0%     250 0%
mkdir      rmdir      readdir    statfs
171 0%     171 0%     713 2%     1756 6%

Client nfs V3:
null       getattr    setattr    lookup     access     readlink   read
0 0%       666 2%     9 0%       2598 9%    137 0%     200 0%     1408 5%
write      create     mkdir      symlink    mknod      remove     rmdir
1280 4%    376 1%     70 0%      200 0%     0 0%       676 2%     70 0%
rename     link       readdir    readdir+   fsstat     fsinfo     pathconf
100 0%     100 0%     468 1%     0 0%       1750 6%    1 0%       0 0%
commit
10 0%
The ratio of timeouts to calls (which should not exceed 1 percent) is the most important thing to look for in the NFS statistics. A timeout-to-call ratio greater than 1 percent can have a significant negative impact on performance. See Chapter 10 for information on how to tune your system to avoid timeouts.
To display NFS and RPC information in intervals (seconds), enter:
#/usr/ucb/nfsstat -s -i number
The following example displays NFS and RPC information in 10-second intervals:
#/usr/ucb/nfsstat -s -i 10
If you are monitoring an experimental situation with nfsstat, reset the NFS counters to 0 before you begin the experiment. To reset the counters to 0, enter:
See nfsstat(8) for more information.
2.4.4 Gathering Information by Using the tcpdump Utility
The tcpdump utility monitors and displays packet headers on a network interface. You can specify the interface on which to listen, the direction of the packet transfer, or the type of protocol traffic to display. The tcpdump command allows you to monitor the network traffic associated with a particular network service and to identify the source of a packet. It lets you determine whether requests are being received or acknowledged, or, in the case of slow network performance, to determine the source of network requests.
Your kernel must be configured with the packetfilter option to use the command. For example:
#pfconfig +p +c tu0
The netstat -ni command displays the configured network interfaces. For example:
#netstat -ni
Name  Mtu   Network       Address            Ipkts       Ierrs  Opkts       Oerrs  Coll
tu0   1500  <Link>        00:00:f8:22:f8:05  486404139   0      10939748    632583736
tu0   1500  16.140.48/24  16.140.48.156      486404139   0      10939748    632583736
tu0   1500  DLI           none               486404139   0      10939748    632583736
sl0*  296   <Link>                           0           0      0           0      0
lo0   4096  <Link>                           1001631086  0      1001631086  0      0
lo0   4096  127/8         127.0.0.1          1001631086  0      1001631086  0      0
Use the netstat command output to determine which interface to use with the tcpdump command. For example:
#tcpdump -mvi tu0 -Nts1500
tcpdump: listening on tu0
Using kernel BPF filter
k-1.fc77a110 > foo.pmap-v2: 56 call getport prog "nfs" V3 prot UDP port 0 \
  (ttl 30, id 20054)
foo.fc77a110 > k-1.pmap-v2: 28 reply getport 2049 (ttl 30, id 36169)
k-1.fd77a110 > foo.pmap-v2: 56 call getport prog "mount" V3 prot UDP port 0 \
  (ttl 30, id 20057)
foo.fd77a110 > k-1.pmap-v2: 28 reply getport 1030 (ttl 30, id 36170)
k-1.fe77a110 > foo.mount-v3: 112 call mount "/pns2" (ttl 30, id 20062)
foo.fe77a110 > k-1.mount-v3: 68 reply mount OSF/1 fh 19,17/1.4154.1027680688/4154.\
  1027680688 (DF) (ttl 30, id 36171)
k-1.b81097eb > fubar.nfs-v3: 136 call fsinfo OSF/1 fh 19,17/1.4154.1027680688/4154.\
  1027680688 (ttl 30, id 20067)
The -s snaplen option displays snaplen bytes of data from each packet rather than the default of 68. The default is adequate for IP, ICMP, TCP, and UDP, but 500 to 1500 bytes is recommended to obtain adequate results for NFS and RPC.
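For example, a hedged sketch of capturing NFS traffic with a larger snapshot length (the interface name is hypothetical; NFS normally uses port 2049):
#tcpdump -i tu0 -s 1500 udp port 2049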
See tcpdump(8) and packetfilter(7) for more information.
2.4.5 Monitoring Network Statistics by Using the netstat Command
To check network statistics, use the netstat command. Some problems to look for are:
If the netstat -i command shows excessive amounts of input errors (Ierrs), output errors (Oerrs), or collisions (Coll), this may indicate a network problem; for example, cables are not connected properly or the Ethernet is saturated (see Section 2.4.5.1).
Use the netstat -is command to check for network device driver errors (see Section 2.4.5.2).
Use the netstat -m command to determine if the network is using an excessive amount of memory in proportion to the total amount of memory installed in the system. If the netstat -m command shows several requests for memory delayed or denied, either physical memory was temporarily depleted or the kernel malloc free lists were empty (see Section 2.4.5.3).
Each socket results in a network connection. If the system allocates an excessive number of sockets, use the netstat -an command to determine the state of your existing network connections (see Section 2.4.5.4). For Internet servers, the majority of connections usually are in a TIME_WAIT state.
Use the netstat -p ip command to check for bad checksums, length problems, excessive redirects, and packets lost because of resource problems (see Section 2.4.5.5).
Use the netstat -p tcp command to check for retransmissions, out-of-order packets, and bad checksums (see Section 2.4.5.6).
Use the netstat -p udp command to check for bad checksums and full sockets (see Section 2.4.5.6).
Use the netstat -rs command to obtain routing statistics (see Section 2.4.5.7).
Use the netstat -s command to display statistics related to the IP, ICMP, IGMP, TCP, and UDP protocol layers (see Section 2.4.5.8).
Most of the information provided by netstat is used to diagnose network hardware or software failures, not to identify tuning opportunities. See the Network Administration: Connections manual for more information on how to diagnose failures.
See netstat(1) for more information.
2.4.5.1 Input and Output Errors and Collisions
Network collisions are a normal part of network operations. A collision can occur when two or more Ethernet stations attempt to transmit simultaneously on the network. If a station is unable to access the network because another one is already using it, the station will stop trying to access the network for a short period of time before attempting to access the network again. A collision occurs each time a station fails to access the network. Most network interface cards (NICs) will attempt to transmit a maximum of 15 times, after which they will drop the output packet and issue an excessive collisions error.
Use the output of the netstat -i command to check for input errors (Ierrs), output errors (Oerrs), and collisions (Coll). Compare the values in these fields with the total number of packets sent. High values may indicate a network problem; for example, cables may not be connected properly or the Ethernet may be saturated. A collision rate of up to 10 percent may not indicate a problem on a busy Ethernet. However, a collision rate of more than 20 percent could indicate a problem. For example:
#netstat -i
Name  Mtu   Network  Address            Ipkts    Ierrs  Opkts    Oerrs  Coll
tu0   1500  Link     00:00:aa:11:0a:c1  0        0      43427    43427  0
tu0   1500  DLI      none               0        0      43427    43427  0
tu1   1500  Link     bb:00:03:01:6c:4d  963447   138    902543   1118   80006
tu1   1500  DLI      none               963447   138    902543   1118   80006
tu1   1500  o-net    plume              963447   138    902543   1118   80006
.
.
.
2.4.5.2 Network Device Driver Errors
Use the netstat -is command to check for network device driver errors. For example:
#netstat -is
tu0 Ethernet counters at Tue Aug 3 13:57:35 2002
      191 seconds since last zeroed
 14624204 bytes received
  4749029 bytes sent
    34784 data blocks received
    11017 data blocks sent
  2197154 multicast bytes received
    17086 multicast blocks received
     1894 multicast bytes sent
       17 multicast blocks sent
      932 blocks sent, initially deferred
      347 blocks sent, single collision
      666 blocks sent, multiple collisions
        0 send failures
        0 collision detect check failure
        1 receive failures, reasons include:
            Frame too long
        0 unrecognized frame destination
        0 data overruns
        0 system buffer unavailable
        0 user buffer unavailable
The previous example shows that the system sent 11,017 blocks. Of those blocks, 1,013 (347 + 666) blocks had collisions, which represents approximately 10 percent of the blocks sent. A collision rate of up to 10 percent may not indicate a problem on a busy Ethernet. However, a collision rate of more than 20 percent could indicate a problem.
In addition, the following fields should be 0 or a low single-digit number:
send failures
receive failures
data overruns
system buffer unavailable
user buffer unavailable
2.4.5.3 Network Memory Usage
The netstat -m command shows statistics for network-related memory structures, including the memory that is being used for mbuf clusters. Use this command to determine if the network is using an excessive amount of memory in proportion to the total amount of memory installed in the system.
If the netstat -m command shows several requests for memory (mbuf) clusters delayed or denied, your system was temporarily short of physical memory. The following example is from a firewall server with 128 MB of memory that does not have mbuf cluster compression enabled:
#netstat -m
 2521 Kbytes for small data mbufs (peak usage 9462 Kbytes)
78262 Kbytes for mbuf clusters (peak usage 97924 Kbytes)
 8730 Kbytes for sockets (peak usage 14120 Kbytes)
 9202 Kbytes for protocol control blocks (peak usage 14551 Kbytes)
    2 Kbytes for routing table (peak usage 2 Kbytes)
    2 Kbytes for socket names (peak usage 4 Kbytes)
    4 Kbytes for packet headers (peak usage 32 Kbytes)
39773 requests for mbufs denied
    0 calls to protocol drain routines
98727 Kbytes allocated to network
The previous example shows that 39,773 requests for memory were denied. This indicates a problem, because this value should be 0. The example also shows that 78 MB of memory has been assigned to mbuf clusters, and that 98 MB of memory is being consumed by the network subsystem.
If you increase the value of the socket subsystem attribute sbcompress_threshold to 600, the memory allocated to the network subsystem immediately decreases to 18 MB, because compression at the kernel socket buffer interface results in a more efficient use of memory. See Section 6.2.3.3 for more information on the sbcompress_threshold attribute.
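For example, you can change this attribute at run time with the sysconfig command (a sketch; to make the change permanent, also record it in /etc/sysconfigtab):
#sysconfig -r socket sbcompress_threshold=600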
2.4.5.4 Socket Connections
Each socket results in a network connection. If the system allocates an excessive number of sockets, use the netstat -an command to determine the state of your existing network connections. The following example shows the contents of the protocol control block table and the number of TCP connections currently in each state:
#netstat -an | grep tcp | awk '{print $6}' | sort | uniq -c
    1 CLOSE_WAIT
   58 ESTABLISHED
   12 FIN_WAIT_1
    8 FIN_WAIT_2
   17 LISTEN
    1 SYN_RCVD
15749 TIME_WAIT
#
For Internet servers, the majority of connections usually are in a TIME_WAIT state. If the number of entries in the FIN_WAIT_1 and FIN_WAIT_2 fields represents a large percentage of the total connections (add together all of the fields), you may want to enable keepalive (see Section 6.3.2.5). If the number of entries in the SYN_RCVD field represents a large percentage of the total connections, the server may be overloaded or experiencing TCP SYN attacks.
Note that in this example there are almost 16,000 sockets being used, which requires 16 MB of memory. See Section 6.1.2 for more information on configuring memory and swap space.
2.4.5.5 Dropped or Lost Packets
Use the netstat -p ip command to check for bad checksums, length problems, excessive redirects, and packets lost because of resource problems for the IP protocol. Check the output for a nonzero number in the lost packets due to resource problems field. For example:
#netstat -p ip
ip:
    259201001 total packets received
    0 bad header checksums
    0 with size smaller than minimum
    0 with data size < data length
    0 with header length < data size
    0 with data length < header length
    25794050 fragments received
    0 fragments dropped (duplicate or out of space)
    802 fragments dropped after timeout
    0 packets forwarded
    67381376 packets not forwardable
    67381376 link-level broadcasts
    0 packets denied access
    0 redirects sent
    0 packets with unknown or unsupported protocol
    170988694 packets consumed here
    160039654 total packets generated here
    0 lost packets due to resource problems
    4964271 total packets reassembled ok
    2678389 output packets fragmented ok
    14229303 output fragments created
    0 packets with special flags set
Use the netstat -id command to monitor dropped output packets. Examine the output for a nonzero value in the Drop column. If a nonzero value appears in the Drop column for an interface, you may want to increase the value of the ifqmaxlen kernel variable to prevent dropped packets. See Section 6.3.2.9 for more information on this attribute.
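For example, assuming the attribute is exposed through the net subsystem on your version of the operating system, you can query the current value before changing it:
#sysconfig -q net ifqmaxlen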
The following example shows 4,221 dropped output packets on the tu1 network interface:
#netstat -id
Name  Mtu   Network  Address            Ipkts    Ierrs  Opkts    Oerrs  Coll   Drop
tu0   1500  Link     00:00:f8:06:0a:b1  0        0      98129    98129  0      0
tu0   1500  DLI      none               0        0      98129    98129  0      0
tu1   1500  Link     aa:00:04:00:6a:4e  892390   785    814280   68031  93848  4221
tu1   1500  DLI      none               892390   785    814280   68031  93848  4221
tu1   1500  orange   flume              892390   785    814280   68031  93848  4221
.
.
.
The output of the previous command shows that the Opkts and Oerrs fields have the same values for the tu0 interface, which indicates that the Ethernet cable is not connected. In addition, the value of the Oerrs field for the tu1 interface is 68,031, which is a high error rate. Use the netstat -is command to obtain detailed error information.
2.4.5.6 Retransmissions, Out-of-Order Packets, and Bad Checksums
Use the netstat -p tcp command to check for retransmissions, out-of-order packets, and bad checksums for the TCP protocol. Use the netstat -p udp command to look for bad checksums and full sockets for the UDP protocol. You can use the output of these commands to identify network performance problems by comparing the values in some fields to the total number of packets sent or received.
For example, an acceptable percentage of retransmitted packets or duplicate acknowledgments is 2 percent or less. An acceptable percentage of bad checksums is 1 percent or less.
In addition, a large number of entries in the embryonic connections dropped field may indicate that the listen queue is too small or that server performance is slow and clients have canceled requests. Other important fields to examine include the completely duplicate packets, out-of-order packets, and discarded fields. For example:
#netstat -p tcp
tcp:
    66776579 packets sent
        58018945 data packets (1773864027 bytes)
        54447 data packets (132256902 bytes) retransmitted
        5079202 ack-only packets (3354381 delayed)
        29 URG only packets
        7266 window probe packets
        2322828 window update packets
        1294022 control packets
    40166265 packets received
        29455895 acks (for 1767211650 bytes)
        719524 duplicate acks
        0 acks for unsent data
        19788741 packets (2952573297 bytes) received in-sequence
        123726 completely duplicate packets (9224858 bytes)
        2181 packets with some dup. data (67344 bytes duped)
        472000 out-of-order packets (85613803 bytes)
        1478 packets (926739 bytes) of data after window
        43 window probes
        201331 window update packets
        1373 packets received after close
        118 discarded for bad checksums
        0 discarded for bad header offset fields
        0 discarded because packet too short
    448388 connection requests
    431873 connection accepts
    765040 connections established (including accepts)
    896693 connections closed (including 14570 drops)
    86298 embryonic connections dropped
    25467050 segments updated rtt (of 25608120 attempts)
    106020 retransmit timeouts
        145 connections dropped by rexmit timeout
    6329 persist timeouts
    37653 keepalive timeouts
        15536 keepalive probes sent
        16874 connections dropped by keepalive
The output of the previous command shows that, out of the 58,018,945 data packets that were sent, 54,447 packets were retransmitted, which is a percentage that is within the acceptable limit of 2 percent.
In addition, the command output shows that, out of the 29,455,895 acknowledgments, 719,524 were duplicates, which is a percentage that is slightly larger than the acceptable limit of 2 percent.
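A hedged sketch of computing the retransmission percentage directly from the command output (the awk patterns assume the output format shown above):
#netstat -p tcp | awk '/retransmitted/ {r=$1} /data packets \(/ && !/retransmitted/ {s=$1} END {printf "%.2f%% retransmitted\n", r/s*100}'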
Important fields for the netstat -p udp command include the incomplete headers, bad data length fields, bad checksums, and full sockets fields, which should have low values. The no port field specifies the number of packets that arrived destined for a nonexistent port (for example, rwhod or routed broadcast packets) and were subsequently discarded. A large value for this field is normal and does not indicate a performance problem. For example:
#netstat -p udp
udp:
    144965408 packets sent
    217573986 packets received
    0 incomplete headers
    0 bad data length fields
    0 bad checksums
    5359 full sockets
    28001087 for no port (27996512 broadcasts, 0 multicasts)
    0 input packets missed pcb cache
The previous example shows a value of 5,359 in the full sockets field, which indicates that the UDP socket buffer may be too small.
2.4.5.7 Routing Statistics
Use the netstat -rs command to obtain routing statistics. The value of the bad routing redirects field should be small. A large value may indicate a serious network problem. For example:
#netstat -rs
routing:
    0 bad routing redirects
    0 dynamically created routes
    0 new gateways due to redirects
    1082 destinations found unreachable
    0 uses of a wildcard route
2.4.5.8 Displaying All Protocol Statistics
Use the netstat -s command to simultaneously display statistics related to the IP, ICMP, IGMP, TCP, and UDP protocol layers. For example:
#netstat -s
ip:
    377583120 total packets received
    0 bad header checksums
    7 with size smaller than minimum
    0 with data size < data length
    0 with header length < data size
    0 with data length < header length
    12975385 fragments received
    0 fragments dropped (dup or out of space)
    3997 fragments dropped after timeout
    523667 packets forwarded
    108432573 packets not forwardable
    0 packets denied access
    0 redirects sent
    0 packets with unknown or unsupported protocol
    259208056 packets consumed here
    213176626 total packets generated here
    581 lost packets due to resource problems
    3556589 total packets reassembled ok
    4484231 output packets fragmented ok
    18923658 output fragments created
    0 packets with special flags set
icmp:
    4575 calls to icmp_error
    0 errors not generated because old ip message was too short
    0 errors not generated because old message was icmp
    Output histogram:
        echo reply: 586585
        destination unreachable: 4575
        time stamp reply: 1
    0 messages with bad code fields
    0 messages < minimum length
    0 bad checksums
    0 messages with bad length
    Input histogram:
        echo reply: 612979
        destination unreachable: 147286
        source quench: 10
        echo: 586585
        router advertisement: 91
        time exceeded: 231
        time stamp: 1
        time stamp reply: 1
        address mask request: 7
    586586 message responses generated
igmp:
    0 messages received
    0 messages received with too few bytes
    0 messages received with bad checksum
    0 membership queries received
    0 membership queries received with invalid field(s)
    0 membership reports received
    0 membership reports received with invalid field(s)
    0 membership reports received for groups to which we belong
    0 membership reports sent
tcp:
    66818923 packets sent
        58058082 data packets (1804507309 bytes)
        54448 data packets (132259102 bytes) retransmitted
        5081656 ack-only packets (3356297 delayed)
        29 URG only packets
        7271 window probe packets
        2323163 window update packets
        1294434 control packets
    40195436 packets received
        29477231 acks (for 1797854515 bytes)
        719829 duplicate acks
        0 acks for unsent data
        19803825 packets (2954660057 bytes) received in-sequence
        123763 completely duplicate packets (9225546 bytes)
        2181 packets with some dup. data (67344 bytes duped)
        472188 out-of-order packets (85660891 bytes)
        1479 packets (926739 bytes) of data after window
        43 window probes
        201512 window update packets
        1377 packets received after close
        118 discarded for bad checksums
        0 discarded for bad header offset fields
        0 discarded because packet too short
    448558 connection requests
    431981 connection accepts
    765275 connections established (including accepts)
    896982 connections closed (including 14571 drops)
    86330 embryonic connections dropped
    25482179 segments updated rtt (of 25623298 attempts)
    106040 retransmit timeouts
        145 connections dropped by rexmit timeout
    6329 persist timeouts
    37659 keepalive timeouts
        15537 keepalive probes sent
        16876 connections dropped by keepalive
udp:
    145045792 packets sent
    217665429 packets received
    0 incomplete headers
    0 bad data length fields
    0 bad checksums
    5359 full sockets
    28004209 for no port (27999634 broadcasts, 0 multicasts)
    0 input packets missed pcb cache
2.4.6 Gathering NFS Server Side Information Using ps axlmp
On an NFS server system, the nfsd daemon spawns several I/O threads to service I/O requests from clients. A sufficient number of threads must be configured to handle the number of concurrent requests typical for the server. The default configuration of eight UDP and eight TCP threads is enough for a workstation exporting a small number of directories to a handful of clients. For a heavily used NFS server, up to 128 server threads can be configured and distributed over TCP and UDP. Monitor the NFS server threads on your server to determine if more threads are required to service the NFS load.
To display idle I/O threads on a server system, enter:
#/usr/ucb/ps axlmp 0 | grep -v grep | grep -c nfs_udp
#/usr/ucb/ps axlmp 0 | grep -v grep | grep -c nfs_tcp
These commands display a count of the number of sleeping and idle UDP and TCP threads, respectively. If the number of sleeping and idle threads is zero or consistently low, you might improve NFS performance by increasing the number of threads.
See Section 5.4.1 or nfsd(8) for more information.
2.4.7 Gathering NFS Client Side Information Using nfsiod
On an NFS client system, the nfsiod daemon spawns several I/O threads to service asynchronous I/O requests to the server. The I/O threads improve the performance of both NFS reads and writes. The optimum number of I/O threads depends on many variables, such as how quickly the client will be writing, how many files will be accessed simultaneously, and the characteristics of the NFS server. For small clients, the default of seven threads is sufficient. For larger clients with heavy NFS activity, more client threads may be necessary. Monitor the NFS client threads to determine if more threads are required to service the NFS load.
To display idle I/O threads on a client system, enter:
#ps axlm | grep -v grep | grep -c nfsiod
This command displays a count of the number of sleeping and idle I/O threads. If the number of sleeping and idle threads is often zero or consistently low, you might improve NFS performance by increasing the number of threads.
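For example, on many systems nfsiod takes the desired number of I/O threads as its argument; a hedged sketch (verify the syntax in nfsiod(8)):
#nfsiod 14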
See Section 5.4.2 or nfsiod(8) for more information.
2.4.8 Monitoring Incoming Network Traffic to an NFS Server by Using the nfswatch Command
The nfswatch program monitors all incoming network traffic to an NFS file server and divides it into several categories. The number and percentage of packets received in each category is displayed on the screen in a continuously updated display. The nfswatch command can usually be run without options and will produce useful results. For example:
#/usr/sbin/nfswatch
Interval packets: 628 (network)  626 (to host)   0 (dropped)
Total packets:   6309 (network) 6307 (to host)   0 (dropped)
Monitoring packets from interface ee0
                   int  pct  total                      int  pct  total
ND Read              0   0%      0  TCP Packets           0   0%      0
ND Write             0   0%      0  UDP Packets         139  10%    443
NFS Read             2   0%      2  ICMP Packets          1   0%      1
NFS Write            0   0%      0  Routing Control      81   6%    204
NFS Mount            2   0%      3  Address Resolution  109   8%    280
YP/NIS/NIS+          0   0%      0  Reverse Addr Resol   15   1%     43
RPC Authorization    7   1%     12  Ethernet/FDDI Bdcst 406  30%   1152
Other RPC Packets    4   0%     10  Other Packets      1087  80%   3352
18 NFS Procedures [10 not displayed]                              more->
Procedure  int  pct  total  completed  avg(msec)  std dev  max resp
CREATE       0   0%      0
GETATTR      1  50%      1          1       3.90            3.90
GETROOT      0   0%      0
LINK         0   0%      0
LOOKUP       0   0%      0
MKDIR        0   0%      0
NULLPROC     0   0%      0
READ         0   0%      0
Note
The nfswatch command monitors and displays data only for file systems mounted with NFS Version 2.0.
Your kernel must be configured with the packetfilter option. After kernel configuration, any user can invoke nfswatch once the superuser has enabled promiscuous-mode operation by using the pfconfig command.
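For example, a hedged sketch, assuming the ee0 interface shown in the previous display:
#pfconfig +p ee0
#/usr/sbin/nfswatch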
See packetfilter(7) for more information.
2.5 Additional Tools for Monitoring Performance
You may want to set up a routine to continuously monitor system performance. Some monitoring tools will alert you when serious problems occur (for example, by sending mail). It is important that you choose a monitoring tool that has low overhead to obtain accurate performance information.
Table 2-1 describes the tools that you can use to continuously monitor performance.
Table 2-1: Tools for Continuous Performance Monitoring

Performance Visualizer
Graphically displays the performance of all significant components of a parallel system. Using Performance Visualizer, you can monitor the performance of all the member systems in a cluster. It monitors the performance of several systems simultaneously, allows you to see the impact of a parallel application on all the systems, and helps you ensure that the application is balanced across all systems. When problems are identified, you can change the application code and use Performance Visualizer to evaluate the impact of these changes. It also helps you identify overloaded systems, underutilized resources, active users, and busy processes. You can choose to look at all of the hosts in a parallel system or at individual hosts. Performance Visualizer is a Tru64 UNIX layered product and requires a license. See the Performance Visualizer documentation for more information.

collect
Collects a variety of performance data on a running system and either displays the information or saves it to a binary file. The collect utility is described in Section 2.3.2.

top
Provides continuous reports on the state of the system, including a list of the processes using the most CPU resources.

xload
Displays the system load average in a histogram that is periodically updated.

volstat
Provides information about activity on volumes, plexes, subdisks, and disks under LSM control. See volstat(8) for more information.

volwatch
Monitors LSM for failures in disks, volumes, and plexes, and sends mail if a failure occurs. See Section 9.3 for more information.
2.6 Gathering Profiling and Debugging Information
You can use profiling to identify sections of application code that consume large portions of execution time, and you can use the profiling and debugging tools on the kernel as well. To improve performance, concentrate on improving the coding efficiency of those time-intensive sections.
Table 2-2 describes the commands you can use to obtain information about applications. Detailed information about these tools is located in the Programmer's Guide and the Kernel Debugging manual.
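For example, a minimal sketch of a classic profiling workflow (the file names are hypothetical): compile with profiling enabled, run the program to produce a mon.out file, and then display the profile:
#cc -p -o myapp myapp.c
#./myapp
#prof myapp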
In addition, see prof_intro(1) for an introduction to the profiling tools.
Table 2-2: Application Profiling and Debugging Tools
atom
Profiles applications. Consists of a set of prepackaged tools (including hiprof, pixie, and third) that instrument an application for profiling and analysis.

third (Third Degree)
Checks memory access and detects memory leaks in applications. Performs memory access checks and memory leak detection of C and C++ programs at run time, by using the atom toolkit.

hiprof
Produces a profile of procedure execution times in an application. An atom tool that produces a profile of procedure execution times; the resulting data can be displayed with the gprof command.

pixie
Profiles basic blocks in an application. Produces a profile showing the number of times each instruction was executed in a program. The information can be reported as tables or can be used to automatically direct later optimizations.

prof
Analyzes profiling data. Produces statistics showing which portions of code consume the most time and where the time is spent (for example, at the routine level, the basic block level, or the instruction level).

gprof
Analyzes profiling data and displays procedure call information and statistical program counter sampling in an application. Allows you to determine which routines are called most frequently, and the source of the routine call, by gathering procedure call information and performing statistical program counter (PC) sampling.

uprofile
Profiles user code in an application. Profiles user code by using the performance counters in the Alpha chip.

kprofile
Produces a program counter profile of a running kernel. Profiles a running kernel by using the performance counters on the Alpha chip. You analyze the performance data collected by the tool with the prof command. See kprofile(1) for more information.

Visual Threads
Identifies bottlenecks and performance problems in multithreaded applications. Enables you to analyze and refine your multithreaded applications. You can use Visual Threads to identify bottlenecks and performance problems, and to debug potential thread-related logic problems. Visual Threads uses rule-based analysis, statistics capabilities, and visualization techniques. Visual Threads is licensed as part of the Developers' Toolkit for Tru64 UNIX.

dbx
Debugs running kernels, programs, and crash dumps, and examines and temporarily modifies kernel variables. Provides source-level debugging for C, Fortran, Pascal, assembly language, and machine code.

kdbx
Debugs running kernels and crash dumps. Allows you to examine a running kernel or a crash dump. You can also use extensions to check resource usage (for example, CPU usage).

ladebug
Debugs kernels and applications. Debugs programs and the kernel and helps locate run-time programming errors.

lsof
Displays open files. Displays information about files that are currently opened by the running processes.