Managing the Auditing Subsystem

This section discusses how to manage the auditing system. Management tasks include the following:

Enabling and disabling startup of the audit server process
Changing the point in startup when the operating system initiates auditing
Choosing the number of outstanding messages that trigger process suspension
Choosing the audit server response to memory exhaustion
Maintaining the accuracy of message time-stamping
Adjusting the transfer of messages from system auditing buffers to disk
Choosing the amount of disk space periodically allocated to the system audit log

Tasks Performed by the Audit Server

The operating system creates the audit server as a detached process during system startup to perform the following tasks:

Create a clusterwide security audit log file (SECURITY.AUDIT$JOURNAL) in SYS$COMMON:[SYSMGR]
Control the logging of security events to the log file and the delivery of alarms to any operator terminals enabled to receive security class messages
Enable auditing of a site-defined set of security events
Monitor disk and memory resources
Maintain a database of security-auditing characteristics

The audit server sends informational and error messages to the operator communication manager (OPCOM). OPCOM broadcasts these messages to operator terminals and writes the messages to the operator log file.

Example 9-9 “Default Characteristics of the Audit Server” displays the audit server's initial operating values. These settings are stored in the audit server database, VMS$AUDIT_SERVER.DAT in SYS$COMMON:[SYSMGR]. Any time you modify security-auditing characteristics by using the DCL command SET AUDIT, the audit server database is updated. Each time the system is rebooted, it takes the auditing values from this database.

Example 9-9 Default Characteristics of the Audit Server

$ SHOW AUDIT/ALL

List of audit journals:
  Journal name:           SECURITY
  Journal owner:          (system audit journal)
  Destination:            SYS$COMMON:[SYSMGR]SECURITY.AUDIT$JOURNAL
  Monitoring:             enabled
    Warning thresholds,   Block count:    100   Duration:  2 00:00:00.0
    Action thresholds,    Block count:     25   Duration:  0 00:30:00.0
 
Security auditing server characteristics:
  Database version:       4.4
  Backlog (total):        100, 200, 300
  Backlog (process):      5, 2
  Server processing intervals:
    Archive flush:        0 00:01:00.00
    Journal flush:        0 00:05:00.00
    Resource scan:        0 00:05:00.00
  Final resource action:  purge oldest audit events
 
Security archiving information:
  Archiving events:       none
  Archive destination:
 
System security alarms currently enabled for:
  ACL
  Authorization
  Breakin:     dialup,local,remote,network,detached
  Logfailure:  batch,dialup,local,remote,network,subprocess,detached,server
 
System security audits currently enabled for:
  ACL
  Authorization
  Breakin:     dialup,local,remote,network,detached
  Logfailure:  batch,dialup,local,remote,network,subprocess,detached,server

Disabling and Reenabling Startup of the Audit Server

By default, the operating system starts the audit server process and OPCOM.

If the physical memory or disk storage space on your system is especially limited and logging of security-related events is not important, you can remove the audit server and OPCOM processes from the system startup procedure. Before you do so, be aware that cluster object support requires the audit server (see Chapter 11 “Securing a Cluster”). The following example shows how you would remove these processes with the System Management utility (SYSMAN):

$ SET PROCESS/PRIVILEGES=(OPER,BYPASS)
$ RUN SYS$SYSTEM:SYSMAN
SYSMAN> STARTUP SET DATABASE STARTUP$STARTUP_VMS
SYSMAN> STARTUP DISABLE FILE VMS$CONFIG-050_OPCOM.COM/NODE=*
SYSMAN> STARTUP DISABLE FILE VMS$CONFIG-050_AUDIT_SERVER.COM /NODE=*
SYSMAN> EXIT

$ SET PROCESS/PRIVILEGES=(NOOPER,NOBYPASS)

To delete the audit server process and shut down security auditing on the system, enter the following commands on each node in the cluster:

$ SET AUDIT/ALARM/AUDIT/DISABLE=ALL/CLASS=*
$ SET AUDIT/SERVER=EXIT

You can restart security auditing and OPCOM on the system by executing the following DCL command lines:

$ @SYS$SYSTEM:STARTUP OPCOM
$ @SYS$SYSTEM:STARTUP AUDIT_SERVER

To start OPCOM and the audit server process on all subsequent system boots, reverse your previous edits of the system startup procedure. Use the following SYSMAN commands:

$ SET PROCESS/PRIVILEGES=(OPER,BYPASS)
$ RUN SYS$SYSTEM:SYSMAN
SYSMAN> STARTUP SET DATABASE STARTUP$STARTUP_VMS
SYSMAN> STARTUP ENABLE FILE VMS$CONFIG-050_OPCOM.COM/NODE=*
SYSMAN> STARTUP ENABLE FILE VMS$CONFIG-050_AUDIT_SERVER.COM -
_SYSMAN> /NODE=*
 
SYSMAN> EXIT
 
$ SET PROCESS/PRIVILEGES=(NOOPER,NOBYPASS)

See the HP OpenVMS System Management Utilities Reference Manual for more information about SYSMAN.

Changing the Point in Startup When the Operating System Initiates Auditing

Ordinarily, the operating system starts sending audit-event messages just before SYSTARTUP_VMS.COM executes. However, a site that is not interested in receiving audit-event messages during startup can alter this behavior by redefining the logical name SYS$AUDIT_SERVER_INHIBIT.

To change the point where the operating system begins to deliver security event messages, add the following line to the SYS$MANAGER:SYLOGICALS.COM command procedure:

$ !
$ DEFINE /SYSTEM /EXECUTIVE SYS$AUDIT_SERVER_INHIBIT yes
$ !

A system manager can choose another phase of system startup to initiate auditing, perhaps at the end of SYSTARTUP_VMS. However, be sure to initiate auditing before allowing any general logins to the system (that is, before any SET LOGINS/INTERACTIVE command). To initiate delivery of auditing messages, add the following line to the appropriate command file:

$ !
$ SET AUDIT/SERVER=INITIATE
$ !

Choosing the Number of Outstanding Messages That Trigger Process Suspension

Unless the audit server controls the influx of messages, it is possible under some conditions to run out of memory. A very slow I/O device, a disk space problem, or even a sudden onslaught of messages can exceed the server's ability to write messages to disk. To prevent memory exhaustion, the audit server constantly monitors the total number of outstanding messages and tallies the number of messages contributed by each active process. If the server receives more events than it can log to disk, it begins applying flow control to those processes generating audit events.

Controlling Message Flow

Message volume is controlled on a per-process basis. Table 9-7 “Controlling the Flow of Audit Event Messages” shows the three stages of flow control.

Table 9-7 Controlling the Flow of Audit Event Messages

Control Stages	Total Message Backlog (Default)	Process Backlog Limit (Default)
1	100	5
2	200	2
3	300	None

When there are 100 messages in memory, the operating system suspends any process that has five or more outstanding messages. Once a process has all its messages written to the log file, it can resume processing.
When there are 200 messages in memory, the operating system suspends any process that has two or more outstanding messages until all messages are written to disk.
When there are 300 messages in memory, any process with messages in memory is suspended until all messages are written to disk.
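The three stages above can be sketched as a small decision function. This is a toy model, not the audit server's implementation: the function name, the dictionary of per-process counts, and the `exempt` parameter are hypothetical, and it assumes that a stage takes effect as soon as the total reaches its threshold and that messages from exempt processes still count toward the total.

```python
from typing import Dict, FrozenSet, List, Sequence


def processes_to_suspend(
    outstanding: Dict[str, int],
    total_thresholds: Sequence[int] = (100, 200, 300),  # defaults from Table 9-7
    process_limits: Sequence[int] = (5, 2),              # defaults from Table 9-7
    exempt: FrozenSet[str] = frozenset(),
) -> List[str]:
    """Return the processes that three-stage flow control would suspend.

    outstanding maps a (hypothetical) process name to its number of
    messages still held in memory.
    """
    total = sum(outstanding.values())
    stage1, stage2, stage3 = total_thresholds

    if total >= stage3:
        limit = 1                  # stage 3: any process with a message in memory
    elif total >= stage2:
        limit = process_limits[1]  # stage 2: two or more messages by default
    elif total >= stage1:
        limit = process_limits[0]  # stage 1: five or more messages by default
    else:
        return []                  # below stage 1: no flow control

    return sorted(
        name
        for name, count in outstanding.items()
        if count >= limit and name not in exempt
    )
```

With the default values, a backlog of `{"A": 60, "B": 36, "C": 4}` totals 100 messages, so only A and B (five or more messages each) would be suspended. Once the total reaches 200, C would be suspended as well if it held two or more messages.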

You can establish site-specific values for controlling messages by using the /BACKLOG qualifier to the SET AUDIT command. For example, the following command raises the action thresholds so that the operating system starts controlling the influx of messages when it has 125 unprocessed messages in its queue and a contributing process has eight messages outstanding:

$ SET AUDIT/BACKLOG=(TOTAL=(125,250,350),PROCESS=(8,4))

Preventing Process Suspension

Naturally, the operating system never suspends certain critical processes. Realtime processes and any of the following processes are exempt:

CACHE_SERVER	CLUSTER_SERVER
CONFIGURE	DFS$COM_ACP
DNS$ADVER	IPCACP
JOB_CONTROL	NETACP
NET$ACP	OPCOM
REMACP	SHADOW_SERVER
SMISERVER	SWAPPER
TP_SERVER	VWS$DISPLAYMGR
VWS$EMULATORS

You can prevent the suspension of a process by adding its process identifier (PID) to the process exclusion list. Use the following form of the SET AUDIT command:

SET AUDIT/EXCLUDE=process-id

Be aware that processes (PIDs) are not automatically removed from the process exclusion list when processes log out of the system. To remove a process from the exclusion list, use the SET AUDIT/NOEXCLUDE command. Processes excluded by the operating system cannot be removed.
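Because PIDs stay on the exclusion list after their processes log out, a site may want to review the list periodically. The helper below is a minimal sketch of that housekeeping step. It assumes you have already collected two sets of PID strings by some other means (the PIDs on the exclusion list and the PIDs of processes that still exist); the function name and the sample PIDs are hypothetical.

```python
from typing import Iterable, List


def stale_exclusions(excluded_pids: Iterable[str], live_pids: Iterable[str]) -> List[str]:
    """Return excluded PIDs that no longer belong to a live process.

    PIDs are compared case-insensitively as hexadecimal strings. Each PID
    returned is a candidate for removal with SET AUDIT/NOEXCLUDE.
    """
    live = {pid.upper() for pid in live_pids}
    return sorted(pid.upper() for pid in set(excluded_pids) if pid.upper() not in live)


# Hypothetical data: one excluded process has since logged out.
excluded = ["2040011A", "2040012B"]
live = ["2040011a", "20400130"]
leftover = stale_exclusions(excluded, live)  # ["2040012B"]
```

Processes that the operating system itself exempts do not appear in this comparison, because they cannot be removed from the list.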

Reacting to Insufficient Memory

When processes on the exclusion list (see “Preventing Process Suspension”) produce so many audit messages that the audit server runs out of memory, the default behavior of the audit server is to remove old event messages until memory is available. It saves the most current messages.

The audit server has other alternatives when it encounters memory limitations:

Option	Description
Crash	Crash the system if the audit server runs out of memory.
Ignore_New	Ignore new event messages until memory is available. New event messages are lost but event messages in memory are saved.
Purge_Old (default)	Remove old event messages until memory is available for the most current messages.

To alter the default behavior of the audit server and instruct it to ignore all new audit messages rather than purge the old ones, enter the following command:

$ SET AUDIT/SERVER=FINAL_ACTION=IGNORE_NEW
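The difference between the three final actions can be illustrated with a bounded in-memory queue. This is a simplified model under stated assumptions: memory is represented as a fixed message capacity, and the names `enqueue`, `capacity`, and `SimulatedCrash` are hypothetical.

```python
from collections import deque


class SimulatedCrash(RuntimeError):
    """Stands in for the CRASH final action in this model."""


def enqueue(buffer: deque, message, capacity: int, final_action: str = "PURGE_OLD") -> None:
    """Add a message to the buffer, applying the final action when it is full."""
    if len(buffer) < capacity:
        buffer.append(message)
        return

    if final_action == "PURGE_OLD":
        buffer.popleft()        # drop the oldest message to make room
        buffer.append(message)  # keep the most current message
    elif final_action == "IGNORE_NEW":
        pass                    # the new message is lost; buffered ones are kept
    elif final_action == "CRASH":
        raise SimulatedCrash("audit server out of memory")
    else:
        raise ValueError(f"unknown final action: {final_action}")


# Five messages arrive, but there is room for only three.
purge, ignore = deque(), deque()
for n in range(1, 6):
    enqueue(purge, n, capacity=3, final_action="PURGE_OLD")
    enqueue(ignore, n, capacity=3, final_action="IGNORE_NEW")
# purge now holds 3, 4, 5; ignore holds 1, 2, 3
```

In this model, Purge_Old keeps the end of the event sequence and Ignore_New keeps the beginning, which is the trade-off a site weighs when choosing between them.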

The audit server runs with a fixed virtual memory limit (PGFLQUOTA) of 20,480 pages. This may be further limited by the size of page files installed on the system. You can adjust the size of page files by running AUTOGEN. Whenever it detects a page file problem, AUTOGEN automatically resets the size to alleviate the problem.

Maintaining the Accuracy of Message Time-Stamping

If you are auditing a set of security events in which the order of occurrence is important, all clocks within a cluster need to remain synchronized. This ensures that message time-stamping on all nodes in the cluster closely reflects the order in which events occurred.

Because each node in a cluster configuration maintains time independently, it is possible for cluster times to drift apart over time. To prevent drifting, use the SYSMAN command CONFIGURATION SET TIME at regular intervals. The HP OpenVMS System Management Utilities Reference Manual provides a sample command procedure that you can run every hour to maintain clock synchronization to within a second.
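The effect of clock drift on event ordering can be shown with a short simulation. The sketch below is illustrative only: the node names, event labels, and skew values are hypothetical, and `merged_order` simply sorts events by the time stamp each node would have recorded.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

Event = Tuple[datetime, str, str]  # (true time, node name, label)


def merged_order(events: List[Event], skew: Dict[str, timedelta]) -> List[str]:
    """Return event labels ordered by recorded time stamp.

    skew maps a node name to the error of its clock; nodes that are not
    listed are treated as having no error.
    """
    stamped = [
        (true_time + skew.get(node, timedelta(0)), label)
        for true_time, node, label in events
    ]
    return [label for _, label in sorted(stamped)]


t0 = datetime(2024, 1, 1, 12, 0, 0)
events = [
    (t0, "NODEA", "login on NODEA"),
    (t0 + timedelta(seconds=2), "NODEB", "file access on NODEB"),
]

# NODEB's clock runs 5 seconds slow, so its later event appears first.
drifted = merged_order(events, {"NODEB": timedelta(seconds=-5)})
# With clocks held to within a second, the true order is preserved.
synced = merged_order(events, {"NODEB": timedelta(seconds=-0.5)})
```

Events that occur closer together than the remaining clock error can still appear out of order, so the achievable synchronization bounds how finely a merged log can be ordered.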

Adjusting the Transfer of Messages to Disk

The audit server stores security event messages in memory and periodically transfers groups of messages from its buffers to the audit log file on disk. By default, the audit server transfers auditing messages every 5 minutes and archived messages (see “Using a Remote Log File”) every minute. Except in some high-security environments and in cases where an extremely large number of audit messages is being generated on the system, these defaults should be sufficient.

High-security sites can transfer event messages to disk at higher than normal rates by modifying the interval of log transfer operations. The following command, for example, changes the audit server's characteristics so it writes event messages to the audit log file every 2 minutes:

$ SET AUDIT/INTERVAL=JOURNAL_FLUSH=00:02

Frequent message transfers can impact system performance, however, because the system performs more I/O operations instead of holding messages in the system buffers associated with the audit server process.
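The trade-off can be estimated with simple arithmetic: a shorter interval means more flush operations per day but fewer messages held in memory between flushes. The sketch below assumes a steady, hypothetical event rate of 30 messages per minute; the function names are illustrative.

```python
from datetime import timedelta


def flushes_per_day(interval: timedelta) -> float:
    """Number of journal flushes in 24 hours at the given interval."""
    return timedelta(days=1) / interval


def max_buffered(messages_per_minute: float, interval: timedelta) -> float:
    """Messages accumulated in memory between two flushes at a steady rate."""
    return messages_per_minute * interval.total_seconds() / 60


rate = 30  # hypothetical audit messages per minute

default_interval = timedelta(minutes=5)  # default journal flush interval
short_interval = timedelta(minutes=2)    # interval set in the example above

print(flushes_per_day(default_interval), max_buffered(rate, default_interval))
print(flushes_per_day(short_interval), max_buffered(rate, short_interval))
```

Under these assumptions, moving from 5 minutes to 2 minutes raises the number of flushes per day from 288 to 720 and lowers the number of messages waiting in memory at any one time from 150 to 60.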

To immediately force all audit messages to the log file, enter the following command:

$ SET AUDIT/SERVER=FLUSH

Allocating Disk Space for the Audit Log File

The audit server constantly monitors the disk space allocated to the security audit log file to ensure there is adequate space for event messages. Whenever the file runs low on available blocks, the audit server extends the audit log file. If disk resource limitations prevent the server from allocating more blocks to the log file, it takes one of the following actions:

Warns you by sending warning messages to the operator terminal. This occurs by default when fewer than 100 disk blocks are available.
The following command changes the default so the warning occurs when 150 blocks are available:
$ SET AUDIT /JOURNAL=SECURITY /THRESHOLD=WARNING=150
Takes action by suspending processes that are generating audit records. (Certain processes are immune to this: see “Preventing Process Suspension”.) When resource monitoring is enabled for the log file, process suspension occurs when fewer than 25 disk blocks are available.
To modify the action threshold to 50 blocks, enter the following command:
$ SET AUDIT /JOURNAL=SECURITY /THRESHOLD=ACTION=50

The threshold values may be expressed in blocks or as a delta time. Delta time values are multiplied by the average space consumption rate to yield a number of blocks. The maximum of the block and time threshold values is used as the active threshold value.
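As a worked example of the rule above, the following sketch converts a delta time threshold into blocks and takes the larger of the two values. The consumption rate of 10 blocks per hour is hypothetical, and rounding the time-based value up to a whole block is an assumption rather than documented behavior.

```python
import math
from datetime import timedelta


def active_threshold(block_threshold: int, duration: timedelta, blocks_per_hour: float) -> int:
    """Return the effective threshold in blocks.

    The delta time is multiplied by the average consumption rate, and the
    larger of the block-based and time-based values is used.
    """
    hours = duration / timedelta(hours=1)
    time_based = math.ceil(hours * blocks_per_hour)  # rounding up is an assumption
    return max(block_threshold, time_based)


rate = 10  # hypothetical average consumption, in blocks per hour

# Default warning thresholds: 100 blocks or 2 days.
warning = active_threshold(100, timedelta(days=2), rate)     # 2 days x 10 = 480 blocks
# Default action thresholds: 25 blocks or 30 minutes.
action = active_threshold(25, timedelta(minutes=30), rate)   # 0.5 hour x 10 = 5 blocks
```

At this rate the time-based value governs the warning threshold (480 blocks), while the block count governs the action threshold (25 blocks).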

Error Handling in the Auditing Facility

Resources consumed by the OpenVMS security-auditing facility vary with the number and type of system events being recorded. Three error conditions related to the auditing facility can develop:

The audit server can run out of memory. “Reacting to Insufficient Memory” describes different methods of handling the situation.
The disk storing the audit log file can run out of space.
The network connection for a remote log file (archive file) can break.

This section discusses the default behavior of the auditing system in monitoring disk space and logging to an archive file.

Disabling Disk Monitoring

The audit server monitors the audit log file and regularly pre-extends its disk block allocation to ensure there is adequate space for incoming event messages. Whenever disk space is unavailable, the server first warns you through operator messages and then resorts to suspending certain contributing processes (see “Allocating Disk Space for the Audit Log File”). If you find many processes suspended for no apparent reason, it is probably because your audit disk is full. Once you correct the disk space problem, you can resume suspended processes with the SET AUDIT/SERVER=RESUME command (rather than wait for the next resource scan).

You can disable resource monitoring altogether by entering the following command:

$ SET AUDIT/JOURNAL=SECURITY/RESOURCE=DISABLE

However, if you disable disk resource monitoring, you eliminate the opportunity to receive warning messages until it is too late. The audit server begins to suspend processes that are generating too many audits, as “Choosing the Number of Outstanding Messages That Trigger Process Suspension” describes, and if it runs out of memory, the server takes the action described in “Reacting to Insufficient Memory”: it ignores messages, purges old messages, or, possibly, crashes the system.

Once disk space becomes available, the audit server extends the log file and resumes any processes it suspended.

Losing the Link to a Remote Log File

If you are writing auditing messages to a remote log file, as described in “Using a Remote Log File”, the link between the local and remote node can fail. Should this happen, the audit server broadcasts a warning message to all operator terminals and attempts to reestablish the link every minute until the connection is made.
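The warn-then-retry behavior described above follows a common pattern, sketched below. This is not the audit server's code: `try_connect`, `warn`, and `max_attempts` are hypothetical parameters, and whether the first attempt happens immediately or after a one-minute wait is an assumption.

```python
import time
from typing import Callable, Optional


def reestablish_link(
    try_connect: Callable[[], bool],
    warn: Callable[[str], None],
    interval_seconds: float = 60,
    sleep: Callable[[float], None] = time.sleep,
    max_attempts: Optional[int] = None,
) -> int:
    """Warn once, then retry at a fixed interval until the link is restored.

    Returns the number of attempts made. max_attempts exists only so that a
    test or demonstration can stop; None retries until the link comes back.
    """
    warn("link to remote audit log file lost")
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        if try_connect():
            return attempts
        sleep(interval_seconds)  # wait one interval before the next attempt
    raise ConnectionError("link not reestablished")


# Demonstration with a fake link that comes back on the third attempt
# and a fake sleep that records the waits instead of pausing.
outcomes = iter([False, False, True])
waits, warnings = [], []
attempts = reestablish_link(lambda: next(outcomes), warnings.append, sleep=waits.append)
```

Passing `sleep` as a parameter keeps the sketch testable without real one-minute delays.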