HP OpenVMS Programming Concepts Manual

4.4 Using Affinity and Capabilities in CPU Scheduling (Alpha and I64 Only)

On Alpha and I64 systems, the affinity and capabilities mechanisms allow CPU scheduling to be adapted to larger CPU configurations by controlling the distribution of processes or threads throughout the active CPU set. Control of the distribution of processes throughout the active CPU set becomes more important as higher-performance server applications, such as databases and real-time process-control environments, are implemented. Affinity and capabilities provide the user with opportunities to perform the following tasks:

Create and modify a set of user-defined process capabilities
Create and modify a set of user-defined CPU capabilities to match those in the process
Allow a process to apply the affinity mechanisms to a subset of the active CPU set in a symmetric multiprocessing (SMP) configuration

4.4.1 Defining Affinity and Capabilities

The affinity mechanism allows a process, or each of its kernel threads, to specify an exact set of CPUs on which it can execute. The capabilities mechanism allows a process to specify a set of resources that a CPU in the active set must have defined before it is allowed to contend for process execution. Both mechanisms are already part of the OpenVMS scheduler and are used extensively, internally and externally, to implement parts of the I/O and timing subsystems. The OpenVMS operating system now also provides user access to these mechanisms.

4.4.1.1 Using Affinity and Capabilities with Caution

It is important for users to understand that inappropriate and abusive use of the affinity and capabilities mechanisms can have a negative impact on the symmetric aspects of the current multi-CPU scheduling algorithm.

4.4.2 Types of Capabilities

Capabilities are resources assigned to CPUs that a process needs to execute correctly. There are four defined capabilities. They are restricted to internal system events or functions that control system states or functions. Table 4-6 describes the four capabilities.

**Table 4-6 Capabilities**
Capability	Description
Primary	Owned by only one CPU at a time, since the primary could possibly migrate from CPU to CPU in the configuration. For I/O and timekeeping functions, the system requires that the process run on the primary CPU. The process requiring this capability is allowed to run only on the processor that has it at the time.
Run	Controls the ability of a CPU to execute processes. Every process requires this resource; if the CPU does not have it, scheduling for that CPU comes to a halt in a recognized state. The command STOP/CPU uses this capability when it is trying to make the CPU quiescent, bringing it to a halted state.
Quorum	Used in a cluster environment when a node wants another node to come to a quiescent state until further notice. Like the Run capability, Quorum is a required resource for every process and every CPU in order for scheduling to occur.
Vector	Like the Primary capability, it reflects a feature of the CPU: the CPU has a vector processing unit directly associated with it. This capability is obsolete on OpenVMS Alpha and OpenVMS I64 systems but is retained for compatibility with OpenVMS VAX.

4.4.3 Looking at User Capabilities

Previously, the use of capabilities was restricted to system resources and control events. However, it is also valuable for user functions to be able to indicate a resource or special CPU function.

There are 16 user-defined capabilities added to both the process and the CPU structures. Unlike the static definitions of the current system capabilities, the user capabilities have meaning only in the context of the processes that define them. Through system service interfaces, processes or individual threads of a multithreaded process, can set specific bits in the capability masks of a CPU to give it a resource, and can set specific bits in the kernel thread's capability mask to require that resource as an execution criterion.

The user capability feature is a direct superset of the current capability functionality. All currently existing capabilities are placed into the system capability set; they are not available to the process through system service interfaces. These system service interfaces affect only the 16 bits specifically set aside for user definition.

The OpenVMS operating system has no direct knowledge of what the defined capability is that is being used. All responsibility for the correct definition, use, and management of these bits is determined by the processes that define them. The system controls the impact of these capabilities through privilege requirements; but, as with the priority adjustment services, abusive use of the capability bits could affect the scheduling dynamic and CPU loads in an SMP environment.

4.4.4 Using the Capabilities System Services

The SYS$CPU_CAPABILITIES and SYS$PROCESS_CAPABILITIES system services provide access to the capability features. By using these services, you can assign user capabilities to a CPU and to a specific kernel thread. Assigning a user capability to a CPU lasts either for the life of the system or until another explicit change is made. This operation has no direct effect on the scheduling dynamics of the system; it only indicates that the specified CPU is capable of handling any process or thread that requires that resource. If a process does not indicate that it needs that resource, the scheduler ignores the CPU's additional capability and schedules the process on the basis of other process requirements.

Assigning a user capability requirement to a specific process or thread has a major impact on the scheduling state of that entity. For the process or thread to be scheduled on a CPU in the active set, that CPU must have the capability assigned prior to the scheduling attempt. If no CPU currently has the correct set of capability requirements, the process is placed into a wait state until a CPU with the right configuration becomes available. Like system capabilities, user process capabilities are additive; that is, for a CPU to schedule the process, the CPU must have the full complement of required capabilities.
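
The additive rule can be illustrated with a minimal sketch in C. The type and function names here (user_caps_t, cpu_satisfies) are illustrative assumptions and are not part of any OpenVMS interface; the sketch models only the mask comparison, not the scheduler.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one bit per user capability, 16 bits in all. */
typedef uint16_t user_caps_t;

/* A CPU qualifies only if it holds every capability the process
   or thread requires; holding extra capabilities is harmless. */
static bool cpu_satisfies(user_caps_t cpu_caps, user_caps_t required)
{
    return (cpu_caps & required) == required;
}
```

In this model, a process that requires no user capabilities matches every CPU, and a process that requires two capabilities is not matched by a CPU that holds only one of them.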

These services reference both sets of 16-bit user capabilities by the common symbolic constant names of CAP$M_USER1 through CAP$M_USER16. These names reflect the corresponding bit position in the appropriate capability mask; they are nonzero and self-relative to themselves only.

Both services allow multiple bits to be set or cleared, or both, simultaneously. Each takes as parameters a select mask and a modify mask that define the operation set to be performed. The service callers are responsible for setting up the select mask to indicate the user capabilities bits affected by the current call. This select mask is a bit vector of the ORed bit symbolic names that, when set, states that the value in the modify mask is the new value of the bit. Both masks use the symbolic constants to indicate the same bit; alternatively, if appropriate, you can use the symbolic constant CAP$K_USER_ALL in the select mask to indicate that the entire set of capabilities is affected. Likewise, you can use the symbolic constant CAP$K_USER_ADD or CAP$K_USER_REMOVE in the modify mask to indicate that all capabilities specified in the select mask are to be either set or cleared.
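
The select and modify mask convention described above can be sketched in C as follows. This is a model of the bit arithmetic only, assuming plain 32-bit masks; the MODEL_USERn values and the apply_masks function are hypothetical stand-ins and do not reproduce the real CAP$M_USERn definitions or the system service argument lists.

```c
#include <stdint.h>

/* Illustrative bit values only; the real CAP$M_USERn constants come
   from the OpenVMS definition files and may differ. */
#define MODEL_USER1 (UINT32_C(1) << 0)
#define MODEL_USER2 (UINT32_C(1) << 1)
#define MODEL_USER3 (UINT32_C(1) << 2)

/* Bits named in select_mask take their new value from modify_mask;
   bits not named in select_mask keep their current value. */
static uint32_t apply_masks(uint32_t current,
                            uint32_t select_mask,
                            uint32_t modify_mask)
{
    return (current & ~select_mask) | (modify_mask & select_mask);
}
```

For example, with MODEL_USER1 and MODEL_USER2 currently set, selecting MODEL_USER2 and MODEL_USER3 while passing only MODEL_USER3 in the modify mask clears MODEL_USER2, sets MODEL_USER3, and leaves MODEL_USER1 untouched.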

For information about using the SYS$CPU_CAPABILITIES and SYS$PROCESS_CAPABILITIES system services, see the HP OpenVMS System Services Reference Manual: A--GETUAI and HP OpenVMS System Services Reference Manual: GETUTC--Z.

4.4.5 Types of Affinity

There are two types of affinity: implicit and explicit. This section describes both.

4.4.5.1 Implicit Affinity

Implicit affinity, sometimes known as soft affinity, is a variant form of the original affinity mechanism used in the OpenVMS scheduling mechanisms. Rather than require a process to stay on a specific CPU regardless of conditions, implicit affinity maximizes cache and translation buffer (TB) context by maintaining an association with the CPU that has the most information about a given process.

Currently, the OpenVMS scheduling mechanism already has a version of implicit affinity. It keeps track of the last CPU the process ran on and tries to schedule the process on that CPU, subject to a fairness algorithm. The fairness algorithm makes sure a process is not skipped too many times when it normally would have been scheduled elsewhere.

The Alpha architecture lends itself to maintaining cache and TB context that has significant potential for performance improvement at both the process and system level. Because this feature contradicts the normal highest-priority process-scheduling algorithms in an SMP configuration, implicit affinity cannot be a system default.

The SYS$SET_IMPLICIT_AFFINITY system service provides implicit affinity support. This service works on an explicitly specified process or kernel thread block (KTB) through the pidadr and prcnam arguments. The default is the current process, but if the symbolic constant CAP$K_PROCESS_DEFAULT is specified in pidadr, the bit is set in the global default cell SCH$GL_DEFAULT_PROCESS_CAP. Setting implicit affinity globally is similar to setting a capability bit in the same mask, because every process creation after the modification picks up the bit as a default that stays in effect across all image activations.

The protections required to invoke SYS$SET_IMPLICIT_AFFINITY depend on the process that is being affected. Because implicit affinity has the same potential as the SYS$ALTPRI service for affecting the priority scheduling of processes in the COM queue, the ALTPRI privilege is the base requirement for every modifying form of the service. If the target is the current process, no other privilege is required. To affect processes in the same UIC group, the GROUP privilege is also required. For any other process in the system, the WORLD privilege is also required.

4.4.5.2 Explicit Affinity

Even though capabilities and affinity overlap considerably in their functional behavior, they are nonetheless two discrete scheduling mechanisms. Affinity, the subsetting of the number of CPUs on which a process can execute, has precedence over the capability feature and provides an explicit binding operation between the process and CPU. It forces the scheduling algorithm to consider only the CPU set it requires, and then applies the capability tests to see whether any of them are appropriate.

Explicit affinity allows database and high-performance applications to segregate application functions to individual CPUs, providing improved cache and TB performance as well as reducing context switching and general scheduling overhead. During the IPL 8 scheduling pass, the process is investigated to see to which CPUs it is bound and whether the current CPU is one of those. If it passes that test, capabilities are also validated to allow the process to context switch. The number of CPUs that can be supported is 32.
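
The order of the two tests, affinity first and capabilities second, can be sketched in C. This is a simplified model rather than the scheduler's actual code: the structure, the function name, and the convention that a zero affinity mask means "no explicit binding" are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_CPUS 32u   /* matches the 32-CPU limit stated above */

/* Hypothetical per-thread scheduling state. */
struct model_thread {
    uint32_t affinity_mask;  /* bit n set = bound to CPU n; 0 = unbound (assumption) */
    uint32_t required_caps;  /* capabilities the thread requires */
};

/* Returns true if the thread may context switch onto cpu_id. */
static bool can_run_on(const struct model_thread *t,
                       unsigned cpu_id, uint32_t cpu_caps)
{
    assert(cpu_id < MODEL_MAX_CPUS);

    /* Step 1: explicit affinity narrows the candidate CPU set. */
    if (t->affinity_mask != 0 &&
        (t->affinity_mask & (UINT32_C(1) << cpu_id)) == 0)
        return false;

    /* Step 2: the CPU must also hold every required capability. */
    return (cpu_caps & t->required_caps) == t->required_caps;
}
```

A thread bound to CPU 1 is therefore rejected on CPU 0 regardless of that CPU's capabilities, and is rejected on CPU 1 as well if CPU 1 lacks a required capability.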

The SYS$PROCESS_AFFINITY system service provides access to the explicit affinity functionality. SYS$PROCESS_AFFINITY resolves to a specific process, defaulting to the current one, through the pidadr and prcnam arguments. Like the other system services, the CPUs that are affected are indicated through select_mask, and the binding state of each CPU is specified in modify_mask.

Specific CPUs can be referenced in select_mask and modify_mask using the symbolic constants CAP$M_CPU0 through CAP$M_CPU31. These constants are defined to match the bit position of their associated CPU ID. Alternatively, specifying CAP$K_ALL_ACTIVE_CPUS in select_mask sets or clears explicit affinity for all CPUs in the current active set.

Explicit affinity, like capabilities, has a permanent process copy as well as a current image copy. As each completed image is run down, the permanent explicit affinity values overwrite the running image set, superseding any changes that were made in the interim. Specifying CAP$M_FLAG_PERMANENT in the flags parameter indicates that both the current and permanent copies are to be modified simultaneously. As a result, unless explicitly changed again, this operation has a scope from the current image through the end of the process life.
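
The relationship between the permanent and current affinity values can be modeled with a short C sketch. The structure, the function names, and the MODEL_FLAG_PERMANENT value are hypothetical; they stand in for the behavior described above and do not reflect the actual data structures or the value of CAP$M_FLAG_PERMANENT.

```c
#include <stdint.h>

#define MODEL_FLAG_PERMANENT 1u  /* stand-in for CAP$M_FLAG_PERMANENT */

/* Hypothetical model of the two affinity copies kept for a process. */
struct model_affinity {
    uint32_t permanent;  /* survives image rundown */
    uint32_t current;    /* in effect for the running image */
};

/* Apply a select/modify pair to the current copy, and to the
   permanent copy as well when the permanent flag is given. */
static void set_affinity(struct model_affinity *a, uint32_t select_mask,
                         uint32_t modify_mask, unsigned flags)
{
    a->current = (a->current & ~select_mask) | (modify_mask & select_mask);
    if (flags & MODEL_FLAG_PERMANENT)
        a->permanent = (a->permanent & ~select_mask) | (modify_mask & select_mask);
}

/* At image rundown the permanent values overwrite the current ones. */
static void image_rundown(struct model_affinity *a)
{
    a->current = a->permanent;
}
```

In this model, a change made without the permanent flag is lost at the next image rundown, while a change made with the flag persists until it is explicitly changed again.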

For information about the SYS$SET_IMPLICIT_AFFINITY and SYS$PROCESS_AFFINITY system services, see the HP OpenVMS System Services Reference Manual: A--GETUAI and HP OpenVMS System Services Reference Manual: GETUTC--Z.

4.5 Using the Class Scheduler in CPU Scheduling

The class scheduler gives you the ability to limit the amount of CPU time that a system's users may receive by placing the users into scheduling classes. Each class is assigned a percentage of the overall system's CPU time. As the system runs, the combined set of users in a class are limited to the percentage of CPU execution time allocated to their class. The users may get some additional CPU time if the qualifier /WINDFALL is enabled for their scheduling class. Enabling the qualifier /WINDFALL allows the system to give a small amount of CPU time to a scheduling class when a CPU is idle and the scheduling class's allotted time has been depleted.
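
The effect of /WINDFALL can be summarized in a small C sketch. This is a deliberately simplified model: the function name and its parameters are assumptions, the accounting of CPU time is reduced to two percentages, and the amount of windfall time granted is not modeled.

```c
#include <stdbool.h>

/* Hypothetical decision: may a process in this class receive CPU time now?
   used_pct  - share of CPU time the class has consumed so far
   limit_pct - share of CPU time allocated to the class
   windfall  - true if windfall is enabled for the class
   cpu_idle  - true if a CPU would otherwise sit idle */
static bool class_may_run(unsigned used_pct, unsigned limit_pct,
                          bool windfall, bool cpu_idle)
{
    if (used_pct < limit_pct)
        return true;              /* still within the class allocation */
    return windfall && cpu_idle;  /* depleted: only windfall on an idle CPU */
}
```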

To invoke the class scheduler, you use the SYSMAN interface. SYSMAN allows a user to create, delete, modify, suspend, resume, and display scheduling classes. Table 4-7 shows the SYSMAN command, class_schedule, and its subcommands.

**Table 4-7 SYSMAN Command: Class_Schedule**
Subcommand	Meaning
Add	Creates a new scheduling class
Delete	Deletes a scheduling class
Modify	Modifies the characteristics of a scheduling class
Show	Shows the characteristics of a scheduling class
Suspend	Temporarily suspends a scheduling class
Resume	Resumes a scheduling class

4.5.1 Specifications for the Class_Schedule Command

The full specifications for Class_Schedule and its subcommands are as follows:

4.5.1.1 The Add Subcommand

The format for the Add subcommand is as follows:

SYSMAN>class_schedule add "class name" /cpulimit = ([primary], [h1-h2=time%],[h1=time%], [,...],[secondary],[h1-h2=time%],[h1=time%],[,...]) [/primedays = ([no]day[,...])] [/username = (name1, name2,...name"n")] [/account = (name1, name2,...name"n")] [/uic = (uic1,uic2,...uic"n")] [/windfall]

The Class Name and Qualifiers

The class name is the name of the scheduling class. It must be specified and the maximum length for this name is 16 characters.

Table 4-8 shows the qualifiers and their meanings for this SYSMAN command.

**Table 4-8 Class Name Qualifiers**
Qualifier	Meaning
/CPULIMIT	Defines the maximum amount of CPU time that this scheduling class can receive for the specified days and hours. You must specify this qualifier when adding a class. The h1-h2=time% syntax allows you to specify a range of hours followed by the maximum amount of CPU time (expressed as a percentage) to be associated with this set of hours. The first set of hours after the keyword PRIMARY specifies hours on primary days; the set of hours after the keyword SECONDARY specifies hours on secondary days. The hours are inclusive; if you class schedule a given hour, access extends to the end of that hour.
/PRIMEDAYS	Allows you to define which days are primary days and which days are secondary days. You specify primary days as MON, TUE, WED, THU, FRI, SAT, and SUN. You specify secondary days as NOMON, NOTUE, NOWED, NOTHU, NOFRI, NOSAT, and NOSUN. The default is MON through FRI and NOSAT and NOSUN. Any days omitted from the list take their default value. You can use the DCL command, SET DAY, to override the class definition of primary and secondary days.
/USERNAME	Specifies, by user name, which users are part of this scheduling class. The user name is part of a user's SYSUAF record.
/ACCOUNT	Specifies, by account name, which users are part of this scheduling class. The account name is part of a user's SYSUAF record.
/UIC	Specifies, by UIC, which users are part of this scheduling class. The UIC is part of a user's SYSUAF record.
/WINDFALL	Specifies that all processes in the scheduling class are eligible for windfall. By enabling windfall, you allow processes in the scheduling class to receive a "windfall," that is, a small percentage of CPU time, when the class's allotted CPU time has been depleted and a CPU is idle. Rather than let the CPU remain idle, you might decide that it is better to let these processes execute even if it means giving them more than their allotted time. By default, windfall is disabled.
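
How the /CPULIMIT hour ranges and the /PRIMEDAYS setting combine can be sketched in C. The structure and function names are hypothetical and the on-disk format of the class scheduler database is not represented; the sketch only models inclusive hour ranges and the choice between primary and secondary limits. A stored value of 0 is used here to mean that no limit was specified for that hour, which is an assumption of this model.

```c
#include <assert.h>
#include <stdbool.h>

enum { MON, TUE, WED, THU, FRI, SAT, SUN };

/* Hypothetical in-memory view of one scheduling class. */
struct model_class {
    unsigned char primary_pct[24];    /* limit per hour on primary days */
    unsigned char secondary_pct[24];  /* limit per hour on secondary days */
    unsigned primedays;               /* bit n set = day n is a primary day */
};

/* Record "h1-h2=percent"; both end hours are included. */
static void set_hours(unsigned char pct[24], int h1, int h2,
                      unsigned char percent)
{
    assert(0 <= h1 && h1 <= h2 && h2 <= 23);
    for (int h = h1; h <= h2; h++)
        pct[h] = percent;
}

/* Look up the limit that applies on a given day and hour. */
static unsigned cpu_limit(const struct model_class *c, int day, int hour)
{
    assert(MON <= day && day <= SUN && 0 <= hour && hour <= 23);
    bool primary = (c->primedays >> day) & 1u;
    return primary ? c->primary_pct[hour] : c->secondary_pct[hour];
}
```

With the default primary days (MON through FRI, mask 0x1F in this model), a primary range of 8 through 17 applies from 08:00 until the end of the 17:00 hour on weekdays, and the secondary table is consulted on SAT and SUN.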

4.5.1.2 The Delete Subcommand

The format for the Delete subcommand is as follows:

SYSMAN>class_schedule delete "class name"

The Delete subcommand deletes the scheduling class from the class scheduler database file, and all processes that are members of this scheduling class are no longer class scheduled.

4.5.1.3 The Modify Subcommand

The format for the Modify subcommand is as follows:

SYSMAN>class_schedule modify "class name" /cpulimit = ([primary], [h1-h2=time%],[h1=time%], [,...],[secondary],[h1-h2=time%],[h1=time%],[,...]) [/primedays = ([no]day[,...])] [/username = (name1, name2,...name"n")] [/account = (name1, name2,...name"n")] [/uic = (uic1,uic2,...uic"n")] [/(no)windfall]

The Modify subcommand changes the characteristics of a scheduling class. The qualifiers are the same as those for the Add subcommand. To remove a time restriction, specify a zero (0) for the time percentage associated with a particular range of hours.

To remove a name or UIC value, you must specify a minus sign in front of each name or value.

4.5.1.4 The Show Subcommand

The format for the Show subcommand is as follows:

SYSMAN>class_schedule show [class name] [/all] [/full]

Table 4-9 shows the qualifiers and their meanings for this SYSMAN command.

**Table 4-9 Show Subcommand Qualifiers**
Qualifier	Meaning
/ALL	Displays all scheduling classes. The qualifier must be specified if no class name is given.
/FULL	Displays all information about this scheduling class.

Note

By default, a limited display of data is shown by this subcommand. The default shows the following:

Name
Maximum CPU times for each range of hours
Primary days and secondary days
Windfall settings

4.5.1.5 The Suspend Subcommand

The format for the Suspend subcommand is as follows:

SYSMAN>class_schedule suspend "class name"

The Suspend subcommand suspends the specified scheduling class. All processes that are part of this scheduling class remain as part of this scheduling class but are granted unlimited CPU time.

4.5.1.6 The Resume Subcommand

The format of the Resume subcommand is as follows:

SYSMAN>class_schedule resume "class name"

The Resume subcommand complements the suspend command. You use this command to resume a scheduling class that is currently suspended.

4.5.2 The Class Scheduler Database

The class scheduler database is a permanent database that allows OpenVMS to class schedule processes automatically after a system has been booted and rebooted. This database resides on the system disk in SYS$SYSTEM:VMS$CLASS_SCHEDULE.DATA. SYSMAN creates this file as an RMS indexed file when the first scheduling class is created by the SYSMAN command, class_schedule add.

4.5.2.1 The Class Scheduler Database and Process Creation

With a permanent class scheduler database, a process is placed into a scheduling class, if appropriate, at process creation time. When a new process is created, the system must determine whether the process belongs to a scheduling class. Because this determination relies on data in the SYSUAF file, and the Loginout image already has the process's information from this file, Loginout class schedules the process if it determines that the process belongs to a scheduling class.

There are two other types of processes to consider during process creation: subprocess and detached process. A subprocess becomes part of the same scheduling class as the parent process, even though it may not match the class's criteria. That is, its user and account name and/or UIC may not be part of the class's record. A detached process only joins a scheduling class if it executes the Loginout image (Loginout.exe) during process creation.

Though a process can join a scheduling class at process creation time, you can change or modify its scheduling class during runtime with the SET PROCESS/SCHEDULING_CLASS command.
