HP OpenVMS systems
On Alpha and I64 systems, the affinity and capabilities mechanisms allow CPU scheduling to be adapted to larger CPU configurations by controlling the distribution of processes or threads throughout the active CPU set. Control of the distribution of processes throughout the active CPU set becomes more important as higher-performance server applications, such as databases and real-time process-control environments, are implemented. Affinity and capabilities provide the user with opportunities to perform the following tasks:
The affinity mechanism allows a process, or each of its kernel threads,
to specify an exact set of CPUs on which it can execute. The
capabilities mechanism allows a process to specify a set of resources
that a CPU in the active set must have defined before it is allowed to
contend for process execution. Presently, both of these mechanisms are
present in the OpenVMS scheduling mechanism; both are used extensively
internally and externally to implement parts of the I/O and timing
subsystems. Now, however, the OpenVMS operating system provides user
access to these mechanisms.
4.4.1.1 Using Affinity and Capabilities with Caution
It is important for users to understand that inappropriate and abusive
use of the affinity and capabilities mechanisms can have a negative
impact on the symmetric aspects of the current multi-CPU scheduling
4.4.2 Types of Capabilities
Capabilities are resources assigned to CPUs that a process needs to execute correctly. There are four defined capabilities. They are restricted to internal system events or functions that control system states or functions. Table 4-6 describes the four capabilities.
|Primary||Owned by only one CPU at a time, since the primary could possibly migrate from CPU to CPU in the configuration. For I/O and timekeeping functions, the system requires that the process run on the primary CPU. The process requiring this capability is allowed to run only on the processor that has it at the time.|
|Run||Controls the ability of a CPU to execute processes. Every process requires this resource; if the CPU does not have it, scheduling for that CPU comes to a halt in a recognized state. The command STOP/CPU uses this capability when it is trying to make the CPU quiescent, bringing it to a halted state.|
|Quorum||Used in a cluster environment when a node wants another node to come to a quiescent state until further notice. Like the Run capability, Quorum is a required resource for every process and every CPU in order for scheduling to occur.|
|Vector||Like the Primary capability, it reflects a feature of the CPU; that is, the CPU has a vector processing unit directly associated with it. This capability is obsolete on OpenVMS Alpha and OpenVMS I64 systems but is retained for compatibility with OpenVMS VAX.|
Previously, the use of capabilities was restricted to system resources and control events. However, it is also valuable for user functions to be able to indicate a resource or special CPU function.
There are 16 user-defined capabilities added to both the process and the CPU structures. Unlike the static definitions of the current system capabilities, the user capabilities have meaning only in the context of the processes that define them. Through system service interfaces, processes, or individual threads of a multithreaded process, can set specific bits in the capability mask of a CPU to give it a resource, and can set specific bits in the kernel thread's capability mask to require that resource as an execution criterion.
The user capability feature is a direct superset of the current capability functionality. All currently existing capabilities are placed into the system capability set; they are not available to the process through system service interfaces. These system service interfaces affect only the 16 bits specifically set aside for user definition.
The OpenVMS operating system has no direct knowledge of what the
defined capability is that is being used. All responsibility for the
correct definition, use, and management of these bits is determined by
the processes that define them. The system controls the impact of these
capabilities through privilege requirements; but, as with the priority
adjustment services, abusive use of the capability bits could affect
the scheduling dynamic and CPU loads in an SMP environment.
4.4.4 Using the Capabilities System Services
The SYS$CPU_CAPABILITIES and SYS$PROCESS_CAPABILITIES system services provide access to the capability features. With these services, you can assign user capabilities to a CPU and to a specific kernel thread. Assigning a user capability to a CPU lasts either for the life of the system or until another explicit change is made. This operation has no direct effect on the scheduling dynamics of the system; it only indicates that the specified CPU is capable of handling any process or thread that requires that resource. If a process does not indicate that it needs that resource, the scheduler ignores the CPU's additional capability and schedules the process on the basis of its other requirements.
Assigning a user capability requirement to a specific process or thread has a major impact on the scheduling state of that entity. For the process or thread to be scheduled on a CPU in the active set, that CPU must have the capability assigned prior to the scheduling attempt. If no CPU currently has the correct set of capability requirements, the process is placed into a wait state until a CPU with the right configuration becomes available. Like system capabilities, user process capabilities are additive; that is, for a CPU to schedule the process, the CPU must have the full complement of required capabilities.
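The additive rule described above can be modeled with a simple bit test. The following Python sketch is a model only, not OpenVMS code; the bit values and the function names are hypothetical and do not reproduce the real CAP$M_USERn definitions.

```python
# Model of the additive capability test. Bit values are hypothetical
# stand-ins chosen for illustration, not the real CAP$M_USERn constants.
USER1 = 1 << 0
USER2 = 1 << 1
USER3 = 1 << 2


def cpu_can_run(cpu_caps, required_caps):
    """True when the CPU holds every capability bit the thread requires."""
    return (cpu_caps & required_caps) == required_caps


def first_eligible_cpu(cpus, required_caps):
    """Return the lowest-numbered eligible CPU ID.

    `cpus` maps a CPU ID to its capability mask. A result of None models
    the case in which the thread must wait for a suitable CPU.
    """
    for cpu_id in sorted(cpus):
        if cpu_can_run(cpus[cpu_id], required_caps):
            return cpu_id
    return None


cpus = {0: USER1, 1: USER1 | USER2}
```

In this model, a thread that requires both USER1 and USER2 can run only on CPU 1, and a thread that requires USER3 waits because no CPU has that bit.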
These services reference both sets of 16-bit user capabilities by the common symbolic constant names of CAP$M_USER1 through CAP$M_USER16. These names reflect the corresponding bit position in the appropriate capability mask; they are nonzero and self-relative to themselves only.
Both services allow multiple bits to be set or cleared, or both, simultaneously. Each takes as parameters a select mask and a modify mask that define the operation set to be performed. The service callers are responsible for setting up the select mask to indicate the user capabilities bits affected by the current call. This select mask is a bit vector of the ORed bit symbolic names that, when set, states that the value in the modify mask is the new value of the bit. Both masks use the symbolic constants to indicate the same bit; alternatively, if appropriate, you can use the symbolic constant CAP$K_USER_ALL in the select mask to indicate that the entire set of capabilities is affected. Likewise, you can use the symbolic constant CAP$K_USER_ADD or CAP$K_USER_REMOVE in the modify mask to indicate that all capabilities specified in the select mask are to be either set or cleared.
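The interaction of the select mask and the modify mask can be sketched as follows. This Python model assumes a plain 16-bit integer for the user capability set; the helper names `user_bit` and `apply_masks` are hypothetical and are not part of any OpenVMS interface.

```python
# Model of the select mask / modify mask operation on 16 user bits.
USER_BITS = 0xFFFF  # the 16 bits set aside for user definition (model only)


def user_bit(n):
    """Hypothetical stand-in for the n-th user capability bit, n in 1..16."""
    if not 1 <= n <= 16:
        raise ValueError("user capability number must be 1..16")
    return 1 << (n - 1)


def apply_masks(current, select_mask, modify_mask):
    """Return the new capability mask.

    Only bits set in select_mask are affected; for each of them, the
    corresponding bit in modify_mask supplies the new value.
    """
    select_mask &= USER_BITS
    return (current & ~select_mask) | (modify_mask & select_mask)


# One call that sets user bit 3 and clears user bit 1, leaving bit 2 alone.
before = user_bit(1) | user_bit(2)
after = apply_masks(before,
                    select_mask=user_bit(1) | user_bit(3),
                    modify_mask=user_bit(3))
```

Under this model, selecting all 16 bits with a modify mask of zero clears every user capability, which corresponds to the all-bits select and remove behavior described above.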
For information about using the SYS$CPU_CAPABILITIES and
SYS$PROCESS_CAPABILITIES system services, see the HP OpenVMS System Services Reference Manual: A--GETUAI and
HP OpenVMS System Services Reference Manual: GETUTC--Z.
4.4.5 Types of Affinity
There are two types of affinity: implicit and explicit. This section describes both.
4.4.5.1 Implicit Affinity
Implicit affinity, sometimes known as soft affinity, is a variant form of the original affinity mechanism used in the OpenVMS scheduling mechanisms. Rather than require a process to stay on a specific CPU regardless of conditions, implicit affinity maximizes cache and translation buffer (TB) context by maintaining an association with the CPU that has the most information about a given process.
The OpenVMS scheduling mechanism already has a version of implicit affinity. It keeps track of the last CPU on which the process ran and tries to schedule the process on that CPU, subject to a fairness algorithm. The fairness algorithm makes sure a process is not skipped too many times when it normally would have been scheduled elsewhere.
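One way to picture a last-CPU preference bounded by a fairness limit is the following Python sketch. It is an illustration under stated assumptions, not the OpenVMS algorithm: the skip limit `MAX_SKIPS`, the `Proc` class, and the function `pick_for_cpu` are all hypothetical.

```python
# Illustrative model of soft affinity with a fairness limit.
MAX_SKIPS = 2  # hypothetical limit; the real fairness rule is not documented here


class Proc:
    def __init__(self, name, last_cpu):
        self.name = name
        self.last_cpu = last_cpu  # CPU on which the process last ran
        self.skips = 0            # times it was passed over for affinity


def pick_for_cpu(cpu_id, queue):
    """Pick the next process for cpu_id from a queue in priority order.

    A process is passed over while another CPU holds its context, but
    only until it has been skipped MAX_SKIPS times.
    """
    for proc in queue:
        if proc.last_cpu == cpu_id or proc.skips >= MAX_SKIPS:
            queue.remove(proc)  # safe: we return immediately afterward
            proc.skips = 0
            proc.last_cpu = cpu_id
            return proc
        proc.skips += 1
    return None


a = Proc("A", last_cpu=1)
b = Proc("B", last_cpu=0)
queue = [a, b]
```

In this model, CPU 0 first picks B, because A last ran on CPU 1. If CPU 1 never becomes available, A is skipped only until it reaches the limit, after which CPU 0 runs it.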
The Alpha architecture lends itself to maintaining cache and TB context that has significant potential for performance improvement at both the process and system level. Because this feature contradicts the normal highest-priority process-scheduling algorithms in an SMP configuration, implicit affinity cannot be a system default.
The SYS$SET_IMPLICIT_AFFINITY system service provides implicit affinity support. This service works on an explicitly specified process or kernel thread block (KTB) through the pidadr and prcnam arguments. The default is the current process, but if the symbolic constant CAP$K_PROCESS_DEFAULT is specified in pidadr, the bit is set in the global default cell SCH$GL_DEFAULT_PROCESS_CAP. Setting implicit affinity globally is similar to setting a capability bit in the same mask, because every process creation after the modification picks up the bit as a default that stays in effect across all image activations.
The privileges required to invoke SYS$SET_IMPLICIT_AFFINITY depend on
the process that is being affected. Because the addition of implicit
affinity has the same potential as the SYS$ALTPRI service for affecting
the priority scheduling of processes in the COM queue, the ALTPRI
privilege is the base requirement for all modification forms of the
service. If the process is the current one, no other privilege is
required. To affect processes in the same UIC group, the GROUP
privilege is required. For any other processes in the system, the WORLD
privilege is required.
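The privilege rules above can be summarized as a small decision function. This Python sketch reads the text as requiring ALTPRI in every modification case, with GROUP or WORLD added according to the target; the function name and its parameters are hypothetical.

```python
# Model of the privilege check for modifying implicit affinity.
def required_privileges(caller_pid, caller_group, target_pid, target_group):
    """Return the set of privilege names needed to modify the target."""
    privileges = {"ALTPRI"}           # base requirement for any modification
    if target_pid == caller_pid:
        return privileges             # current process: nothing more needed
    if target_group == caller_group:
        return privileges | {"GROUP"}  # another process in the same UIC group
    return privileges | {"WORLD"}      # any other process in the system
```

For example, a caller in UIC group 100 that targets a different process in group 200 would need both ALTPRI and WORLD under this reading.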
4.4.5.2 Explicit Affinity
Even though capabilities and affinity overlap considerably in their functional behavior, they are nonetheless two discrete scheduling mechanisms. Affinity, the subsetting of the number of CPUs on which a process can execute, has precedence over the capability feature and provides an explicit binding operation between the process and CPU. It forces the scheduling algorithm to consider only the CPU set it requires, and then applies the capability tests to see whether any of them are appropriate.
Explicit affinity allows database and high-performance applications to segregate application functions to individual CPUs, providing improved cache and TB performance as well as reducing context switching and general scheduling overhead. During the IPL 8 scheduling pass, the process is investigated to see to which CPUs it is bound and whether the current CPU is one of those. If it passes that test, capabilities are also validated to allow the process to context switch. The number of CPUs that can be supported is 32.
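The two-step test described above, affinity first and capabilities second, can be sketched in Python. This is a model only. It assumes that an affinity mask of zero means the process has no explicit binding, which the text does not state, and the name `eligible_cpus` is hypothetical.

```python
# Model of the scheduling test: explicit affinity, then capabilities.
MAX_CPUS = 32  # number of CPUs that can be supported, per the text


def eligible_cpus(active_cpus, affinity_mask, required_caps, cpu_caps):
    """Return the sorted CPU IDs on which the process may be scheduled.

    active_cpus   -- iterable of CPU IDs in the active set
    affinity_mask -- bit n set means the process is bound to CPU n;
                     0 is treated as "no explicit affinity" (assumption)
    required_caps -- capability bits the process requires
    cpu_caps      -- dict mapping CPU ID to that CPU's capability mask
    """
    eligible = []
    for cpu_id in sorted(active_cpus):
        if not 0 <= cpu_id < MAX_CPUS:
            continue
        # Step 1: affinity takes precedence and narrows the CPU set.
        if affinity_mask and not affinity_mask & (1 << cpu_id):
            continue
        # Step 2: the CPU must hold every required capability.
        if (cpu_caps.get(cpu_id, 0) & required_caps) == required_caps:
            eligible.append(cpu_id)
    return eligible


caps = {0: 0b01, 1: 0b11, 2: 0b11}
```

With these values, a process bound to CPUs 0 and 1 that requires capability bit 0b10 can run only on CPU 1, even though CPU 2 also has the capability.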
The SYS$PROCESS_AFFINITY system service provides access to the explicit affinity functionality. SYS$PROCESS_AFFINITY resolves to a specific process, defaulting to the current one, through the pidadr and prcnam arguments. Like the other system services, the CPUs that are affected are indicated through select_mask, and the binding state of each CPU is specified in modify_mask.
Specific CPUs can be referenced in select_mask and modify_mask using the symbolic constants CAP$M_CPU0 through CAP$M_CPU31. These constants are defined to match the bit position of their associated CPU ID. Alternatively, specifying CAP$K_ALL_ACTIVE_CPUS in select_mask sets or clears explicit affinity for all CPUs in the current active set.
Explicit affinity, like capabilities, has a permanent process copy as well as a current image copy. As each completed image is run down, the permanent explicit affinity values overwrite the running image set, superseding any changes that were made in the interim. Specifying CAP$M_FLAG_PERMANENT in the flags parameter indicates that both the current and permanent copies are to be modified simultaneously. As a result, unless explicitly changed again, this operation has a scope from the current image through the end of the process life.
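The relationship between the permanent values and the current image values can be modeled as follows. This Python sketch is illustrative; the class `ExplicitAffinity` and its `permanent` keyword argument are hypothetical stand-ins for the behavior of the permanent flag.

```python
# Model of permanent and current-image explicit affinity values.
def merge(current, select_mask, modify_mask):
    """Apply modify_mask to the bits chosen by select_mask."""
    return (current & ~select_mask) | (modify_mask & select_mask)


class ExplicitAffinity:
    def __init__(self):
        self.permanent = 0  # values that survive image rundown
        self.current = 0    # values in effect for the running image

    def set(self, select_mask, modify_mask, permanent=False):
        """Change the current values, and the permanent ones if requested."""
        self.current = merge(self.current, select_mask, modify_mask)
        if permanent:
            self.permanent = merge(self.permanent, select_mask, modify_mask)

    def image_rundown(self):
        """At image rundown, the permanent values overwrite the current ones."""
        self.current = self.permanent


affinity = ExplicitAffinity()
affinity.set(0b11, 0b10, permanent=True)  # bind to CPU 1 for the process life
affinity.set(0b01, 0b01)                  # add CPU 0 for this image only
during_image = affinity.current
affinity.image_rundown()
after_rundown = affinity.current
```

Here the image-only change to CPU 0 is discarded at rundown, while the permanent binding to CPU 1 remains.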
For information about the SYS$SET_IMPLICIT_AFFINITY and
SYS$PROCESS_AFFINITY system services, see the HP OpenVMS System Services Reference Manual: A--GETUAI and
HP OpenVMS System Services Reference Manual: GETUTC--Z.
4.5 Using the Class Scheduler in CPU Scheduling
The class scheduler gives you the ability to limit the amount of CPU time that a system's users may receive by placing the users into scheduling classes. Each class is assigned a percentage of the overall system's CPU time. As the system runs, the combined set of users in a class are limited to the percentage of CPU execution time allocated to their class. The users may get some additional CPU time if the qualifier /WINDFALL is enabled for their scheduling class. Enabling the qualifier /WINDFALL allows the system to give a small amount of CPU time to a scheduling class when a CPU is idle and the scheduling class's allotted time has been depleted.
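The effect of the CPU limit and of the /WINDFALL qualifier can be expressed as a single predicate. This Python sketch is a simplified model; the function name and the way usage is measured as a percentage are assumptions made for illustration.

```python
# Model of the class limit and windfall decision.
def class_may_run(used_percent, limit_percent, windfall_enabled, cpu_idle):
    """Decide whether a process in a scheduling class may receive CPU time.

    used_percent     -- CPU time the class has consumed (model quantity)
    limit_percent    -- percentage allotted to the class
    windfall_enabled -- True if windfall is enabled for the class
    cpu_idle         -- True if a CPU would otherwise be idle
    """
    if used_percent < limit_percent:
        return True                      # allotment not yet depleted
    return windfall_enabled and cpu_idle  # depleted: only windfall helps
```

A class that has used its full allotment therefore runs only when windfall is enabled and a CPU is idle.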
To invoke the class scheduler, you use the SYSMAN interface. SYSMAN allows a user to create, delete, modify, suspend, resume, and display scheduling classes. Table 4-7 shows the SYSMAN command, class_schedule, and its subcommands.
|Add||Creates a new scheduling class|
|Delete||Deletes a scheduling class|
|Modify||Modifies the characteristics of a scheduling class|
|Show||Shows the characteristics of a scheduling class|
|Suspend||Temporarily suspends a scheduling class|
|Resume||Resumes a scheduling class|
The full specifications for Class_Schedule and its subcommands are as follows.
4.5.1.1 The Add Subcommand
The format for the Add subcommand is as follows:
SYSMAN>class_schedule add "class name" /cpulimit = ([primary], [h1-h2=time%],[h1=time%], [,...],[secondary],[h1-h2=time%],[h1=time%],[,...]) [/primedays = ([no]day[,...])] [/username = (name1, name2,...name"n")] [/account = (name1, name2,...name"n")] [/uic = (uic1,uic2,...uic"n")] [/windfall]
The class name is the name of the scheduling class. It must be specified and the maximum length for this name is 16 characters.
Table 4-8 shows the qualifiers and their meanings for this SYSMAN command.
|/CPULIMIT||Defines the maximum amount of CPU time that this scheduling class can receive for the specified days and hours. You must specify this qualifier when adding a class. The h1-h2=time% syntax allows you to specify a range of hours followed by the maximum amount of CPU time (expressed as a percentage) to be associated with this set of hours. The first set of hours after the keyword PRIMARY specifies hours on primary days; the set of hours after the keyword SECONDARY specifies hours on secondary days. The hours are inclusive; if you class schedule a given hour, access extends to the end of that hour.|
|/PRIMEDAYS||Allows you to define which days are primary days and which days are secondary days. You specify primary days as MON, TUE, WED, THU, FRI, SAT, and SUN. You specify secondary days as NOMON, NOTUE, NOWED, NOTHU, NOFRI, NOSAT, and NOSUN. The default is MON through FRI and NOSAT and NOSUN. Any days omitted from the list take their default value. You can use the DCL command, SET DAY, to override the class definition of primary and secondary days.|
|/USERNAME||Specifies which users, by user name, are part of this scheduling class. The user name is part of a user's SYSUAF record.|
|/ACCOUNT||Specifies which users, by account name, are part of this scheduling class. The account name is part of a user's SYSUAF record.|
|/UIC||Specifies which users, by UIC, are part of this scheduling class. The UIC is part of a user's SYSUAF record.|
|/WINDFALL||Specifies that all processes in the scheduling class are eligible for windfall. By enabling windfall, you allow processes in the scheduling class to receive a "windfall," that is, a small percentage of CPU time, when the class's allotted CPU time has been depleted and a CPU is idle. Rather than let the CPU remain idle, you might decide that it is better to let these processes execute even if it means giving them more than their allotted time. By default, windfall is disabled.|
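The inclusive hour ranges of /CPULIMIT and the default handling of /PRIMEDAYS can be illustrated with a short Python model. The input strings used here are a simplified assumption and do not reproduce the exact punctuation that SYSMAN accepts; `parse_hours` and `primary_days` are hypothetical names.

```python
# Model of hour-range limits and primary-day selection.
def parse_hours(spec):
    """Expand a spec such as '8-17=50,18=20' into {hour: percent}.

    Ranges are inclusive, so '8-17' covers hours 8 through 17.
    """
    limits = {}
    for item in spec.split(","):
        hours, percent = item.strip().split("=")
        percent = int(percent.rstrip("%"))
        if "-" in hours:
            start, end = (int(h) for h in hours.split("-"))
        else:
            start = end = int(hours)
        for hour in range(start, end + 1):
            limits[hour] = percent
    return limits


DEFAULT_PRIMARY = {"MON", "TUE", "WED", "THU", "FRI"}


def primary_days(day_args):
    """Apply [NO]day arguments on top of the default primary days."""
    days = set(DEFAULT_PRIMARY)
    for arg in day_args:
        arg = arg.upper()
        if arg.startswith("NO"):
            days.discard(arg[2:])  # NOxxx makes the day a secondary day
        else:
            days.add(arg)
    return days


limits = parse_hours("8-17=50,18=20%")
days = primary_days(["SAT", "NOMON"])
```

In this model, days that are not mentioned keep their default, so specifying SAT and NOMON leaves TUE through FRI as primary days.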
4.5.1.2 The Delete Subcommand
The format for the Delete subcommand is as follows:
SYSMAN>class_schedule delete "class name"
The Delete subcommand deletes the scheduling class from the class
scheduler database file, and all processes that are members of this
scheduling class are no longer class scheduled.
4.5.1.3 The Modify Subcommand
The format for the Modify subcommand is as follows:
SYSMAN>class_schedule modify "class name" /cpulimit = ([primary], [h1-h2=time%],[h1=time%], [,...],[secondary],[h1-h2=time%],[h1=time%],[,...]) [/primedays = ([no]day[,...])] [/username = (name1, name2,...name"n")] [/account = (name1, name2,...name"n")] [/uic = (uic1,uic2,...uic"n")] [/(no)windfall]
The Modify subcommand changes the characteristics of a scheduling class. The qualifiers are the same as those for the Add subcommand. To remove a time restriction, specify a zero (0) for the time percentage associated with a particular range of hours.
To remove a name or UIC value, you must specify a minus sign in front
of each name or value.
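The minus-sign convention for removing names or UIC values can be modeled as a set update. This Python sketch is illustrative only; `apply_changes` is a hypothetical name, and the sample user names are invented.

```python
# Model of adding and removing class members with the minus-sign convention.
def apply_changes(members, changes):
    """Return the updated membership set.

    An entry with a leading minus sign removes that name or value;
    any other entry adds it.
    """
    updated = set(members)
    for item in changes:
        if item.startswith("-"):
            updated.discard(item[1:])
        else:
            updated.add(item)
    return updated


# Invented names: remove JONES and add BAKER in one modification.
members = apply_changes({"SMITH", "JONES"}, ["-JONES", "BAKER"])
```

Removing a name that is not in the class has no effect in this model; how SYSMAN reports that case is not covered here.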
4.5.1.4 The Show Subcommand
The format for the Show subcommand is as follows:
SYSMAN>class_schedule show [class name] [/all] [/full]
Table 4-9 shows the qualifiers and their meanings for this SYSMAN command.
|/ALL||Displays all scheduling classes. The qualifier must be specified if no class name is given.|
|/FULL||Displays all information about this scheduling class.|
By default, a limited display of data is shown by this subcommand. The default shows the following:
4.5.1.5 The Suspend Subcommand
The format for the Suspend subcommand is as follows:
SYSMAN>class_schedule suspend "class name"
The Suspend subcommand suspends the specified scheduling class. All
processes that are part of this scheduling class remain as part of this
scheduling class but are granted unlimited CPU time.
4.5.1.6 The Resume Subcommand
The format of the Resume subcommand is as follows:
SYSMAN>class_schedule resume "class name"
The Resume subcommand complements the Suspend subcommand. You use it to
resume a scheduling class that is currently suspended.
4.5.2 The Class Scheduler Database
The class scheduler database is a permanent database that allows
OpenVMS to class schedule processes automatically after a system has
been booted and rebooted. This database resides on the system disk in
SYS$SYSTEM:VMS$CLASS_SCHEDULE.DATA. SYSMAN creates this file as an RMS
indexed file when the first scheduling class is created by the SYSMAN
command, class_schedule add.
4.5.2.1 The Class Scheduler Database and Process Creation
Because the class scheduler database is permanent, a process can be placed into a scheduling class, if appropriate, at process creation time. When a new process is created, the system must determine whether the process belongs to a scheduling class. Because this determination relies on data in the SYSUAF file, and the Loginout image already has the process's information from this file, Loginout class schedules the process if it determines that the process belongs to a scheduling class.
There are two other types of processes to consider during process creation: subprocess and detached process. A subprocess becomes part of the same scheduling class as the parent process, even though it may not match the class's criteria. That is, its user and account name and/or UIC may not be part of the class's record. A detached process only joins a scheduling class if it executes the Loginout image (Loginout.exe) during process creation.
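The class assignment rules at process creation can be summarized in a small Python model. The function and parameter names are hypothetical; `uaf_class` stands for the class, if any, that matches the process's SYSUAF data.

```python
# Model of scheduling class assignment at process creation.
def class_for_new_process(is_subprocess, runs_loginout, uaf_class, parent_class):
    """Return the scheduling class name for a new process, or None.

    is_subprocess -- True if the new process is a subprocess
    runs_loginout -- True if the process executes the Loginout image
    uaf_class     -- class matching the SYSUAF data, or None
    parent_class  -- class of the parent process, or None
    """
    if is_subprocess:
        # A subprocess inherits the parent's class even if it does not
        # match the class's own criteria.
        return parent_class
    if runs_loginout:
        # Loginout has the SYSUAF data and class schedules the process.
        return uaf_class
    # A detached process that does not run Loginout joins no class.
    return None
```

For example, a detached process that bypasses Loginout is not class scheduled in this model, even if its user name matches a class.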