Chapter 1 describes the basic issues that concern a realtime application and the services a realtime operating system can provide to help meet realtime needs. It deals mainly with issues within the scope of the user's application code itself, such as how to set scheduling policies and priorities, how to lock down process memory, and how to use asynchronous I/O. Chapter 1 also discusses the value of a preemptive kernel in reducing the process preemption latency of a realtime application.
This chapter explores more deeply the latency issues of a system and how they affect the realtime performance of an application. This requires a greater understanding of the interaction of the application with the underlying UNIX system, and with devices involved directly or indirectly with the application. Section 11.2 outlines ways that a user can improve application responsiveness.
11.1 Realtime Responsiveness
Realtime applications require a predictable response time to external events, such as device interrupts. A typical realtime application involves:
An interrupt-generating device
An interrupt service routine that collects data from the device
User-level code that processes the collected data
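The user-level portion of such an application typically reduces to a loop that blocks until the driver delivers data and then processes it. The following is a minimal sketch only; the device name /dev/rtdev and the process_data() routine are hypothetical placeholders:

#include <fcntl.h>     /* open() */
#include <unistd.h>    /* read(), close() */
#include <stdio.h>     /* perror() */

/* Hypothetical postprocessing routine; a real application does its work here. */
static void
process_data(const char *buf, ssize_t len)
{
    (void)buf;
    (void)len;
}

int
main(void)
{
    char buf[512];
    ssize_t n;
    int fd;

    /* /dev/rtdev stands in for the interrupt-generating device. */
    fd = open("/dev/rtdev", O_RDONLY);
    if (fd == -1) {
        perror("open");
        return 1;
    }

    /* Each read() blocks until the ISR has collected data for this process. */
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        process_data(buf, n);

    close(fd);
    return 0;
}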
Realtime responsiveness is a characterization of how quickly an operating system and an application, working together, can respond to external events. One way of measuring responsiveness is through a system's latency. Latency is the time it takes for hardware and the operating system to respond to external events, expressed as a delay time. Understanding the causes of high latency and minimizing their effects is a key to successful realtime program design, and is the focus of this chapter.
Two types of latency are described in the following sections:
Interrupt service routine (ISR) latency
Process dispatch latency (PDL)
11.1.1 Interrupt Service Routine Latency
A system's interrupt service routine (ISR) latency is the elapsed time from when an interrupt occurs until execution of the first instruction in the interrupt service routine. The system must first recognize that an interrupt has occurred, and then dispatch to the ISR code. If critical postprocessing is done in the ISR, then the user must be concerned with completion time of the ISR code, not just the time it takes to begin execution of its first instruction. Thus there are two concerns: ISR latency and ISR execution. There are factors that cause ISR latency and ISR execution to vary in duration, and these factors make it more difficult to assign latency a deterministic value.
The most important factor is the relative interrupt priority level (IPL) at which the ISR executes. When other ISRs of equal or greater interrupt priority level are running at the time that the realtime device interrupts, the realtime device ISR is blocked from running until the current ISR is finished.
Potentially, multiple ISRs of equal or higher IPL could be waiting to execute at the time of the realtime interrupt, and all of them will hold off the realtime ISR until they complete. In addition, after the realtime ISR begins running, it can be preempted or held off by one or more devices of higher IPL, and the realtime ISR will be delayed by the collective duration of these ISRs. Thus, it is important to know the relative IPLs of all the devices that could potentially interrupt during critical realtime processing, including system-provided devices such as a network driver or disk driver.
11.1.2 Process Dispatch Latency
Process Dispatch Latency (PDL) is the time it takes from when an interrupt occurs until a process that was blocked waiting on the interrupt executes. Process dispatch latency includes:
ISR latency
ISR execution time
Time required to return from the ISR
Time required for the context switch back to the process-level code that is waiting on the interrupt
Many other factors can potentially increase the process dispatch latency of a realtime application. Any process that is currently executing code that holds a simple lock, that is funneled to the master processor, or that has its IPL raised cannot be preempted by the realtime process and thus holds off the realtime process from running. (Note that a user process cannot hold a simple lock, be funneled to the master processor, or have its IPL raised, except through a system call.) When the realtime process is able to run, it must still compete against other runnable processes, and the process with the highest priority runs.
Note that process priority can affect PDL but cannot affect ISR latency. In other words, no matter how high the priority of an application process, even if it is in the realtime priority range, all ISRs that need servicing at the time that the realtime device's ISR needs servicing will be serviced before process code can execute, regardless of the order or interrupt priority level at which the ISRs run.
11.2 Improving Realtime Responsiveness
This section contains guidelines for improving realtime responsiveness.
Minimize Paging by Locking Down Memory
Be sure that your system has sufficient memory, and always lock down memory in the user process to reduce paging. Paging occurs when the threads and processes running on the system do not collectively fit into physical memory and must be paged in and out as necessary. Application code and data that are locked in memory will not be paged. Paging affects process dispatch latency because it executes code in the kernel that is protected by simple locks and thus cannot be preempted. Note that certain system daemons are not locked in memory, so a secondary effect is the paging of those daemons.
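A minimal sketch of locking down memory, assuming the POSIX memory-locking interfaces discussed in Chapter 1 (the call typically requires appropriate privileges):

#include <sys/mman.h>  /* mlockall(), MCL_CURRENT, MCL_FUTURE */
#include <stdio.h>     /* perror() */

int
main(void)
{
    /*
     * Lock every page currently mapped into the process, and every page
     * mapped in the future (heap and stack growth), so that the
     * application never waits on paging during time-critical work.
     */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) == -1) {
        perror("mlockall");
        return 1;
    }

    /* ... realtime processing ... */
    return 0;
}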
Turn On Kernel Preemption
Turn on kernel preemption and set your application's scheduling policy to SCHED_FIFO. Both are described in Chapter 2.
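For the application side, a minimal sketch using the POSIX.1b scheduling interfaces looks like the following; the priority chosen is illustrative only, and setting a realtime policy typically requires appropriate privileges:

#include <sched.h>     /* sched_setscheduler(), SCHED_FIFO, struct sched_param */
#include <stdio.h>     /* perror() */

int
main(void)
{
    struct sched_param param;

    /*
     * Illustrative choice: pick a value within the range reported by
     * sched_get_priority_min(SCHED_FIFO) and sched_get_priority_max(SCHED_FIFO),
     * and keep it at or below the priority of any system threads the
     * application depends on.
     */
    param.sched_priority = sched_get_priority_min(SCHED_FIFO);

    /* A pid of 0 applies the policy and priority to the calling process. */
    if (sched_setscheduler(0, SCHED_FIFO, &param) == -1) {
        perror("sched_setscheduler");
        return 1;
    }

    /* ... realtime processing ... */
    return 0;
}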
Manage Priorities
Always consider the process priority level of your application in terms of relative importance in the overall system. You may need to use priorities in the realtime range. This affects process dispatch latency when there are other processes ready to run at the same time that the realtime application is ready to run. The process with the highest priority that has been waiting the longest among the waiting processes of that priority will run first.
Note, however, that always running in the realtime priority range is not necessarily what you should do. If you need to interact with system services that have threads or processes associated with them, such as the network, you need to run at a priority at or below the priority of those threads or processes, as well as at or below the priority of anything on which those threads or processes depend.
The kernel contains multiple threads. The purpose of these threads is to perform activities that have the potential of blocking, and thus serve as the delivery mechanism of information between ISRs and user processes. These kernel threads do not have much of the state information that processes have.
Kernel threads use the first-in/first-out scheduling policy, and are scheduled along with POSIX processes. The kernel sets priorities as Mach priorities, which are the inverse of POSIX priorities: 0 is the highest-priority Mach thread and 63 is the lowest. Under POSIX, 64 is the highest priority and 0 is the lowest.
You can use the ps command to display thread priorities. Because the ps program predates the use of threads, its ability to display information about threads clearly is limited. The following example uses the command ps axm -o L5FMT,psxpri to display the L5FMT format with the POSIX priority field appended:
% ps axm -o L5FMT,psxpri
       F S     UID  PID PPID  C PRI NI ADDR    SZ WCHAN    TTY      TIME CMD          PPR
       3 R <     0    0    0 0.0  32 -12    0 3.4M *        ??  05:02:40 kernel idle   31
         R N              0.0  63  19             -             0:00.00                0
         U <              0.0  38  -6             malloc_       0:00.51               25
         U <              0.0  32 -12             402cb0        0:49.47               31
         U <              0.0  32 -12             402eac        0:00.00               31
         S <              0.0  33 -11             netisr       05:01:23               30
         U <              0.0  32 -12             3e3f18        0:00.00               31
         U <              0.0  38  -6             4c3b80        0:00.00               25
         U                0.0  42   0             ubc_dir       0:00.52               21
         U <              0.0  37  -7             4c2678        0:00.01               26
         U <              0.0  37  -7             4c2680        0:03.77               26
         U <              0.0  38  -6             4c33b0        0:12.69               25
         U <              0.0  32 -12             4e36d8        0:00.01               31
         U <              0.0  37  -7             4e36d8        0:00.12               26
         U <              0.0  37  -7             4ba2d8        0:00.00               26
         U <              0.0  38  -6             4e3078        0:00.00               25
         U <              0.0  42  -2             24ce30        0:00.03               21
         I                0.0  42   0             nfsiod_       0:01.49               21
         I                0.0  42   0             nfsiod_       0:01.65               21
         I                0.0  42   0             nfsiod_       0:01.82               21
         I                0.0  42   0             nfsiod_       0:00.61               21
         I                0.0  42   0             nfsiod_       0:01.71               21
         I                0.0  44   0             nfsiod_       0:01.26               19
         I                0.0  42   0             nfsiod_       0:01.78               21
80048001 I       0    1    0 0.0  44   0    0  40K pause    ??   0:03.12 init          19
    8001 IW      0    3    1 0.0  44   0    0   0K sv_msg_  ??   0:00.12 kloadsrv      19
    8001 S       0   17    1 0.0  44   0    0  48K pause    ??  03:58:06 update        19
    8001 I       0   81    1 0.0  44   0    0 120K event    ??   0:02.64 syslogd       19
    8001 IW      0   83    1 0.0  42   0    0   0K event    ??   0:00.03 binlogd       21
    8001 S       0  135    1 0.0  44   0    0  80K event    ??   8:13.21 routed        19
    8001 S       0  226    1 0.0  44   0    0 104K event    ??   8:25.31 portmap       19
    8001 IW      0  234    1 0.0  44   0    0   0K event    ??   0:00.21 ypbind        19
.
.
.
You can use the dbx command from a root account to display more information about kernel threads, as follows:
# dbx -k /vmunix
(dbx) set $pid=0
(dbx) tlist                        [shows kernel threads]
(dbx) tset thread-name;t           [shows which routine a thread is running]
(dbx) p thread->sched_pri          [shows Mach priority for the current thread]
The following example shows use of the dbx command:
# dbx -k /vmunix
dbx version 3.11.8
Type 'help' for help.

stopped at  [thread_block:2020 ,0xfffffc00002a1da0]     Source not available
warning: Files compiled -g3: parameter values probably wrong
(dbx) set $pid=0
(dbx) tlist
thread 0xfffffc0003fd1be8 stopped at  [thread_run:2388 ,0xfffffc00002a2560]     Source not available
thread 0xfffffc0003fd6000 stopped at  [thread_block:2020 ,0xfffffc00002a1da0]   Source not available
thread 0xfffffc0003fd62c0 stopped at  [thread_block:2020 ,0xfffffc00002a1da0]   Source not available
thread 0xfffffc0003fd6580 stopped at  [thread_block:2020 ,0xfffffc00002a1da0]   Source not available
thread 0xfffffc0003fd6dc0 stopped at  [thread_block:2020 ,0xfffffc00002a1da0]   Source not available
thread 0xfffffc0003fd7080 stopped at  [thread_block:2020 ,0xfffffc00002a1da0]   Source not available
thread 0xfffffc0003fd7340 stopped at  [thread_block:2020 ,0xfffffc00002a1da0]   Source not available
thread 0xfffffc0003fd7600 stopped at  [thread_block:2020 ,0xfffffc00002a1da0]   Source not available
thread 0xfffffc0003fd78c0 stopped at  [thread_block:2020 ,0xfffffc00002a1da0]   Source not available
thread 0xfffffc0003fd7b80 stopped at  [thread_block:2020 ,0xfffffc00002a1da0]   Source not available
thread 0xfffffc0003f6a000 stopped at  [thread_block:2020 ,0xfffffc00002a1da0]   Source not available
thread 0xfffffc0003f6a2c0 stopped at  [thread_block:2020 ,0xfffffc00002a1da0]   Source not available
thread 0xfffffc0003f6a580 stopped at  [thread_block:2020 ,0xfffffc00002a1da0]   Source not available
thread 0xfffffc0003f6a840 stopped at  [thread_block:2020 ,0xfffffc00002a1da0]   Source not available
thread 0xfffffc0003f6ab00 stopped at  [thread_block:2020 ,0xfffffc00002a1da0]   Source not available
thread 0xfffffc0003f6adc0 stopped at  [thread_block:2020 ,0xfffffc00002a1da0]   Source not available
thread 0xfffffc0003fd1950 stopped at  [thread_block:2020 ,0xfffffc00002a1da0]   Source not available
thread 0xfffffc0003f6b080 stopped at  [thread_block:2020 ,0xfffffc00002a1da0]   Source not available
thread 0xfffffc0003f6b340 stopped at  [thread_block:2020 ,0xfffffc00002a1da0]   Source not available
thread 0xfffffc0003f6b600 stopped at  [thread_block:2020 ,0xfffffc00002a1da0]   Source not available
thread 0xfffffc0003f6b8c0 stopped at  [thread_block:2020 ,0xfffffc00002a1da0]   Source not available
thread 0xfffffc0003f6bb80 stopped at  [thread_block:2020 ,0xfffffc00002a1da0]   Source not available
thread 0xfffffc0000926000 stopped at  [thread_block:2020 ,0xfffffc00002a1da0]   Source not available
(dbx) tset 0xfffffc0003f6bb80;t
thread 0xfffffc0003f6bb80 stopped at  [thread_block:2020 ,0xfffffc00002a1da0]   Source not available
>  0 thread_block() ["/usr/sde/osf1/build/ptos.bl8/src/kernel/kern/sched_prim.c":2017, 0xfffffc00002a1d9c]
   1 async_io_thread(0x0, 0x0, 0x0, 0x0, 0x0) ["../../../../src/kernel/nfs/nfs_vnodeops.c":2828, 0xfffffc00002f4898]
(dbx) p thread->sched_pri
44
By default, the parameter ubc_maxpercent in the file /sys/conf/param.c is set to 100. That means that up to 100 percent of physical memory can be consumed by the Unified Buffer Cache (UBC) for buffering file data. Some systems perform better when the UBC is not allowed to take all of physical memory. For improved realtime responsiveness, change the value of ubc_maxpercent to between 50 and 80, depending on the amount of file system activity on the system. This can improve system realtime latency because, when the UBC has consumed its maximum allocation of memory for buffering file data, the least-recently used buffers must be flushed to disk if they are modified. Flushing these buffers is done with a simple lock held, and therefore can affect process dispatch latency. The more memory that the UBC is allowed to use before flushing, the longer the flushing takes. Lowering the value of the ubc_maxpercent parameter causes flushing to occur more frequently but take less time.
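The edit itself is small. The following is an illustrative sketch only; the exact declaration in /sys/conf/param.c may differ on your system, and a kernel rebuild is normally required for the change to take effect:

/* Illustrative only; verify the actual declaration in /sys/conf/param.c. */
int ubc_maxpercent = 70;    /* allow the UBC to use at most 70% of physical memory */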
Write Effective Device Drivers
When writing device drivers, follow these guidelines:
Avoid holding locks for long periods
Holding a lock prevents context switches from occurring.
Avoid funneling
Funneled device drivers take a lock upon entry.
Keep interrupt service routines brief
Consider use of a kernel thread to do ISR postprocessing.
While an ISR is executing, other interrupts of equal or lower IPL are delayed, and no process can run until all ISR activity is completed.
Consider use of the rt_post_callout function for ISR postprocessing that needs to execute before any process code but after any ISRs.
See the System Configuration Supplement: OEM Platforms or the Tru64 UNIX Device Driver Kit documentation (available separately from the base operating system) for more information about the rt_post_callout function.
Avoid Configuring Peripheral Devices in the System
Use with care any devices that could interfere with realtime responsiveness, such as:
The network driver
Do not configure the network driver into your system if it is not a necessary part of your realtime application. If it is necessary, then be sure that it is used only in postprocessing and not during critical phases of your application, when you are attempting to minimize latency.
The disk driver
Be sure that postprocessing data is written to permanent storage during noncritical sections of your application and that all data is properly flushed and synchronized to disk at appropriate times. See Chapter 8 for more information about synchronized I/O.
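For example, a minimal sketch of forcing collected data to permanent storage during a noncritical phase using standard calls; the helper name, file descriptor, and buffer are illustrative:

#include <unistd.h>    /* write(), fsync() */
#include <stdio.h>     /* perror() */

/*
 * Hypothetical helper, called only from a noncritical phase of the
 * application: write buffered postprocessing data and force it to disk
 * before the next time-critical phase begins.
 */
void
flush_postprocessing_data(int fd, const char *buf, size_t len)
{
    if (write(fd, buf, len) == -1)   /* a full treatment would also handle short writes */
        perror("write");

    if (fsync(fd) == -1)             /* synchronize the file data to disk */
        perror("fsync");
}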
In general, keep all peripheral devices that can cause spurious interrupts out of the configuration of the most critical systems. Other devices can cause interrupt latency as well as bus contention with the critical devices. If other devices are a necessary part of the system, analyze the interrupt rate and attempt to avoid interrupt overload on the system.
Consider Use of Symmetrical Multiprocessing
Consider a symmetrical multiprocessing (SMP) system as a possible means of improving realtime responsiveness. You can divide the application across multiple processors using the runon command.