From: Jamie Hanrahan [jeh@cmkrnl.com]
Sent: Friday, August 13, 1999 10:59 PM
To: Jan Bottorff; ntdev
Subject: RE: [ntdev] Timekeeping in NT (was: DDK documentation 'hole')

> From: owner-ntdev@atria.com [mailto:owner-ntdev@atria.com] On Behalf Of
> Jan Bottorff
> Sent: Wednesday, August 11, 1999 18:05
>
> My view is KeQuerySystemTime() is an abstract TOD function. Does anybody
> else see any alternative kernel TOD function? I should not have to know
> anything about its underlying architecture to get an accurate TOD value.
> My expectation of a modern TOD function is that its absolute error relative
> to UTC may be not so good, but its short-term relative measurement should
> be maybe a few microseconds. In 1982, old 8088 DOS boxes kept short-term
> relative time accurate to about 55 milliseconds (18.2 ticks/sec). In about
> 1986, the 286-based IBM AT improved relative resolution to about 1
> microsecond (you could read back the timer chip's current count). Are we
> saying the relative timekeeping ability of current computers running NT
> has only improved 5x in the last 17 years?

There are basically two ways that NT could get better resolution out of KeQuerySystemTime.

One: Run the interval timer faster and add smaller numbers to the system time on every tick, instead of the usual 100,000 (100-nanosecond units, i.e. 10 msec). Hey, wait, you can have that today if you want -- at least a factor-of-10 improvement. Just use the "multimedia" timers from user mode (timeBeginPeriod, timeSetEvent, and all that), requesting 1 msec resolution; a short user-mode sketch appears below. As of NT 5 you'll be able to make the same tweak from kernel mode by calling ExSetTimerResolution. Either way, ALL timed events in the system (except quantum expiration) get evaluated at the finer resolution from then on (until the next boot).

But the cost is 1000 interrupts per second from the clock, instead of just 100. Since the clock is serviced at a higher IRQL than any device, this DOES have a measurable impact on interrupt latency for other devices, even on modern CPUs. You see, interrupt latency and the cache effects of servicing interrupts just don't scale linearly with processor speed. Modern processors are several hundred times faster than the first PCs; that doesn't mean they can handle even ten times the interrupt rates of the first PCs with impunity. You seem awfully willing to give away my processor cycles for your convenience. Experience has shown that this isn't a good tradeoff.

Two: Provide something like the Pentium cycle counter -- a fine-grained counter that tells you where you are "within" an OS tick -- but make it readable without raising MP-synchronization or other performance issues. This would not let you implement timed requests with any better granularity (the system is still only going to ask "is the earliest event in the timer list due?" on every "tick" interrupt), but it would let GetSystemTime and so on return a higher-resolution time. Great plan -- but current PC hardware doesn't have such a beast.

In any case, in a multitasking, interrupt-driven OS it does little good for a simple returned value of "time" to be expressed with a much finer resolution than the thread-scheduling timeslice. Say the system fetches a value of "time" that's measured down to the microsecond. So what? By the time you get to look at it, many microseconds... or even tens of milliseconds... might have elapsed since the time value was fetched. So the value seen by the requester won't be accurate, even if it was accurate when it was read.
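Here's that user-mode sketch of the "Option One" tweak, using the documented multimedia-timer calls. This is a hypothetical test program, not anything from the DDK samples; error handling is mostly omitted, and the 1 msec figure assumes the hardware supports it:

    #include <windows.h>
    #include <mmsystem.h>   /* multimedia timer API; link with winmm.lib */
    #include <stdio.h>

    int main(void)
    {
        TIMECAPS tc;
        UINT period;

        /* Ask what the multimedia timer can do on this machine. */
        if (timeGetDevCaps(&tc, sizeof(tc)) != TIMERR_NOERROR)
            return 1;

        /* Request the finest supported period, typically 1 msec.  From here
         * until the matching timeEndPeriod (or reboot), the system clock
         * interrupt runs at the finer rate and timed events are evaluated
         * at that resolution. */
        period = (tc.wPeriodMin > 1) ? tc.wPeriodMin : 1;
        if (timeBeginPeriod(period) != TIMERR_NOERROR)
            return 1;

        printf("Timer resolution raised to %u msec\n", period);

        /* Sleep(1) and other timed waits now round to ~1 msec instead of
         * the default 10 msec tick. */
        Sleep(1);

        /* Put it back -- the extra clock interrupts cost everyone. */
        timeEndPeriod(period);
        return 0;
    }

The kernel-mode route (ExSetTimerResolution, on NT 5) gets you the same thing without the winmm wrapper, and the same systemwide cost.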
> If I ask my Linux box what it thinks about the TOD (by
> running ntptime), it suggests it knows the relative time to about microsecond
> resolution, and is synced to absolute time (via NTP) with an estimated error of 132192
> microseconds and a maximum error of 350904 microseconds.

"Suggests" being the operative word here.

First, the ludicrous "error" figures. An estimated error of 132192 microseconds. Gee, is the program sure it isn't really 132193? If those figures were honest they'd be rounded off to two significant digits. At most. They're just the result of averaging a series of numbers, none of which had anywhere near six significant digits to begin with. An "estimated error" figure quoted to six apparently-significant digits, with a "maximum error" of over twice the "estimated" value, is self-evidently ludicrous; the digits past the first two or so tell us nothing. (Well, let me take that back. They DO tell us that the programmer knew nothing about "accuracy", "resolution", or significant digits.)

If you like, you can have NT sync its clock to an NTP server as often as you care to (the tools are in the resource kit). You can then run a similar utility on NT and get the same result (bogus "estimated error" and all).

As for "knows the relative time to about a microsecond" -- what does that mean? Does it mean it can measure elapsed times to the microsecond? Well, maybe, but so can NT (KeQueryPerformanceCounter). But neither result has anything to do with the granularity at which the operating systems are counting ticks, nor with the value you'll get from KeQuerySystemTime, or from time() on Linux.

Try doing a "sleep" on your Linux system for 3 microseconds and see what happens. Better yet, call time() in a tight loop for a second or so, recording the results *in memory* (don't write them out anywhere until the end of the run), and see the "grain" with which the time advances; a sketch of that little experiment appears below. Microseconds? I don't think so.

> It sounds like NT can't keep track of absolute or relative time any better than 10
> milliseconds, even if my hardware can.

As I said, the standard NT timekeeping -- without resort to the Pentium cycle counter, etc. -- can actually get down to millisecond resolution if a multimedia timer request has been made that requires it. As of NT 5, ExSetTimerResolution is exposed to kernel mode, so this can be done from k-mode as well. But the cost is, of course, 10x the number of timer interrupts per second. There is a very good reason that NT doesn't run this way by default: tests have shown that I/O performance suffers; interrupt latency in particular.

> NT also seems unable to tell me anything about the resolution and absolute
> accuracy of its TOD function. For the important function of time keeping,
> Linux seems to be extremely more capable than NT.

Oh? I quote from the "Linux FUD FAQ" (http://www1.linkonline.net/rodpad/linux02.html):

> The basic unit of time in Linux (and most Unix-like systems) is time_t. This format
> expresses the time as the number of *seconds* [my emphasis - jeh] since midnight, 1 Jan,
> 1970.

In other words, if we ask the Linux kernel for the time, we're going to get a time_t, which advances every second.

I think you mean "Linux with an external connection to an NTP server" will let you find out what time it is with greater accuracy than will an NT system without such a connection. Well, no kidding. NT with such a connection will give you the same capability.
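Here's that tight-loop experiment as a minimal sketch -- a hypothetical test program, nothing shipped with either OS, and the sample count is arbitrary:

    #include <stdio.h>
    #include <time.h>

    #define NSAMPLES 1000000   /* arbitrary; roughly a second or two of calls */

    static time_t samples[NSAMPLES];

    int main(void)
    {
        int i, advances = 0;

        /* Record in memory only -- no output inside the loop. */
        for (i = 0; i < NSAMPLES; i++)
            samples[i] = time(NULL);

        /* Now see the grain with which the returned time advanced. */
        for (i = 1; i < NSAMPLES; i++)
            if (samples[i] != samples[i - 1])
                advances++;

        printf("%d calls to time(), value advanced %d time(s)\n",
               NSAMPLES, advances);
        return 0;
    }

With time() you'll see the value advance about once a second, exactly as the FAQ excerpt above says; substitute gettimeofday() if you want to see what the finer-grained call reports on your particular kernel. And whether an NTP daemon is running makes no difference to the grain you observe.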
But in neither case is the progression of the system time counter, nor the expiration of timed events, handled with any greater resolution than without such a connection.

> A few moments of web surfing finds RFC 1589
> (http://andrew2.andrew.cmu.edu/rfc/rfc1589.html), which
> describes just wonderful detail about accurate time keeping on computers.

Wherein it states quite clearly that Unix kernels run on interval timers with pretty much the same resolution as NT (remember, NT will get down to 1 msec if you ask):

> In order to understand how the new software works, it is useful to review how most Unix
> kernels maintain the system time. In the Unix design a hardware counter interrupts the
> kernel at a fixed rate: 100 Hz in the SunOS kernel, 256 Hz in the Ultrix kernel and 1024 Hz
> in the OSF/1 kernel. Since the Ultrix timer interval (reciprocal of the rate) does not
> evenly divide one second in microseconds, the Ultrix kernel adds 64 microseconds once each
> second, so the timescale consists of 255 advances of 3906 us plus one of 3970 us.
> Similarly, the OSF/1 kernel adds 576 us once each second, so its timescale consists of 1023
> advances of 976 us plus one of 1552 us.

And please note: this article does not propose running the interval timers at any higher rates (and so does not propose that time in these systems progresses at any finer resolution). Rather, it proposes a mechanism whereby the amount by which system time is advanced after each clock interrupt -- or the interrupt rate itself -- can be tweaked so as to achieve long-term accuracy in TOD despite inaccuracies and instabilities in the rate at which the clock actually interrupts. These techniques are nothing new; VMS has been using them for at least ten years. More to the point, they have nothing to do with the resolution by which the OS reckons time -- only with its long-term accuracy. Resolution and accuracy are two parameters that have very little to do with each other.

---
Jamie Hanrahan, Kernel Mode Systems ( http://www.cmkrnl.com/ )

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
[ To unsubscribe, send email to ntdev-request@atria.com with
  body UNSUBSCRIBE (the subject is ignored). ]