From: SMTP%"RELAY-INFO-VAX@CRVAX.SRI.COM" 24-SEP-1993 11:28:48.88
To: EVERHART
CC:
Subj: Re: Deleting users stuck in RWAST

From: jeh@cmkrnl.com
X-Newsgroups: comp.os.vms
Subject: Re: Deleting users stuck in RWAST
Message-Id: <1993Sep21.122935.2755@cmkrnl.com>
Date: 21 Sep 93 12:29:35 PDT
Distribution: world
Organization: Kernel Mode Systems, San Diego, CA
Lines: 162
To: Info-VAX@kl.sri.com
X-Gateway-Source-Info: USENET

In article <1993Sep20.231543.1@alien.gici.com>, laut@alien.gici.com writes:
> In article <1993Sep20.095900.2748@cmkrnl.com>, jeh@cmkrnl.com writes:
>> In article <1993Sep10.225830.1@alien.gici.com>, laut@alien.gici.com writes:
>>> The problem arises when, for some reason, the thread's transition from IPL#4
>>> to IPL#2 doesn't correctly finish, and so CCB$W_IOC doesn't get decremented.
>>> The result is a hung process.
>>
>> The usual reason for this is that the outstanding I/O request on the
>> "problem" channel couldn't be cancelled or completed.
>>
> [...]
> since VMS is _trusting_ the driver to "clean up after
> itself," a driver that -CAN'T- cancel and/or complete I/O has no business
> being loaded into your system, because it has a design fault in it.

Unfortunately, some such drivers are shipped with VMS! It's not as though we
have a choice about them.

The hardware can be at fault too. This used to be a very famous "feature" of
certain Massbus tape drives: (Am I dating myself, or what?) It was possible
to lose an interrupt that would otherwise signal the end of a long
non-data-transfer operation, such as a rewind, if someone took the drive
off-line during the rewind. For a while the VMS folks were telling the TC03
folks "fix the controller to give us the interrupt anyway" and the TC03 folks
were saying "we can't anytime soon, fix the driver to allow cancels"...

There are also devices that will lose an interrupt after being programmed for
a DMA transfer.
The standard rule (written long ago, when Massbus and Unibus were all that
existed, and the smartest terminal mux supported on VMS was a DZ11) is that
"DMA transfers, once started, cannot be cancelled, you must wait for the
interrupt". The only safe way that a driver can violate this rule is to tell
the controller "don't do that one" and get a confirmation from the controller
that it hears and will obey. In many older (dumb) controllers, and some new
controllers too, the only way to do this is via a hard reset on the controller
to make it forget *everything* that's happened since system boot. Lots of
driver writers are afraid to do this. If you have to wait for a response from
the device after the reset, the programming can get tricky, esp. if the
driver's cancel I/O routine was written as an afterthought instead of being
designed in from the start. (With a device of almost any realistic level of
complexity, the driver model offered in DEC's template driver just isn't
adequate.) Many driver writers, including some who apparently worked for DEC
at one time, just punted the whole issue and assumed that every DMA transfer
would complete.

There are lots of other examples. My point here is that while it's fine to
say "you shouldn't have any buggy inner-mode code in your machine", this is
difficult to achieve in practice, except by not booting!

>>> The problem arises when, for some reason, the thread's transition from IPL#4
>>> to IPL#2 doesn't correctly finish, and so CCB$W_IOC doesn't get decremented.
>>> The result is a hung process.

For the record, I've yet to see a case where the IPL 4 -> IPL 2 transition
just "doesn't correctly finish". Invariably something else is going on --
namely, the target process is waiting at IPL 2, preventing the delivery of
ASTs. The process rundown code is supposed to be set up so that anything that
might need a wait at IPL 2 will be over and done with before it's time to
$CANCEL all outstanding I/Os. Obviously this doesn't always happen.
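
The situation described above can be sketched as a toy model. This is only an
illustration in Python, not VMS code: the class and method names are
hypothetical, one completion AST per I/O request is assumed, and the `ioc`
field is merely modeled on CCB$W_IOC. It shows a completion AST that has been
queued to the process but cannot be delivered while the process sits at IPL 2,
so the channel's outstanding I/O count never reaches zero and rundown cannot
finish.

```python
from collections import deque
from dataclasses import dataclass, field

AST_DELIVERY_IPL = 2  # ASTs are blocked while the process is at or above this IPL


@dataclass
class Channel:
    ioc: int = 0  # outstanding I/O count (hypothetical stand-in for CCB$W_IOC)


@dataclass
class Process:
    ipl: int = 0  # IPL the process is currently running or waiting at
    channel: Channel = field(default_factory=Channel)
    ast_queue: deque = field(default_factory=deque)  # queued completion ASTs

    def queue_io(self) -> None:
        """Issue an I/O request: one more outstanding I/O on the channel."""
        self.channel.ioc += 1

    def post_io(self) -> None:
        """System-context side: queue a completion AST to this process."""
        self.ast_queue.append("io_done")

    def deliver_asts(self) -> int:
        """Process-context side: run queued ASTs unless blocked by IPL.

        Returns the number of ASTs actually delivered.
        """
        if self.ipl >= AST_DELIVERY_IPL:
            return 0  # blocked: nothing is delivered, ioc stays unchanged
        delivered = 0
        while self.ast_queue:
            self.ast_queue.popleft()
            self.channel.ioc -= 1  # completion finally accounted for
            delivered += 1
        return delivered

    def can_finish_rundown(self) -> bool:
        """Rundown can complete only when no I/O is outstanding."""
        return self.channel.ioc == 0


if __name__ == "__main__":
    p = Process()
    p.queue_io()
    p.post_io()
    p.ipl = 2                        # process is waiting at IPL 2
    p.deliver_asts()
    print(p.can_finish_rundown())    # False: the AST is queued but undelivered
    p.ipl = 0                        # the wait ends
    p.deliver_asts()
    print(p.can_finish_rundown())    # True
```

In this sketch nothing goes wrong on the system-context side at all; the hang
comes entirely from the process-side condition that blocks AST delivery.
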
>> (Note that "threads" -- not an official VMS term, in this context anyway --
>> don't really transition from IPL 4 to IPL 2. The IPL 4 interrupt service
>> routine runs in system context. It arranges for I/O completion ASTs to be
>> delivered to the process that requested the I/O. These ASTs run in the context
>> of the target process with a completely different register set, P0 address
>> space mapping, etc. and aren't, properly speaking, a continuation of the
>> "thread" that was running at IPL 4.)
>
> The I/O-Post Processor starts at IPL#4, in system context, to
> complete its system-specific stuff. Then, it converts the IRP into an ACB
> and queues it to the target process, in order to get the process context
> mapped, so that it can finish up with things like propagating the IOSB; and
> in cases of buffered I/O, copying the system buffer into the user's buffer.
>
> The procedure is identical in concept to "forking" within system context,
> whereby the THREAD is started as a device interrupt, and then forks down to
> its Fork IPL so as not to block any other incoming interrupts.
>
> Really now, Jamie. If I were Ehud, I would likely be accusing you of having
> answered my post as you did for your own personal gain or ego gratification. :)

No, I answered your post as I did because I believe (based on some years
spent full-time and the subsequent years spent part-time teaching and writing
about VMS internals and device drivers), that the analogy to forking will lead
some readers to form incorrect mental models of this part of VMS. I feel very
strongly about getting the terminology and the models right and carefully
specifying what they do and do not apply to. Incorrect models and terminology
invariably lead to incorrect conclusions about other aspects of VMS's
operation.

The way I see it, the IPL 4 -> IPL 2 transition is *not* identical to
forking, not just because of the different context (system vs.
process), but because the IPL 4 code is not over and done with once an IRP has
been queued to the requesting process as an ACB. Rather, the IPL 4 code sits
in a loop looking for more IRPs on its input queue. The operation of the
IPL 4 code *is* very similar to what happens in the *fork dispatching
routines* (the IPL 6, 8, 9, 10, and 11 ISRs)... if you want to draw an
analogy to forking, that's the one to point to. (I've yet to hear anybody
claim that a fork routine was a continuation of the IPL n fork dispatching
thread -- it isn't; it's a continuation of the thread that called EXE$FORK.)

I think I see where you're getting your view of things, though. The IPL 2
I/O posting special kernel AST routine is a continuation of the IPL 4 thread,
in that the IPL 4 thread specifies the code to be executed at IPL 2. This
isn't the case in the fork dispatchers. Still, there are enough differences
btw this and forking that I think this is a very misleading analogy to draw.
Unless of course you take the time to explain the details of where the analogy
is valid and where it isn't.

(Speaking of fork processes -- damn, I wish they *had* called them "fork
threads"! The use of the word "process" in this term leads to far more
confusion than we're concerned with here. The only way to deal with this,
I've found, is to hit it head-on the first time you introduce the term, and
say that while "fork process" is the official term, these things have nothing
to do with "processes" as we usually think of them. The "comparison charts"
in the DEC Ed. Svcs. materials, contrasting "processes" and "fork processes",
do more harm than good...)

>> You forgot to mention that if you do this, and THEN the outstanding I/O request
>> decides to complete, VMS will most likely crash. This is the reason for the
>> "don't delete processes until their I/Os are finished" rule.
>>
>> --- Jamie Hanrahan, Kernel Mode Systems, San Diego CA
>
> I did. _You_ deleted it from my post.

I didn't see it.
Perhaps I was responding to a previous followup that had quoted your post.

> Also, you _forgot_ what the original
> poster said, namely that this occurs under an ORACLE application coming in
> on an RTAn: device, and that the only way he has of fixing the problem is to
> reboot the machine. Either way, his problem gets fixed.

*grin* yep. This reminds me of something that happened a couple of weeks
ago, in a completely different "domain". A client called me in a panic:
"I just dumped most of a cup of coffee into my keyboard!" (And, I know that
this particular client uses artificial sweetener and creamer.) I couldn't
deal with it right then, so I said, "Take the kb to the lunchroom and flood it
with warm running water, for at least ten minutes." "Won't that hurt it?"
"Most likely not, but it's certainly a lot better than what's in there now; if
we do nothing the kb will be a total loss, so let's do SOMEthing..."

> However, for the record, just to ensure there is no confusion out there in
> net.land, let me re-iterate again the importance of everyone realizing that
> what I suggested above is truly a LAST-DITCH method of fixing the problem,
> because it is pulling an unexpected (ahem) "thread" from the fabric of things.
> (%^/)
>
> --
> Bill Laut                          Internet: laut@alien.gici.com
> Gull Island Consultants, Inc.      Phone: (616) 780-3321
> Muskegon, MI 49440                 >> "Usual disclaimers, apply within" <<

--- Jamie Hanrahan, Kernel Mode Systems, San Diego CA
drivers, internals, networks, applications, and training for VMS and Windows NT
uucp 'g' protocol weenie and release coordinator, VMSnet (DECUS uucp) W.G., and
Chair, Programming and Internals Working Group, U.S. DECUS VMS Systems SIG
Internet: jeh@cmkrnl.com (JH645)  Uucp: uunet!cmkrnl!jeh  CIS: 74140,2055