From: CRDGW2::CRDGW2::MRGATE::"SMTP::CRVAX.SRI.COM::RELAY-INFO-VAX" 7-SEP-1989 17:28
To: MRGATE::"ARISIA::EVERHART"
Subj: Stopping RWAST processes summary

Message-Id: <8909072116.AA15611@crdgw1.ge.com>
Received: From MGHCCC.HARVARD.EDU by CRVAX.SRI.COM with TCP; Thu, 7 SEP 89 12:31:13 PDT
Date: 7 Sep 89 15:03:00 EST
From: "SMAUG::GOLD"
Subject: Stopping RWAST processes summary
To: "info-vax"

Hi netland!

I'm very new to InfoVAX, so I hope I don't piss off anyone by summarizing
some responses that I got for a question that I posted. I wasn't sure
whether I should include the respondents' names and mail addresses.

Here's the question that I posted:

> Does anyone know how I can stop a process that is in "RWAST" state
> without rebooting the system? The VMS command: STOP/ID=pid does not
> work on processes in this state. The particular processes I need stopped
> in the case are MASS11 Wordprocessing quick-print processes that are
> messing up other Users on the same account. Thanks!
>
> - Mark Gold
> INTERNET: gold%smaug.decnet@mghccc.harvard.edu

Here are the suggestions that I received:

1) *********************************************************

The process IS stopped. If you do ANALYZE/SYSTEM, SHO SUM and
SHO PROC/IND=xx, where xx is the index of the process you have stopped,
you'll see that the status is ...,DELPEN,RESPEN,... This means that the
process is just waiting for some system resources to be returned.

If you want to delete the process from the system table, well, that's
something I'd like to know as well. There was a method under VMS 4 that
does not work on VMS 5 any more. It consisted of setting the SSRWAIT bit
of the process status longword using DELTA.

2) *********************************************************

You don't, without rebooting. This has been gone into before. Only the
most propeller-headed of propeller heads can do it, and even then it is
fraught with dangers. Forget it.

3) **********************************************************

I once saw a procedure using SDA and XDELTA, wherein you located the PSL
of the stuck process and cleared the resource wait bit in it. The details
of that procedure have since been lost, but I imagine a bit of reading in
the SDA and XDELTA reference manuals would provide sufficient information
to allow an intrepid system manager to get started.

The RWAST (resource wait AST) state is usually a sign that something's
amiss in your system. Processes get stuck in this state when some resource
is unavailable when they request it and the release of the resource is
never signaled. If you are having repeated instances of this problem,
you're going to spend more time killing processes than is reasonable. An
investigation into the root cause seems advised.

4) ***********************************************************

This is the `fabled' Bruce Ellis method of wasting an RWASTed process.
There are several caveats. The procedure must be followed exactly. From
reports I've heard, this either solved the problem or crashed the system.

1) Stop/Id=pid the RWASTed process to set the delete pending bit in the
   process header.

2) Run SDA to obtain some process information.

     $ Analyze/System
     SDA> Set Process RWASTed_process_name
     SDA> Show Process
     SDA> Exit

   Write down the PCB address and the Internal PID.

   Example output:

     $ ana/sys
     VAX/VMS System analyzer

     SDA> set proc "Reality"
     SDA> sh proc
     Process index: 0040   Name: Reality   Extended PID: 20200A40
     ------------------------------------------------------------
     Process status:  02040001  RES,PHDRES

     PCB address              8037CC00   JIB address              804BDBB0
     PHD address              808F4C00   Swapfile disk address    00000000
     Master internal PID      00140040   Subprocess count                0
     Internal PID             00140040   Creator internal PID     00000000
     Extended PID             20200A40   Creator extended PID     00000000
     State                         LEF   Termination mailbox          0000
     Current priority                9   AST's enabled                KESU
     Base priority                   4   AST's active                 NONE
     UIC                [00002,000006]   AST's remaining                21
     Mutex count                     0   Buffered I/O count/limit    17/18
     Waiting EF cluster              0   Direct I/O count/limit      18/18
     Starting wait time       1B001B1B   BUFIO byte count/limit 20448/20800
     Event flag wait mask     DFFFFFFF   # open files allowed left      20
     Local EF cluster 0       C000002F   Timer entries allowed left     20
     Local EF cluster 1       80000000   Active page table count         0
     Global cluster 2 pointer 00000000   Process WS page count         134
     Global cluster 3 pointer 00000000   Global WS page count           60
     SDA> Exit

3) Rub rabbit's foot.

4) Run Delta and flip the ssrwait bit in the status field of the PCB.

   You may ask, "How the @#$% do I do that?" Delta is a fun program. In
   fact, it's so bizarre that I'll show an example of the output and
   annotate it. (This example involves poking the process shown in the
   above ANA/SYS example. This process is NOT in RWAST, but it shows how
   the procedure works.)

     $ r sys$library:delta
     DELTA Version 5.0
     1;m                                     (A)
     00000001
     00140040: 8037cc24/02040001 02040101    (B)
     exit                                    (C)
     $

   (A) Run Delta and do a sanity check. The EXACT keypresses here are:

         1;m

   (B) Here is where the action is. Type the internal PID followed by a
       space and the PCB address + 24 hex (the offset to PCB$L_STS). Type
       a /. Delta then displays the PSL. Calculate PSL.or.100 (ssrwait
       bit). Type the result followed by a return. The EXACT keypresses
       for this example are:

         00140040: 8037cc24/02040101

   (C) Scram! (If you get this far.)
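
   For anyone who would rather not do the hex arithmetic of step (B) by
   hand, here is a minimal sketch in Python. It only reproduces the two
   calculations described above (PCB address + 24 hex, and the displayed
   value .or. 100 hex) and never touches a running system. The offset and
   bit value come straight from this post and have not been checked
   against any other VMS version; the function names are made up for
   illustration.

```python
# Values below are taken from the post itself; treat them as assumptions
# for any VMS version other than the one the post describes.
PCB_STS_OFFSET = 0x24   # "PCB + 24": offset to PCB$L_STS, per the post
SSRWAIT_BIT = 0x100     # "PSL.or.100": the ssrwait bit, per the post


def sts_address(pcb_address: int) -> int:
    """Address to open in Delta: the PCB address plus the status offset."""
    return pcb_address + PCB_STS_OFFSET


def with_ssrwait(status: int) -> int:
    """Displayed status value with the ssrwait bit set, other bits untouched."""
    return status | SSRWAIT_BIT


def delta_hex(value: int) -> str:
    """Format a longword as eight lowercase hex digits, as in the session above."""
    return f"{value:08x}"


if __name__ == "__main__":
    pcb = 0x8037CC00        # "PCB address" from the SDA output above
    status = 0x02040001     # value Delta displayed after the "/"
    print(delta_hex(sts_address(pcb)))      # address to type before "/"
    print(delta_hex(with_ssrwait(status)))  # value to type before return
```

   With the numbers from the example session this prints 8037cc24 and
   02040101, matching what was typed at (B).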

Bob and Doug McKenzie fans will love Delta's error message: `EH?'

Use at your own risk!

5) **********************************************************

You have to figure out what resource the job needs and free it up before
you can stop the job. This may not be easy. Do you have a DEC software
maintenance contract? If so, there are some articles on DSIN that describe
possible ways to find out what might be the problem.