From:	CRDGW2::CRDGW2::MRGATE::"SMTP::CRVAX.SRI.COM::RELAY-INFO-VAX" 23-JUN-1989 22:30
To:	MRGATE::"ARISIA::EVERHART"
Subj:	Re: Obscure VS2000/VWS bug

Received: From KL.SRI.COM by CRVAX.SRI.COM with TCP; Fri, 23 JUN 89 16:15:51 PDT
Received: from ucbvax.Berkeley.EDU by KL.SRI.COM with TCP; Fri, 23 Jun 89 16:09:33 PDT
Received: by ucbvax.Berkeley.EDU (5.61/1.37)
	id AA29738; Fri, 23 Jun 89 15:57:43 -0700
Received: from USENET by ucbvax.Berkeley.EDU with netnews
	for info-vax@kl.sri.com (info-vax@kl.sri.com)
	(contact usenet@ucbvax.Berkeley.EDU if you have questions)
Date: 23 Jun 89 16:24:38 GMT
From: mcvax!cernvax!paul@uunet.uu.net  (paul burkimsher)
Organization: CERN European Laboratory for Particle Physics, CH-1211 Geneva, Switzerland
Subject: Re: Obscure VS2000/VWS bug
Message-Id: <1027@cernvax.UUCP>
References: <22400D56_00100388.009268E71B2501E0$17_1@UK.AC.RHBNC.VAXB>
Sender: info-vax-request@kl.sri.com
To: info-vax@kl.sri.com

Obscure VS2000/VWS bug

I'm sorry to say that upgrading won't help you.
We run VMS V5.0-2 and VWS V4.0 and still get all of the problems you both
describe. In addition, VWS$EMULATORS is prone to eating up 50% (fifty percent)
of your cpu while doing no useful work. The only way out of that one is to
reboot too.

I picked up the following:

----------------------------------------------------------------------------

Note 9.3                        Bizarre behavior                          3 of 7
FNALF::NAGY "Frank J. Nagy, VAX Guru & Wizard"       13 lines  20-MAY-1988 06:18
                          -< Try increasing PAGEDYN >-                          
                                                                                
We saw this sort of problem when we got the first workstations and
were running VWS V3.0.  The problem in that case was that we had not            
created enough paged dynamic memory.  We increased PAGEDYN and things           
got a lot better.                                                               
                                                                                
At the Fall '87 DECUS Symposia, the "VWS Hints and Kinks" session               
provided the information that for use with VWS, PAGEDYN should run              
in the range of 1.5-2.5 MBytes (with the lower range used for color             
and grey scale systems for some reason).                                        
                                                                                
My monochrome VS-II has 13 MB of main memory.  SHOW MEMORY shows                
Paged Dynamic Memory of 1901056 with 1542032 bytes in use.  I frequently        
run with 4-5 WTAn windows with no problems.                                     
 End of note                                                                    
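
As a quick check on the figures in note 9.3, the following sketch works out
how much of the paged pool is free and whether the total falls inside the
1.5-2.5 MByte range quoted from the DECUS session. The function name and the
reading of "MByte" as 2**20 bytes are assumptions made for illustration only.

```python
MBYTE = 2**20  # assumption: "MByte" is taken to mean 2**20 bytes


def paged_pool_summary(total_bytes, in_use_bytes):
    """Return (free bytes, percent in use, total size in MBytes)."""
    free = total_bytes - in_use_bytes
    percent_used = 100.0 * in_use_bytes / total_bytes
    return free, percent_used, total_bytes / MBYTE


# Figures from the SHOW MEMORY output quoted in note 9.3.
free, pct, total_mb = paged_pool_summary(1901056, 1542032)
in_range = 1.5 <= total_mb <= 2.5
print(f"free={free} bytes, used={pct:.1f}%, total={total_mb:.2f} MB, "
      f"within 1.5-2.5 MB: {in_range}")
```

On those numbers roughly a fifth of the pool is still free, which fits the
report of running 4-5 windows without trouble.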
                                                                                
Note 9.5                        Bizarre behavior                          5 of 7
VXCRNB::TIMBL "Tim Berners-Lee CERN/DD"              11 lines  25-MAY-1988 03:52
                           -< Paged dynamic memory >-
                                                                                
    In response to 9.3, Paged Dynamic memory affecting vaxstation hangs.        
                                                                                
    You mention your PAGEDYN usage as 1.5MB out of 1.9MB total. With 3          
    windows, I have 3.10 MB out of 3.15MB total. The 48k left decreases by      
    about 11k per normal 24x80 VT200 window. When it is exhausted, an
    attempt to create a new window will make it be drawn and then disappear     
    again.                                                                      
                                                                                
    As the wide windows hang only when sufficient other windows exist,          
    it certainly looks like a resources problem, but not necessarily            
    PAGEDYN (as when it has hung there is still PAGEDYN left).                  
 End of note                                                                    
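
The arithmetic in note 9.5 (48k free, about 11k consumed per window) can be
sketched as below. The function name is hypothetical, and the per-window cost
is only the approximate figure given in the note, so the result is an
estimate rather than a hard limit.

```python
def windows_until_exhausted(free_k, per_window_k):
    """Return (whole windows that still fit, pool left over), same units."""
    return divmod(free_k, per_window_k)


# Figures from note 9.5: 48k of PAGEDYN left, about 11k per 24x80 window.
more_windows, leftover_k = windows_until_exhausted(48, 11)
print(f"{more_windows} more windows, {leftover_k}k left over")
```

The next window request after that would be the one that gets drawn and then
disappears.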

Note 9.6                        Bizarre behavior                          6 of 7
20514::BUCLIN "Service Informatique, DI-EPFL"        12 lines  31-MAY-1988 15:41
                 -< Have you checked the cluster parameters? >-
                                                                                
< Note 9.5 by VXCRNB::TIMBL "Tim Berners-Lee CERN/DD" >                         
>    As the wide windows hang only when sufficient other windows exist,         
>    it certainly looks like a resources problem, but not necessarily           
>    PAGEDYN (as when it has hung there is still PAGEDYN left).                 
                                                                                
                                                                                
  You have mentioned that your stations were diskless. That means that
  your system needs a lot of IRPs and LRPs for the LAVC traffic. If you use the
  Workstation Option Menu to create your new window, a new process is
  created too. This requires a lot of traffic to set up the swap and page area
  for this process. If your lookaside lists are big enough, another reason
  may be found in the cluster parameters (lock manager database, etc.).

Note 9.7                        Bizarre behavior                         7 of 7 
FNALC::OLEYNIK                                        3 lines  18-JUL-1988 14:47
                   -< Banner program can hang workstations >-                   
                                                                                
    A number of workstations running the "Banner" program have hung up.
    To free the station, you must stop "Banner" from another terminal.
    Better yet, don't run Banner.
 End of note                                                                    

Note 9.10                       Bizarre behavior                        10 of 14
FNAL::CHADWICK "Keith Chadwick, (312) 840-2498"      24 lines  18-JUL-1988 21:27
                      -< Parameters - More Knobs to Turn >-                     
                                                                                
        Recently Frank Nagy helped me track down an intermittent problem
        which would occasionally "hang" the VAXstation display when the
        UIS$EMULATORS and UIS$DISPLAY_MANAGER processes went into RWAST state.
        The solution was to modify the SYSUAF parameters:
                                                                                
                Parameter    Old Value       New Value                          
                                                                                
                BIOLM           18              128                             
                DIOLM           18              128                             
                ASTLM           24              384                             
                TQELM           10               32                             
                ENQLM           50              400
                                                                                
        While the VWS installation guide mentions that these parameters
        should be increased, it would appear that KITINSTAL.COM
        does not modify the SYSUAF, so the system manager should
        review these parameters and increase them as appropriate.
                                                                                
        I should also mention, for completeness, that another way for the       
        VAXstation to hang is to run out of pool.  After the VAXstation         
        has been up for a day or so, do a SHOW MEMORY and increase the          
        SYSGEN parameters IRPCOUNT, SRPCOUNT, and LRPCOUNT if appropriate.      
                                                                                
                                -Keith,                                         
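
The SYSUAF changes in note 9.10 can be sketched as a small check that compares
an account's current quotas with the new values from the table and builds the
matching AUTHORIZE command. This is an illustration only: the account name
SOMEUSER is hypothetical (the note does not say which account was modified),
the helper names are made up, and the last row of the table is read here as
ENQLM, the enqueue quota, which is an assumption.

```python
# New values from the table in note 9.10; ENQLM is assumed for the last row.
RECOMMENDED = {"BIOLM": 128, "DIOLM": 128, "ASTLM": 384, "TQELM": 32, "ENQLM": 400}


def quotas_to_raise(current):
    """Return {quota: new value} for every quota below the table's new value."""
    return {name: want for name, want in RECOMMENDED.items()
            if current.get(name, 0) < want}


def authorize_command(username, changes):
    """Build an AUTHORIZE MODIFY command line for the given quota changes."""
    qualifiers = "".join(f"/{name}={value}" for name, value in changes.items())
    return f"MODIFY {username}{qualifiers}"


# Old values from the same table; SOMEUSER is a placeholder account name.
old = {"BIOLM": 18, "DIOLM": 18, "ASTLM": 24, "TQELM": 10, "ENQLM": 50}
command = authorize_command("SOMEUSER", quotas_to_raise(old))
print(command)
```

The printed line is what one would type at the UAF> prompt; quotas that are
already at or above the table's values are left out of the command.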


Note 9.11                       Bizarre behavior                        11 of 14
UIHEPD::DDL "Dave Lesny"                             17 lines  19-JUL-1988 12:25
                      -< Process priority is the problem >-                     
                                                                                
We also have had problems with BANNER hanging our VAXstations. The problem      
was very sporadic. Eventually I tracked it down to process priority. By default 
BANNER runs at priority 1. If you are running a compute bound process at        
a priority higher than 1, BANNER does not get any cpu time. If, however,
BANNER does get some time because the cpu bound process becomes unrunnable
(does some I/O), and BANNER starts updating the display just as the cpu bound
process becomes runnable again, the entire VAXstation appears to be hung. I
assume this has to do with ASTs not being delivered to BANNER, since it cannot
be scheduled to run, so the I/O to the display never completes.
                                                                                
Since we like to use BANNER (users like the cpu histogram and clock), I         
changed some defaults in BANNER.COM. I changed the priority to 4 (our normal    
interactive priority - batch runs at 2), turned off the cube and changed        
the update cycle to 5 seconds. Since doing this we have not had any problems    
with either VWS 3.2 or VWS 3.3.                                                 
                                                                                
dave lesny                                                                      

----------------------------------------------------------------------------

With all this received wisdom and still getting problems, we decided to
report it to DEC and sent them crashdumps.

They said:

Our SYSGEN parameter CTLPAGES was too small (was 500, now 1500).
Our PQL_DPGFLQUOTA was too small (was 2k, now 9k).
$ Define/sys/exec uis$p1_pool_size N
where N = old N + 512*(your_increase_in_ctlpages)
(was 150000, now 662000).

Always re-install VWS after VMS upgrades (even minor ones).
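
For anyone applying DEC's rule for uis$p1_pool_size to their own numbers, the
calculation can be sketched as follows. The function name is made up for
illustration; the inputs are the values from our system given above.

```python
def new_p1_pool_size(old_size, old_ctlpages, new_ctlpages):
    """Apply DEC's rule: add 512 for every page by which CTLPAGES grew."""
    return old_size + 512 * (new_ctlpages - old_ctlpages)


# Our values: pool size 150000, CTLPAGES raised from 500 to 1500.
size = new_p1_pool_size(150000, 500, 1500)
print(f"$ Define/sys/exec uis$p1_pool_size {size}")
```

With our figures this reproduces the 662000 quoted above.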


So we did all this and we STILL get problems... :-(

My unsophisticated rule of thumb is to increase NPAGEDYN. (What? Even further?
But I've already increased it Nx100%!) 
It seems to delay the evil day SOMETIMES (but not always).


Paul Burkimsher
Cern, Geneva, Switzerland.
paul@online.decnet.cern.ch
vxcern::paul (HEPNET)
paul%online@cernvax (BITNET)
paul%cern.online@uk.ac.ean-relay (Janet)
...!cernvax!paul (UUCP)