From: CRDGW2::CRDGW2::MRGATE::"SMTP::CRVAX.SRI.COM::RELAY-INFO-VAX" 27-OCT-1989 18:04
To: MRGATE::"ARISIA::EVERHART"
Subj: Re: INDEXF (un-) extension

Received: From OUTLAW.UWYO.EDU by CRVAX.SRI.COM with TCP; Fri, 27 OCT 89 12:12:22 PDT
Received: from POSSE by OUTLAW with DECNET ; Fri, 27 Oct 89 13:13:23 MDT
Date: Fri, 27 Oct 89 13:13:17 MDT
From: jimkirk@CORRAL.UWyo.Edu (Jim Kirkpatrick)
Message-Id: <891027131317.2040082b@UWYO.BITNET>
Subject: Re: INDEXF (un-) extension
To: info-vax@sri.com

Carl J Lydick recently wrote about the INDEXF.SYS extension problem --

>It strikes me that this is a non-problem.

Perhaps I misunderstand Carl, but the root of the problem is cluster nodes
simultaneously attempting to extend the file. The original posting may be
found below. It is the avoidance of this problem that makes us want to
ensure the INDEXF file is never extended on the fly. Contiguity has nothing
to do with it. For a cluster, this is not a non-problem!

However, thanks for posting the nifty EXTEND program. I may get up enough
nerve to try it! And my apologies if I've misunderstood what you said!

JK

--------------------

Date: Wed, 8 Mar 89 16:56 EST
From:
Subject: VMS 5.X Cluster Disk Corruption Problems

At Indiana University, we have uncovered (or perhaps rediscovered?!) a nasty
problem with VMS 5.0 and 5.1 which looks to us like a design oversight in
the VMS file management system in a clustered disk environment. It involves
extension of the INDEXF.SYS file by one cluster node, followed by another
node writing to the same disk. When one views the disk (DIR/DAT etc.) from
the node that originally extended INDEXF.SYS, the error "bad directory file
format" is returned! This is apparently because the two nodes do not see the
same INDEXF.SYS file. It is likely caused by cached data not being written
to the disk. We feel that, should a system crash occur while this "window of
error" exists, user files would be corrupted.
(This "window" lasts for minutes and eventually clears up.)

The fact is that we have had files lost, with users seeing the error
"unsupported file structure level" after a system crash when they try to use
their files. After an ANALYZE/DISK/REPAIR run, the error message changes to
"no such file", which is not much better for the system user. While we have
not intentionally crashed the cluster to exactly duplicate the original
bashed-files problem, I feel that there is a reasonable probability that
just such a disk file management snafu could cause it.

Apart from any other evidence, this problem should be fixed. Users should
not be seeing disk file error messages that "clear up" on their own. Either
there is an error or there is not. We can recreate the error window at will.
We have reported the problem to Digital but have gotten no fixes as yet. We
feel that some of you out there must have seen similar problems and would
appreciate some feedback on your solutions.

Our workaround approach has been to preallocate large (75,000 block)
INDEXF.SYS files to avoid the extension problem. We are running a mixed-mode
cluster with VMS 4.7 on three VAXes & A5.0-2 on our two 8820's. We have 52
RA82 disks in the cluster.

Chuck Flowers
Operating Systems Manager
Bloomington Academic Computing Services
Indiana University
Bloomington, IN 47405-4801
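
To give a feel for what the 75,000-block preallocation mentioned above buys,
here is a rough back-of-the-envelope sketch in Python. It is not from the
original posts. It assumes the usual ODS-2 picture: one 512-byte block per
file header, an index bitmap of one bit per possible header, and a small
fixed area of about four clusters for the boot and home blocks. The function
name, the cluster size of 3, and the maximum-files value are hypothetical
illustration values, not figures reported by either poster.

```python
import math

BLOCK_BYTES = 512                         # size of one disk block
BITS_PER_BITMAP_BLOCK = BLOCK_BYTES * 8   # headers tracked by one bitmap block


def header_capacity(index_blocks, cluster_size, max_files):
    """Estimate how many file headers fit in an index file of index_blocks.

    Assumed layout (an approximation, see the text above):
      - about 4 clusters of fixed overhead (boot block, home blocks, etc.)
      - an index bitmap with one bit per possible file header
      - one block per file header for everything that is left
    """
    fixed_overhead = 4 * cluster_size
    bitmap_blocks = math.ceil(max_files / BITS_PER_BITMAP_BLOCK)
    return max(0, index_blocks - fixed_overhead - bitmap_blocks)


if __name__ == "__main__":
    # Hypothetical volume parameters: cluster size 3, room for 75,000 files.
    print(header_capacity(75_000, cluster_size=3, max_files=75_000))
```

Under these assumptions nearly every preallocated block becomes a usable
file header, so the index file would not need to grow until roughly that
many files exist on the volume.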