From: Terry Kennedy [terry@gate.tmk.com]
Sent: Friday, June 08, 2001 3:55 AM
To: Info-VAX@Mvb.Saic.Com
Subject: Re: Low level format of SCSI disk, so VMS can read it.

Paul Repacholi writes:
> Jim Agnew writes:
>> i'm told later versions vms can handle this.
>
> Tolerate, with complaints would be nearer.

I'm pretty sure that the driver was changed to soft-set the changeable mode page if it found auto-reallocate set, not to ignore the fact that it was set. I'm pretty sure Glenn Everhart did the change while at DEC, but I'm not 100% positive of that.

> If the drive replaces the failed block, you have NO indication that
> the data is bogus. If the data can not be recovered and replace, VMS
> marks the block as invalid till it is written to. This is *essential*
> for data integrity.

I beg to differ. A drive that auto-reallocates data it can't read is not compliant with the SCSI 2 specification. See 9.3.3.6 Read-write error recovery page:

"An automatic write reallocation enabled (AWRE) bit of one indicates that the target shall enable automatic reallocation to be performed during write operations. The automatic reallocation shall be performed only if the target has the valid data (e.g. original data in the buffer or recovered from the medium). The valid data shall be placed in the reallocated block. Error reporting as required by the error recovery bits (EER, PER, DTE, and DCR) shall be performed only after completion of the reallocation. The reallocation operation shall report any failures that occur. See the REASSIGN BLOCKS command (9.2.10) for error procedures.

An AWRE bit of zero indicates that the target shall not perform automatic reallocation of defective data blocks during write operations.

An automatic read reallocation enabled (ARRE) bit of one indicates that the target shall enable automatic reallocation of defective data blocks during read operations.
All error recovery actions required by the error recovery bits (TB, EER, PER, DTE, and DCR) shall be executed. The automatic reallocation shall then be performed only if the target successfully recovers the data. The recovered data shall be placed in the reallocated block. Error reporting as required by the error recovery bits shall be performed only after completion of the reallocation. The reallocation process shall present any failures that occur. See the REASSIGN BLOCKS command (9.2.10) for error procedures.

An ARRE bit of zero indicates that the target shall not perform automatic reallocation of defective data blocks during read operations."

Any implication that standard practice is for drives to throw away or lose data if these bits are set should be greeted with the utmost skepticism, the same skepticism that should be offered whenever one hears most of DEC's claims about SCSI issues.

Also, I'm almost positive that VMS never issues partial-sector writes - a write is for one or more sectors, not fractions thereof. So a write that encounters a bad sector by definition *never* cares about the data that was in the sector, as it is going to be replaced in its entirety anyway. A read/modify/write is another matter, but as the name implies, there is a read at the beginning, so we're either talking about read reallocation if the sector was bad to begin with, or a write reallocation with known good data in a VMS buffer somewhere.

Sure, there are broken implementations out there - but stuff like broken TCQ support is *far* more common, and most of the broken drives have either had new firmware available or are older, obsolete models. Certainly, most vendors would not ship generic drives with these bits set-by-default if they were aware of implementation defects that caused those drives to lose data during reallocations.
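
To make the bits in the quoted page concrete, here is a minimal sketch in Python of decoding the flag byte of a Read-Write Error Recovery page (page code 01h) as returned by MODE SENSE, and of clearing AWRE and ARRE in a copy of it. It assumes the SCSI-2 layout in which byte 2 carries AWRE, ARRE, TB, RC, EER, PER, DTE and DCR from bit 7 down to bit 0, and that the mode parameter header and block descriptor have already been stripped off. The function names are hypothetical and this is not a description of what the VMS driver actually does.

```python
# Bit positions within byte 2 of the Read-Write Error Recovery page
# (assumed SCSI-2 layout, page code 01h).
FLAG_BITS = {
    "AWRE": 7, "ARRE": 6, "TB": 5, "RC": 4,
    "EER": 3, "PER": 2, "DTE": 1, "DCR": 0,
}


def decode_error_recovery_flags(page: bytes) -> dict:
    """Return the error recovery bits of a raw mode page 01h as booleans.

    `page` is assumed to start at the page itself (byte 0 = PS + page code).
    """
    if len(page) < 3:
        raise ValueError("mode page too short")
    if page[0] & 0x3F != 0x01:  # low six bits of byte 0 are the page code
        raise ValueError("not the read-write error recovery page")
    flags = page[2]
    return {name: bool((flags >> bit) & 1) for name, bit in FLAG_BITS.items()}


def clear_auto_reallocation(page: bytes) -> bytes:
    """Return a copy of the page with AWRE and ARRE cleared.

    Hypothetical helper: illustrates the kind of change a host could
    send back to the drive, nothing more.
    """
    decode_error_recovery_flags(page)  # reuse the sanity checks
    patched = bytearray(page)
    patched[2] &= 0x3F  # drop bit 7 (AWRE) and bit 6 (ARRE)
    return bytes(patched)


if __name__ == "__main__":
    # Made-up sample page: PS set, page code 01h, length 0Ah, flags C4h.
    sample = bytes([0x81, 0x0A, 0xC4, 0x08, 0, 0, 0, 0, 0, 0, 0, 0])
    print(decode_error_recovery_flags(sample))
    print(clear_auto_reallocation(sample).hex())
```

The sample bytes are invented for illustration. A real page comes from the drive, and whether a given bit may be altered at all is reported by the drive's changeable-values mask.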

Note that there is a separate, unrelated issue with defect handling - the original DEC implementation on SDI controllers complemented the sector checksum if the block contents were unknown following a reallocation from an uncorrectable read error. This is what sets the "forced error flag" when that sector is read, until it is overwritten with new data. At the time DEC started using SCSI disks, the only way to do this in a generic fashion was to use the "WRITE LONG" SCSI command, which not all [ancient] drives implemented. So there was a subset of drives which were known to DEC to work properly with DEC's method of keeping track of "good blocks with bad data".

This was based *entirely* on DEC's reasonable plan to re-use the same defect management code as was used on SDI disks. Certainly there were more generic ways of keeping track of this which didn't rely on a not-always-present drive feature, such as keeping a table of these blocks on the disk somewhere. [Note that SDI/MSCP maintains a bunch of similar tables, since those disks didn't have special areas for things like the SCSI PLIST and GLIST.]

However, as described this has nothing to do with AWRE or ARRE, but instead support for WRITE LONG (which, by the way, can't be probed with a MODE SENSE command - you have to try it and see if you get a check condition/command reject back).

        Terry Kennedy             http://www.tmk.com
        terry@tmk.com             Jersey City, NJ USA