From: Terry Kennedy [terry@gate.tmk.com]
Sent: Friday, June 08, 2001 3:55 AM
To: Info-VAX@Mvb.Saic.Com
Subject: Re: Low level format of SCSI disk, so VMS can read it.

Paul Repacholi <prep@prep.synonet.com> writes:
> Jim Agnew <Agnew@hsc.vcu.edu> writes:
>> i'm told later versions vms can handle this.
>
> Tolerate, with complaints would be nearer.

  I'm pretty sure that the driver was changed to soft-set the changeable
mode page if it found auto-reallocate set, not to ignore the fact that
it was set. I'm pretty sure Glenn Everhart did the change while at DEC,
but I'm not 100% positive of that.

> If the drive replaces the failed block, you have NO indication that
> the data is bogus. If the data can not be recovered and replace, VMS
> marks the block as invalid till it is written to. This is *essential*
> for data integrity.

  I beg to differ. A drive that auto-reallocates data it can't read is
not compliant with the SCSI-2 specification. See section 9.3.3.6,
"Read-write error recovery page":

  "An automatic write reallocation enabled (AWRE) bit of one indicates
that the target shall enable automatic reallocation to be performed
during write operations. The automatic reallocation shall be performed
only if the target has the valid data (e.g. original data in the buffer
or recovered from the medium). The valid data shall be placed in the
reallocated block. Error reporting as required by the error recovery
bits (EER, PER, DTE, and DCR) shall be performed only after completion 
of the reallocation. The reallocation operation shall report any
failures that occur. See the REASSIGN BLOCKS command (9.2.10) for
error procedures.

An AWRE bit of zero indicates that the target shall not perform automatic 
reallocation of defective data blocks during write operations.

An automatic read reallocation enabled (ARRE) bit of one indicates that
the target shall enable automatic reallocation of defective data blocks
during read operations. All error recovery actions required by the error
recovery bits (TB, EER, PER, DTE, and DCR) shall be executed. The
automatic reallocation shall then be performed only if the target
successfully recovers the data. The recovered data shall be placed in
the reallocated block. Error reporting as required by the error recovery
bits shall be performed only after completion of the reallocation. The
reallocation process shall present any failures that occur. See the 
REASSIGN BLOCKS command (9.2.10) for error procedures.

An ARRE bit of zero indicates that the target shall not perform automatic 
reallocation of defective data blocks during read operations."
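
  To make the bit names in that excerpt concrete, here is a minimal
Python sketch of byte 2 of the read-write error recovery page (page
code 01h), where AWRE and ARRE live alongside the other recovery bits.
The function names are made up for illustration, and the sketch only
shows the bit arithmetic for inspecting the byte and for clearing the
two auto-reallocate bits. It is not the VMS driver's code, and it
leaves out the MODE SENSE / MODE SELECT exchange (parameter header,
block descriptor) that would surround the page on a real drive.

```python
# Bit masks for byte 2 of the SCSI-2 read-write error recovery page
# (page code 01h), listed from bit 7 down to bit 0.
AWRE = 0x80  # automatic write reallocation enabled
ARRE = 0x40  # automatic read reallocation enabled
TB = 0x20    # transfer block
RC = 0x10    # read continuous
EER = 0x08   # enable early recovery
PER = 0x04   # post error
DTE = 0x02   # disable transfer on error
DCR = 0x01   # disable correction

_BIT_NAMES = [
    (AWRE, "AWRE"), (ARRE, "ARRE"), (TB, "TB"), (RC, "RC"),
    (EER, "EER"), (PER, "PER"), (DTE, "DTE"), (DCR, "DCR"),
]


def decode_recovery_bits(byte2):
    """Return the names of the bits set in byte 2 of the page."""
    return [name for mask, name in _BIT_NAMES if byte2 & mask]


def clear_auto_reallocate(byte2):
    """Return byte 2 with AWRE and ARRE cleared, other bits untouched."""
    return byte2 & ~(AWRE | ARRE) & 0xFF
```

  For example, a drive reporting C4h in that byte has AWRE, ARRE and
PER set; clearing the two auto-reallocate bits leaves 04h (PER only).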

  Any implication that standard practice is for drives to throw away
or lose data if these bits are set should be greeted with the utmost
skepticism, the same skepticism that should be offered whenever one
hears most of DEC's claims about SCSI issues. Also, I'm almost
positive that VMS never issues partial-sector writes - a write is for
one or more sectors, not fractions thereof. So a write that
encounters a bad sector by definition *never* cares about the data
that was in the sector, as it is going to be replaced in its entirety
anyway. A read/modify/write is another matter, but as the name
implies, there is a read at the beginning, so we're either talking
about read reallocation if the sector was bad to begin with, or a
write reallocation with known good data in a VMS buffer somewhere.

  Sure, there are broken implementations out there - but stuff like
broken TCQ support is *far* more common, and most of the broken drives
either have had new firmware made available or are older, obsolete
models. Certainly, most vendors would not ship generic drives with
these bits set by default if they were aware of implementation defects
that caused those drives to lose data during reallocations.

  Note that there is a separate, unrelated issue with defect handling -
the original DEC implementation on SDI controllers complemented the
sector checksum if the block contents were unknown following a
reallocation from an uncorrectable read error. This is what sets the
"forced error flag" when that sector is read, until it is overwritten
with new data.
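
  The idea of flagging a sector by complementing its checksum can be
shown with a toy model in Python. Everything here is an assumption
made for illustration: the 16-bit additive checksum is a stand-in
(not the code the SDI controllers actually used), and the function
names are invented. The point is only that a stored checksum which is
the exact complement of the computed one can be told apart from both
a good sector and an ordinary data error, and that a normal rewrite
clears the condition.

```python
def checksum(data):
    """Toy 16-bit additive checksum; a stand-in, not DEC's algorithm."""
    return sum(data) & 0xFFFF


def write_sector(data, forced_error=False):
    """Return (data, stored_checksum); complement it to flag the sector."""
    stored = checksum(data)
    if forced_error:
        stored ^= 0xFFFF  # complemented checksum marks "contents unknown"
    return (bytes(data), stored)


def read_sector(sector):
    """Classify a sector as 'ok', 'forced error' or 'data error'."""
    data, stored = sector
    computed = checksum(data)
    if stored == computed:
        return "ok"
    if stored == computed ^ 0xFFFF:
        return "forced error"
    return "data error"
```

  In this model a sector written with forced_error=True reads back as
"forced error" until write_sector() is called on it again with fresh
data, at which point it reads as "ok".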

  At the time DEC started using SCSI disks, the only way to do this
in a generic fashion was to use the "WRITE LONG" SCSI command, which
not all [ancient] drives implemented. So there was a subset of drives
which were known to DEC to work properly with DEC's method of keeping
track of "good blocks with bad data". This was based *entirely* on
DEC's reasonable plan to re-use the same defect management code as
was used on SDI disks. Certainly there were more generic ways of
keeping track of this which didn't rely on a not-always-present drive
feature, such as keeping a table of these blocks on the disk somewhere.
[Note that SDI/MSCP maintains a bunch of similar tables, since those
disks didn't have special areas for things like the SCSI PLIST and
GLIST.] However, as described, this has nothing to do with AWRE or
ARRE, but rather with support for WRITE LONG (which, by the way, can't
be probed with a MODE SENSE command - you have to try it and see if
you get a check condition/command reject back).
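
  As a rough sketch of that "try it and see" probe, the Python below
builds a 10-byte WRITE LONG CDB (opcode 3Fh) and classifies the
result. Two things are assumptions on top of the text above: that
"command reject" shows up as CHECK CONDITION status with a sense key
of ILLEGAL REQUEST (5h) and an additional sense code of 20h (invalid
command operation code) in fixed-format sense data, and the helper
names themselves, which are invented. Actually sending the CDB is left
out, since that depends on whatever pass-through interface the host
offers. On a real drive the probe would normally write back bytes just
obtained with READ LONG, so that the sector is not damaged.

```python
import struct

WRITE_LONG_10 = 0x3F       # WRITE LONG (10) operation code
CHECK_CONDITION = 0x02     # SCSI status byte value
ILLEGAL_REQUEST = 0x05     # sense key
ASC_INVALID_OPCODE = 0x20  # invalid command operation code


def build_write_long_cdb(lba, byte_count):
    """Build a 10-byte WRITE LONG CDB: opcode, flags, LBA, length, control."""
    return struct.pack(">BBIBHB", WRITE_LONG_10, 0, lba, 0, byte_count, 0)


def write_long_rejected(status, sense):
    """True if status and fixed-format sense say the opcode is unsupported."""
    if status != CHECK_CONDITION or len(sense) < 13:
        return False
    if (sense[0] & 0x7F) not in (0x70, 0x71):
        return False  # not fixed-format sense data
    sense_key = sense[2] & 0x0F
    return sense_key == ILLEGAL_REQUEST and sense[12] == ASC_INVALID_OPCODE
```

  Checking the additional sense code as well as the sense key matters
here because a drive that does implement WRITE LONG can still answer
ILLEGAL REQUEST for other reasons, such as a byte count it does not
like.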

        Terry Kennedy             http://www.tmk.com
        terry@tmk.com             Jersey City, NJ USA