<comment>
 $ DOCUMENT/CONTENTS  PP-sample.SDML REPORT PS
<endcomment>
<comment>
Intro/history
problem stmt (brief)
methods used (group history & procedures)
alternative chosen
schedule estimates
  - 6 month bullets
  - what IS or IS NOT doneness
  - completion date for HL doc
  - project plan (ie, the ~6 "tall pole" things to fix)
  - long term guesstimated resources/sched
resource estimates

what is NOT covered (ie, 7.1 code)
------------------------------------------
<endcomment>
 <FRONT_MATTER>
 <TITLE_PAGE>
<DEFINE_SYMBOL>(PROJECT_NAME\SCSI Subsystem Proactive Maintenance)

<TITLE>(Investigation Report for\
        <REFERENCE>(PROJECT_NAME) <line><line>  DIGITAL CONFIDENTIAL)
 <REVISION_INFO>(X0.1)
 <AUTHOR>(Glenn C. Everhart, PhD.)
 <DATE>
<ABSTRACT>(Abstract)
 This is the Investigation Report for <REFERENCE>(PROJECT_NAME).
 This investigation report is in:
 <p>
 EVMS::DOCD$:[EVMS.SCSI.REFLIB]pa_maint_invrpt.PS and .TXT
 <ENDABSTRACT>
<comment>
<HEAD2>(Review Team)
<p>
<SIGNATURES>
<byline>(Ken Munsell\Group Manager)
<endcomment>
 <ENDTITLE_PAGE>
 <COPYRIGHT_PAGE>
 <COPYRIGHT_DATE>(1995)
 <ENDCOPYRIGHT_PAGE>
 <CONTENTS_FILE>
<ENDFRONT_MATTER>
<RUNNING_TITLE>(Digital Equipment Corporation--Investigation Rpt\Digital Confidential)
<HEAD1>(Introduction)
<p>
The VMS SCSI subsystem originated as a simple system to support a
few devices on what was viewed as a low-end system. It has
grown as SCSI became popular beyond the wildest imaginations
of the first implementors, and has been ported (bug-for-bug compatible)
from VAX to Alpha. The cost of maintaining the SCSI system has been
growing during this time, and after a particularly difficult system
enhancement, management has decided to commission some preventive
maintenance. The group held a retrospective and a series of problem
definition meetings and arrived at the problem statement summarized below.

<HEAD1>(Problem Statement)
<p>
The OpenVMS SCSI subsystem has become difficult to maintain, understand,
and extend. These problem areas, which are visible to internal users and
to customers, need to be simplified and improved. 
<p>
The problems in the system are:
<p>
There is no comprehensive OpenVMS SCSI design. Nothing much was ever
written, and the people who had some oral tradition about the top
level design have left. This has meant that system upgrades were made
with no top-level understanding or structure; the result is inconsistent
interfaces and information passing by side effects in many locales in
the code. Changes are fragile and the environment increasingly hard to
understand, since the only reference is generally the code itself, which
has lost some of whatever coherency it initially had.
<p>
This situation makes it hard to fix or enhance the SCSI code base. It
means it takes longer than it should to learn the system, so that new
people cannot quickly contribute. It breeds new bugs.
<p>
As a result, customers see features they want delayed or denied. They also
see VMS able to adapt only slowly to SCSI enhancements such as wide SCSI,
added LUNs, new device types, and commodity device support. Finally, third
parties have trouble getting information they need to add device support of 
their own.

<HEAD1>(Goals:)
<list>(unnumbered)
<le>Make the VMS SCSI subsystem easier to maintain
<le>Make the VMS SCSI subsystem easier to understand
<le>Make the VMS SCSI subsystem easier to extend
<le>Not break SCSI functionality
<le>Provide a path to implement
<le>Not break code written by other groups
<endlist>
<HEAD1>(Non-Goals:)
<list>(unnumbered)
<le>Rewrite code solely to make it "pretty"
<endlist>
<p>
<HEAD1>(Methods Used (and background in what has been done):)
<p>
The conclusion was drawn initially that a new architecture for
the SCSI system should be developed, in writing this time, based
on the ideas of the SCSI group. Initially this approach was not
constrained, and some fairly major modification was investigated.
<p>
One particular approach, investigated individually before the full
SCSI group became involved, was to port the DEC Unix CAM
implementation to VMS.  Such a port has some attractions, in that
it might make it possible to support devices with one driver for 2
OSs. While there is a great deal of superficial similarity in
approaches, the Unix CAM code would have had to have the entire
suite of VMS scheduling primitives added, and its internal
scheduling model would have needed revisitation also, since Unix
schedules processes, including driver parts, very differently from
VMS. It was believed that this effort would be lengthy and high
risk (owing to the need to port an unknown fraction of other Unix
internals to VMS), and the code so ported might still not be much
closer to what Unix uses than VMS drivers now are, so the gains
would probably not be realized. Thus this idea was abandoned.
<p>
Several sources of information were used to discover additional
possible improvements to SCSI besides inspection of code from VMS
and Unix. These included a sizable list of requests for new
functions from various group members (including some who have left
now but had a larger share of design lore than is available now), a
SCSI retrospective in which the entire group reviewed what had gone
well and badly in the Zeta production effort, discussions with
group members about what they believed would be useful, and a notes
conference. The SCSI ARCHITECTURE notes conference was set up in
early July and contains a written repository of input from the
entire SCSI group of ideas and concepts of how SCSI should be
implemented on VMS.  It has served to provide much of the content
for a very high level SCSI architecture document which incorporates
much of the work represented there. The SCSI retrospective also
produced a written report which outlines what the group found to
have been lacking and to have caused difficulties in the Zeta (VMS
6.2) development effort in SCSI (which introduced tagged command
queueing and a number of other upgrades to SCSI).
<p>
The architecture notes file effort occupied much of the SCSI team for
around 3 months as each member was asked to describe how SCSI should
work, in fairly unconstrained form, on VMS. In this effort, it was
specified that the current code base design was an acceptable answer
provided it was arrived at after consideration of possible alternatives
and provided the description arrived at was coherent. Responses varied
greatly in closeness to the Zeta code base design, but substantial
agreement was reached on many design features through discussions
in the notes file, at meetings, and individually.
<p>
The results of the investigations and discussions mentioned above
constitute the background of the present investigation, and will serve
as resources in its accomplishment.
<p>
<HEAD1>(Approach:)

<HEAD2>(Rejected Approaches:)
<p>
The following approaches were considered by the full SCSI group, in the
light of the history mentioned above, and rejected:
<p>
The first approach to dealing with SCSI system maintainability problems
was to simply document the existing SCSI code base as is. This was
rejected because in the opinion of the SCSI group, the existing code base
lacks a consistent and comprehensive design to document. Some parts of
the base do have such designs, but there is no overall principle which
ensures they are mutually consistent. This rules out limiting the work
to the existing base. If a consistent design existed for the full current
code base, one could document it and improve maintainability and
extensibility in this way. However, documenting the current code base
would only codify the inconsistencies whose presence demonstrates that a
coherent top level picture cannot be drawn using only the current code
base. A coherent top level picture of the code base must go beyond what
is there now if only to eliminate inconsistencies.
<p>
The second approach is an opposite of the first, namely to rewrite the
entire SCSI subsystem to a new and unconstrained design. While the
process of creating such a thing can produce a complete document set and
drivers consistent with it, and would cause third party impact only once,
it provides no benefit for a long time. New functionality would be
delayed and the current code base would all have to be maintained as is
for the full development period, with no short term reduction in the
effort needed to do this. This approach would also be the most vulnerable to
the continual flood of new SCSI implementations from vendor devices. To
the extent current code had to be patched to handle these, a from-scratch
approach would accumulate more issues to be addressed before deployment
than other approaches, which involve less of a delay before first code.

<HEAD2>(Selected approach:)
<p>
The selected approach is a hybrid. It begins with a high level design
document which is broad but not deep in details, which has rules of thumb
for handling known SCSI issues, but little specific internals
information. This document represents a complete vision of what the best
way to implement SCSI on VMS is, subject to the constraint that it be
possible to implement incrementally. It must, finally, contain external
interface descriptions and data descriptions in a high-level form.
<p>
Once the high level document is defined sufficiently to proceed, a series
of follow-on investigations will be chosen, based on CLDs, QARs, and other
maintenance experience and based on the interface descriptions.
Approximately 6 such areas will be chosen. These areas will then be
subjects of subsidiary projects whose objectives will be to rework those
areas so they are consistent with the high level documentation and to
document the design of the areas reworked for the future. The scope of
this project will be deemed satisfied once the high level document has
definitions of interfaces and is released in initial form and when these
projects are also done, though it is expected that future SCSI work will
reflect, and be reflected in, the high level document. (Note: choice of
the projects can overlap the document completion, considering that much
work on a high level document has been done already.)
<p>
The initial selection of projects is described below. The list supplied
includes more items than are likely to be possible by 7.2 code freeze,
and is intended to show a set of to-be-done items the first several of
which may be possible to finish by then as well as to give some idea of
the items needing work over several releases.
<p>
The overall project result will be a SCSI documentation set covering high
level design principles, selected areas of SCSI which need work most, and
reworked code for the "worst problem" areas in the system (plus glue as
needed so that the rest of the SCSI subsystem will continue to work).
<p>
This approach has several advantages. All work will have a top level
design available. Some areas will be reworked for each new VMS release.
Customer impact will be localized to the areas changed, and kept minimal
in any case by the constraint that it be possible to use existing
code along with the new.
<p>
The major theoretical drawback to this approach is that the constraint on
high level design could preclude some conceptual breakthrough. Since the
group has sought such and not found it, though, the risk of this is
small. More practically, the SCSI code in this approach may never fully
match the design, though it will approach it, and the time to overhaul
every bit of the system may in principle be greater than a "clean sweep"
rework due to the need to keep un-updated components working. (You don't need
to write glue code in a clean sweep approach.)
<p>
In practice, some parts of the SCSI system might never need to be
updated, so the "overall time" issue may be a red herring. At any rate,
feature enhancements and fixes are needed sooner than a clean sweep
approach could deliver them.

<HEAD2>(Specific Implementation Approach)
<p>
The high level document will be derived from the existing architecture
document with addition of greater detail about data structures and the
port-class interface and discussion of how to "glue" existing drivers
in with this design. This document proposes a set of rules of thumb
but is intended to permit incremental implementation of its ideas and
thus is suitable as a starting point. This document is expected to be
complete by the end of CY 1995. However, it is also expected to be
modified by subsequent implementation projects so that as parts of the
SCSI subsystem are reworked over time, they will approach a commonly
documented high level design (and the high level design document will
approach the code). To accomplish this, each project should attempt to
implement features of the high level document bearing on areas the
project covers, and the high level document should be adjusted when
problems with it are found.
<p>
The subsequent projects each need to follow the LOP cycle so that they
are reviewable individually; the choice list in this document is intended
to show what the long term group plan is. Because the project is intended
to produce a change in long-term development policy in the direction of
having a high level document which is kept consistent with code base
pieces as they are modified, it cannot be said to be "over" in the same way a
code rewrite can. For purposes of discussion of a project, however, this
project can be said to have deliverables of a high level document and a
set of projects to begin addressing the most urgent issues identified
specifically. Once these are delivered, one can speak of THIS project as
"done". A SCSI document set cannot be said to be complete and consistent
until at least example code exists at all levels of the high level design
and either the entire system conforms to a single design or at least glue
code exists where needed so that components deemed "end of life for
maintenance" can continue to be used with the rest of the system.  The
fact that this project involves a commitment and intention to produce
such documentation as an ongoing part of future development does not
imply a perpetual project. Rather, it will provide a source for future
projects, but this project itself should be considered to terminate with
delivery of its high level document (in its initial state), and with
delivery of the first group of implementation projects.  Further projects
are expected to be proposed on an ongoing basis, but these will be
considered in a timeframe past the VMS 7.2 period.
<p>
The project choices made at this time are weighted most heavily by what
will address CLDs, QARs, and needs of other VMS components, but in the
future, documentation of project designs, and ensuring conformance
between those designs and the high level documents, must be a part of
each project, with a design document as a deliverable from each project
that produces code. Each project should make some contribution to
making the SCSI system more maintainable as well as possibly add new
functionality, and projects further out can be derived from what is
necessary to implement an entire execution suite to the high level
design, from class driver, through layers of common code, to adapter
specific port code. The basic rule is that when a component is changed,
it is changed consistently with the overall architecture (both are
adjusted as more is learned) so that after several releases, it is expected
that most of the system will have been revised, and thus will conform,
without the need to do rewrites solely to bring about conformity.
<p>
Using their knowledge and the existing architecture document, plus review
of the QAR and CLD databases and of recent fixes in code, as inputs,
the group has come up with a set of specific projects intended to
best improve the maintainability and extensibility of SCSI on VMS and
reduce the maintenance burden on the group.
<p>
It is a goal in all the following projects to consult other groups
affected to ensure no problems are caused.
<p>
This list includes the following projects, in the group's consensus
priority order:

<list>(numbered)
<le> "Extended SCSI Address Space"
<p>
   Modify driver structures and code to support big SCSI IDs. Since
	it makes little sense to change data structures several times,
	incorporate data structure cleanup here also, documenting
	SCSI data structures consistent with high level doc & 
	defining access rules for at least the most frequently 
	used fields. (It should be added that if we have to edit
	data structures for some other reason first, we need to do
	the cleanup then, though code to use all new areas added may
	not be done for a while.)
<p>
	This project is intended to support SCSI IDs 0 to 15 on wide
	busses, and LUNs 0-31. For serial or fibre channel SCSI, larger
	IDs and LUNs as large as needed to cover the address range of
	these busses will be supported.
<p>
Duration: TBD

<le> "SCSI Feature Control"
<p>
   Add an external control interface to permit outside control of the
	use of SCSI features by (mostly 3rd-party) devices. Among the
	work is:
<list>(unnumbered)
<le> Control synchronous and fast SCSI use (and ensure drivers can
		support fast SCSI properly)
<le> Control wide SCSI and ensure drivers can use this feature if
		present and enabled.
<le> Control tagged command queueing use
<le> Control timeout values per device
<le> Control use of 10-byte MODE SENSE commands
<le> Implement and add controls for diagnostic ring buffer code to
		capture SCSI information when enabled to do so.
<endlist>
	Eventually this interface would be able to use the registry
	facility now in design to control these features of boot devices
	also.
<p>
Duration: 3 man-weeks to translate existing control
	interface into C.
<p>
	  Rest: TBD
<p>
	This feature control will permit much easier handling of SCSI 
	devices from commodity sources and permit many QAR or CLD
	issues to be handled by customers running a configuration
	utility, rather than needing engineering time to develop new
	drivers.

<le> "SCSI Doc set"
   This project is designed to improve documentation available for the
	current SCSI implementation and to ensure future documentation
	is present and useful.
<list>(unnumbered)
<le> Have driver maintainers write up an intro/internals document
	describing how the driver works. 
<le> What documents are needed & what release each is for needs to
	be planned; this involves 2 sets of manuals:
<list>(numbered)
<le> A quick set (the cheat sheets about current drivers)
<le> A slow set (full design docs for drivers using perhaps
		a template design document from the documentation
		people.)
<endlist>
<endlist>
<p>
Duration: TBD

<le> "Enhanced Diagnostic Features & Tools"
   To make it easier and faster to diagnose errors and system error
	states, add the following features:
<list>(unnumbered)
<le> Make error log entries informative & consistent (including
		unique type/subtype)
<le> Make bugchecks unique so one can find where they come from
<le> Add a diagnose interface to port drivers, at least functionally.
		(If this can be done most efficiently by implementing class
		driver disconnect and using GKdriver, it can be done that way.)
<endlist>
<p>
Duration: Estimated 890-1140 lines of code in 11 modules (in Macro,
	Bliss, and C, mostly Macro). Guesstimate ~1 month.

<le> "Common routines"
   To simplify the SCSI code base in some ways, create common routines to:
<list>(unnumbered)
<le> Construct SCSI commands (so each class driver doesn't have
		to "know" how)
<le> Handle memory management functions to do address and mapping
		translations drivers need in a standard and central way
		(rather than separately in each driver).
<endlist>
<p>
Duration: TBD

<le> "Utility Application"
<p>
   Write a utility to issue SCSI commands, collect ring buffer info,
	output from commands, etc. Depends on IO$_DIAGNOSE for
	function.
<p>
Duration: TBD

<le> "Target Mode"
<p>
   Implement target mode/AEN code needed for clusters in other port drivers
	as needed. Implement a complete target mode either using a new
	class driver or keeping things in port driver common code. This
	supports SCSI clusters now, but may be needed for communications
	or other functions in the future.
<p>
Duration: TBD

<le> "Flow Issues"
    Make the queue manager optional so ports not needing it will not
	get it. This will make it necessary to consider flow control and
	should be the occasion for adding externally controllable flow
	control, if not done already. While handling these issues,
	remaining known problems with reset and flow control issues for
	power management should be addressed.
<p>
	Such functions will make tuning I/O on shared SCSI busses simpler
	as well as speed up I/O where the software queue manager is not
	needed, on several SCSI adapters.
<p>
Duration: TBD

<le> "Restructure Drivers"
    Restructure drivers to:
<list>(unnumbered)
<le> Have more common code
<le> Isolate unique device support cases better
<le> Adapt as appropriate to device SCSI capabilities (discussed more
		fully in the high level design document)
<endlist>
<p>
Duration: TBD

<le> "Gentler Error Recovery"
<p>
   Implement command cancel via Abort Tag, Abort, Bus Device Reset, and
	Bus Reset, in that order, stopping at the first success. Also allow I/O
	cancel to cancel operations more promptly, especially
	long commands. This may involve a rework of flow control
	and so should address adding some externally specified
	quotas, runtime tunable, to better regulate flow if so.
	(Whether this happens depends on whether another project has
	reworked flow control to allow external controls first.)
<p>
	Such handling will provide less disruption on SCSI clusters
	and is essential if shared access to non disk devices is to
	be supported (as opposed to served access).
<p>
Duration: TBD

<le> "Larger I/Os"
    Remove port limitations so that I/O requests larger than 64K can
	be enabled if ports support them. Add control via control
	interface to allow this. (This is useful for firmware loading
	etc., but may impact timeouts and the like if default is to
	allow very large transfers. Thus make it an exception.)
<p>
Duration: TBD
<endlist>
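<p>
The escalation described under "Gentler Error Recovery" can be sketched in C
as follows. This is a minimal illustration only, not existing driver code:
the names recover_step, recover_fn, cancel_command, and stub_succeeds_at are
hypothetical, and the callback stands in for whatever port driver routine
would actually issue each step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Recovery steps, least disruptive first (order taken from the project text). */
typedef enum {
    RECOVER_ABORT_TAG,
    RECOVER_ABORT,
    RECOVER_BUS_DEVICE_RESET,
    RECOVER_BUS_RESET,
    RECOVER_STEP_COUNT   /* sentinel: no step succeeded */
} recover_step;

/* Hypothetical hook: attempt one step, return true if it succeeded. */
typedef bool (*recover_fn)(recover_step step, void *ctx);

/* Try each step in order and stop at the first success.
 * Returns the step that worked, or RECOVER_STEP_COUNT if none did. */
recover_step cancel_command(recover_fn try_step, void *ctx)
{
    assert(try_step != NULL);
    for (int s = RECOVER_ABORT_TAG; s < RECOVER_STEP_COUNT; s++) {
        if (try_step((recover_step)s, ctx))
            return (recover_step)s;
    }
    return RECOVER_STEP_COUNT;
}

/* Illustrative stub: steps below *ctx fail; that step and later ones succeed. */
bool stub_succeeds_at(recover_step step, void *ctx)
{
    const recover_step *first_ok = (const recover_step *)ctx;
    return step >= *first_ok;
}
```

<p>
Keeping the ladder in one common routine, with the port-specific work behind
a callback, is one way to ensure that the least disruptive step is always
tried first. A real implementation would also need timeouts and flow control
handling, which this sketch omits.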
<p>
It should be noted that as of this writing it is expected the first
few of these projects may be doable by VMS 7.2 code freeze. Beyond
those, it is also expected that the list will be reviewed. Hence time
estimates for the latter part of the list are not included here, since
experience will probably cause other proposals to surface and give rise
to change in the ordering. The longer list is presented here since it
can be said to cover the requirements of the high level document in the
sense that if all these projects are completed, at least one wholly
compliant execution path from class driver, through common code, to port
driver adapter code will exist. 
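<p>
As an illustration of the first project on the list, "Extended SCSI Address
Space", the address ranges it names for parallel busses could be checked by a
small common routine. The sketch below is hypothetical: the type and function
names are invented for illustration, the narrow-bus ID limit of 7 is an
assumption not stated in this report, and the larger ranges needed for serial
or fibre channel SCSI are not covered.

```c
#include <assert.h>
#include <stdbool.h>

/* Limits from the project text: IDs 0 to 15 on wide busses, LUNs 0 to 31.
 * The narrow-bus ID limit of 7 and applying the LUN limit to both bus
 * widths are assumptions made for this sketch. */
#define MAX_ID_NARROW 7u
#define MAX_ID_WIDE   15u
#define MAX_LUN       31u

typedef enum { BUS_NARROW, BUS_WIDE } bus_width;

typedef struct {
    unsigned id;
    unsigned lun;
} scsi_addr;

/* True if the address is within range for the given bus width. */
bool scsi_addr_valid(bus_width width, scsi_addr a)
{
    unsigned max_id = (width == BUS_WIDE) ? MAX_ID_WIDE : MAX_ID_NARROW;
    return a.id <= max_id && a.lun <= MAX_LUN;
}

/* Flat index into a hypothetical per-bus device table, LUN varying fastest. */
unsigned scsi_addr_index(bus_width width, scsi_addr a)
{
    assert(scsi_addr_valid(width, a));
    return a.id * (MAX_LUN + 1u) + a.lun;
}
```

<p>
Centralizing the limits in one place means that a later change in address
range touches one definition rather than every driver that compares IDs or
LUNs against a constant.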
<p>
The high level design document's prescriptions are intended to accomplish
business purposes, though, as is each of the foregoing projects, and
work on each particular area of the system must pass muster as the most
valuable work possible at the time it is done, consistent with VMS goals
and resources at that time.


<HEAD1>(Schedule)

<list>(unnumbered)
<le>High level document available for reference but incomplete in
	detail:	8/31/1995
<le>High level document complete: 12/31/1995
<le>Subproject initial list complete: 11/12/1995
<le>Subproject LOP investigations begun for first projects: 11/20/1995
<le>Subproject coding begun: 2/2/1996
<le>Some Subprojects completed: 7/1996
<endlist>

<HEAD1>(Resources Available)
<p>
Resources available for writing these documents are considerable. There
exist design documents for parts of some of the drivers which, while out
of date and incomplete, reflect parts of the designs. Also a top level
design document exists and much detail was assembled in connection with
deriving a new SCSI architecture. While it has not described interfaces
or data structures in detail, it does describe some general features and
satisfies the constraint on incremental implementability. Group members
have also listed numbers of areas needing improvement in the past. Since
there is considerable overlap in these problem areas, selection of a few
top candidates should be straightforward. It should be possible for the
project to begin its subordinate projects by the end of 1995
so that some new code can be in place for the projected VMS 7.2 release.
<p>
There are also considerable code resources available, in the form of
already existing control interface code, already-designed driver control
areas to allow external control of SCSI features, designs for some of
the wide-address problem, and of course a code base which does function
correctly in most circumstances and which is well commented.
<p>
As to human resources, it cannot be assumed the entire SCSI group is
available, since QARs and CLDs need to be handled on an ongoing basis.
Therefore the number of engineers available for these projects
is probably in the 2 to 3 range over the period in question.


<HEAD1>(Next Steps)
<p>
A more detailed project plan will follow to cover time estimates, and
LOP plans for the chosen subprojects will be begun by the beginning of
1996, in order that finer details on time and resource requirements can
be ascertained. These estimates may dictate changing the order of
implementation, but in any case execution of subprojects should begin by
early 1996, with the intent of having several completed by VMS 7.2 code
freeze.