<<< EVMS::DOCD$:[NOTES$LIBRARY]SCSI_ARCHITECTURE.NOTE;1 >>>
-< SCSI ARCHITECTURE >-
================================================================================
Note 53.0        Insert Your Problem Statements Here        5 replies
STAR::TGOODWIN        34 lines        17-OCT-1995 14:39:58.36
--------------------------------------------------------------------------------

The current SCSI implementation, as delivered in the V6.2 release on Alpha,
suffers from the following problems:

- There are no written specification or design documents for any of the
  port or class drivers.

- The interface design documents for the following interfaces are
  incomplete or out of date:

      SCSI Class Driver to SCSI Port Driver
      Shadowing to Disk Class Driver
      MSCP to Disk Class Driver
      TMSCP to Tape Class Driver

- Most of the class driver code and common port code (SCSI2COMMON) was
  ported from the undocumented VAX implementation and does not make
  efficient use of the Alpha architecture or the newer port adapters.

- Many of the current SCSI data structures cannot support the growth of
  SCSI which is currently being proposed (i.e. IDs > 8, LUNs > 8,
  bus width = 16 (or 32), serial SCSI).

- At times, the insertion of bug fixes and new functionality into the
  port and class drivers has been made without careful consideration for
  the comments documenting the affected areas. This has resulted in
  making some sections of the code incomprehensible due to discrepancies
  between code and comments.

All of these factors have resulted in making the current SCSI
implementation an unmaintainable and inextensible basis for future SCSI
development.

Just my $0.02
================================================================================
Note 53.1        Insert Your Problem Statements Here        1 of 5
EVMS::RLORD        84 lines        17-OCT-1995 15:40:56.74
-< YAPS (Yet Another Problem Statement) >-
--------------------------------------------------------------------------------

17-Oct-1995 Tuesday

What problem are we trying to solve?
The current SCSI implementation is difficult to maintain and enhance,
making it expensive and time-consuming to troubleshoot and repair. It also
makes it risky to introduce new functionality which might break something
else that seems to be working OK.

I don't want to say that the code itself is basically unreliable because I
don't think that it's very widely used yet. I know that we had a rush of
CLDs recently, but for all I know they're all the result of one or two
problems. It's a moot point anyway - even if the code was very reliable,
the fact that it's hard to maintain and enhance would require that we do
something about it.

What makes the implementation difficult to maintain and enhance?

There is no documentation for the SCSI subsystem outside of the driver
code itself. Knowledge of specific pieces of the subsystem is unevenly
distributed across individuals in the group and otherwise obtained only by
slogging through the code. This means that some problems can really only
be efficiently debugged by one or two individuals - worse, it means that
no one individual in the group can efficiently troubleshoot a very wide
variety of problems.

Not only are there no design or functional specifications available, it's
doubtful that there ever was a single comprehensive design, even in
someone's head - certainly not one which addressed those issues listed
below which were current at the time. Because of this, functional blocks
(for instance, SDTR) were designed and implemented individually, without a
good understanding of the bigger environment in which they had to work,
and likely without wide review. It's not so much a matter of understanding
one design as it is understanding a bunch of little designs that don't
seem to get in each other's way too much, too often.

The pre-V6.2 code itself contains quite a bit of poorly commented,
uncommented or inaccurately commented code; in some cases we're a bit
afraid to touch it because we're not really sure what it might be doing.
The V6.2 code which has IPL, lock and thread info in the comments is
hideous but very useful (I notice it when I go to a driver that doesn't
have it). The pre-checkin reviews the team did really did enforce better
line and block comments than exist in the old code. If you doubt this,
take a stroll through the VAX drivers.

What issues should the SCSI subsystem address?

Possible investigation or mini-investigation projects:

     1) Wide IDs and LUNs
     2) Maintenance interface (IO$_DIAGNOSE)
     3) Mount Verification Requirements
     4) Shadowing Requirements
     5) MSCP Requirements
     6) Control Interface (IO$_SExMODE)
     7) SDTR & Fast SCSI
     8) WDTR (16 or 32 bits)
     9) Bus Resets
    10) Disconnects
    11) Reselection
    12) SCSI commands
    13) SCSI messages
    14) SCSI bus conditions (ATN, reset, contingent allegiance)
    15) CHECK CONDITION / REQUEST SENSE
    16) AEN
    17) Target Mode
    18) Power Management
    19) Cancel I/O
    20) Error Logging
    21) Flow Control
    22) Memory Management
    23) Skip Files
    24) Device-Specific (Device-Class+Device-Type+Firmware) Requirements
    25) Driver Diagnostics
    26) Utility Applications
    27) Mode Sense
================================================================================
Note 53.2        Insert Your Problem Statements Here        2 of 5
STAR::S_SOMMER        27 lines        17-OCT-1995 15:47:19.22
--------------------------------------------------------------------------------

Problem statement:

The Alpha OpenVMS SCSI subsystem is not as robust, effective, or
maintainable as it might be. Specifically, its current shortcomings are:

o There is no comprehensive documentation describing the design of the
  system.

o Some routines lack modularity.

o Some data structures have fields whose ownership and accessibility are
  unclear.

o Some mainline paths were originally designed to accommodate what is now
  end-of-life hardware.

o Third party devices are vulnerable to regression after upgrades to new
  releases.

o There is a backlog of problem reports which have no integration plan
  with regard to future releases.
o There is no current party line on the level of support for third party
  SCSI drivers. In particular there is no official SCSI documentation and
  no party line on the stability of routine interfaces and data
  structures.
================================================================================
Note 53.3        Insert Your Problem Statements Here        3 of 5
EVMS::EVERHART        87 lines        17-OCT-1995 16:11:03.09
-< My piece... >-
--------------------------------------------------------------------------------

The following is my take on the problem to be solved.

The VMS SCSI code base has evolved largely without written documentation
and without written principles of organization. In addition, those
individuals who had been involved with it from the beginning have largely
left the group. In the meantime, many new SCSI devices have appeared and
new features and functions have been needed in the SCSI code base which
thus have been implemented with no access to a broad vision of what the
complete I/O path through the SCSI subsystem should be. This has led to a
"patch" style of maintenance in many cases, where unique functions are
added in localized modifications, sometimes inside conditional tests, and
in which schedule constraints have at times dictated leaving senseless
code in the code base because of the risk of removing it shortly before a
release.

As a result of this history, the code base has been getting progressively
less orderly and harder to maintain, and the training period needed for
anyone coming into the SCSI group is becoming uncomfortably long, and must
require weeks or more of code reading, since no better documentation of
the current architecture is available.
With the advent of still more SCSI devices and signs of a breakdown of the
original assumptions that SCSI need not address anything other than some
tapes and some disks, together with other uses of SCSI, the difficulty of
maintaining SCSI code and of getting new people "up to speed" with the
existing code has become an obstacle to adding desired new functions to
VMS.

For this reason, it has become critical that a SCSI metadesign and finally
a complete SCSI design be committed to documentation and construction
(incrementally, metadesign first, construction to its standards second).

The design must be:

* Clearly understandable at least in broad terms
* Clean
* Provided with ways to address device peculiarities and complexities
  imposed by devices customers have or want
* Able to provide good I/O performance
* Able to handle the address ranges of future SCSI standards
* Able to support simpler problem diagnosing
* Able to support "future" VMS needs (e.g. SCSI clustering, failover,
  wide devices)
* Able to account for its decisions.

The last point is important. A metadesign will consist of a set of
principles and/or rules of thumb which will bear on all aspects of SCSI
code, but the reasons why it specifies one choice and not another will be
vital in allowing it to be modified in the future to cover situations not
apparent today.

There are two possible approaches to writing a very high level metadesign
which may serve to organize cleanup or rewrite efforts:

1. Describe the existing code base and suggest possible mods, or
2. Describe what SCSI ought to look like, not limiting oneself to the
   existing code base.

So long as it is clearly understood that the existing design is a
candidate for what SCSI ought to look like, the latter approach seems most
inclusive of new ideas.
Given that an incremental implementation is a practical necessity, the
results probably will not differ much, but approach 2 avoids the
temptation to do fine detail examination of parts of the current code base
and miss its broader features. A series of rules of thumb against which to
gauge code implementations requires such a view.

What should result in the end is a documented subsystem. The overall high
level features need to be laid out first because there must be principles
in place to allow us to resist temptations to do patch-style
customizations for unique features and to help lay out where major
functions within SCSI (error handling, parameter checking, function
determination, etc.) are to be located. This design document MUST BE
MUTABLE, however, and be kept up to date as elements of the system are
designed. This also means that the high level documenting stops as a
single activity when major features of interfaces are laid out. (This
means input/output/function for port/class interface, port common to
adapter, and class common to class specific interfaces at minimum, though
in very general terms only so that function names are chosen and coarse
structures laid out, but not all details get filled in.)

Rework of drivers or the like will necessarily refer to existing ones to
ensure all functions are handled, but MUST be accompanied by
documentation, both to update the upper level document and to describe
what is written.

(Yes, this is rather more than just a problem stmt...sometimes it's hard
to stop writing when you're on a roll :-) )

Glenn Everhart
================================================================================
Note 53.4        Insert Your Problem Statements Here        4 of 5
STAR::FAIRBANKS        32 lines        19-OCT-1995 09:06:48.67
-< Problem statement >-
--------------------------------------------------------------------------------

Problem statement:

- We need documentation/specs on the current SCSI implementation.
    - There isn't any document describing the current design of the
      system.
    - Need a spec describing all interfaces between layers (class/port,
      etc).
    - Need a spec describing data structure cells and who has read/write
      access.

With adequate documentation, we can then address:

- Extensions for wide IDs, LUNs
- Extensions for SCSI3 features.
- Extensions to accommodate Fibre Channel, SSA, Serial SCSI
- If a full target mode implementation will ever be done, we need the
  hooks to add Target mode enhancements
- The need to have people understand other parts of the SCSI subsystem
  other than the piece that they developed.
- Any bugs in the IO$_DIAGNOSE interface
- How does 3rd party boot affect us?

Dave
================================================================================
Note 53.5        Insert Your Problem Statements Here        5 of 5
STAR::RITTER        12 lines        19-OCT-1995 09:23:38.46
-< Short problem statement >-
--------------------------------------------------------------------------------

Problem Statement:

The current code base is not maintainable. This is due to the following
causes:

- No documented architecture
- Incomplete Specifications
- No design documentation
- Considerable modifications made to code base without the above support
  documentation.