From: Glenn C. Everhart
To: Lenny Szubowicz
Re: IRP structure
Date: 24-March-1997

Problem:
  In adding services to VMS I/O (in my case, for multipath) via I/O
intercepts, it is necessary to modify some IRP entries during the
intercept period, and replace them so that the IRP can be post processed
in the same context as it was received.

  The traditional way to do this has been either to add per-application
fields to the IRP definition (which has resulted in a very large IRP) or
to allocate a structure in pool and somehow find it from the IRP.

  Pool allocation can be slow, will certainly tend to fragment pool, and
growing the IRP with unique fields is really feasible only at .0
releases of VMS.

  However, speed and complexity of doing interception drop noticeably if
a "context stack" is available within the IRP itself. I would like to
allocate this and a few other fields at the end of the IRP definition for
use generally where available.

Goals:

* Allow the fastest feasible add-in of I/O processing.
* Function correctly when old-length IRPs are seen.
* Function correctly where IRPs are copied up to the "old" length
	from the longer IRPs.

Alternatives:

There seems to be no alternative, if anything is added to the IRP, but
to have sufficient checking information to allow one to determine when
the IRP truly has the information and when it has not. Moreover, a
finite stack may overflow and provision must exist for handling
that overflow, as well as finding "old style" IRPs, by allocating
structures in pool.

Any scheme to store new information in fields not currently used within
IRPs risks some application not knowing about the fields and even
using them itself. The IRP$L_EXTEND field could be used to add an
IRPE, of course, but since postprocessing must occur in layers and
IRPEs are used for many other things, finding the "right" IRPE seems
necessarily to involve a search. Using some other list and an access
method for finding "this" IRP's data on it seems cleaner and less likely
to cause problems for other software that may use the IRPE list.

Thus the "fallback" action can as well store context in a separate
structure not required to be pointed at directly by the IRP.

Design:

Consider adding the following to the current IRP:

 1. A single-bit status for the IRP, irp$v_gotstk, indicating the IRP
	has a stack area.
 2. The irp$w_size field will hold a larger value than in old IRPs.
 3. The following new areas will be added after the current end of the
	IRP:

	IRP$L_CURCSP		Address of the current IRP stack frame
	IRP$L_STKFLGS		Flags longword. The first flag defined is
				irp$v_onstack, indicating that the context
				data is on the stack. On stack overflow
				this flag is cleared.

	IRP$A_CTXSTK		Area to hold the context stack. For the
				IRP to be valid, IRP$L_CURCSP must point
				into this IRP's own IRP$A_CTXSTK area.
				The area should be able to hold several
				context sets: it is used once by
				multipath and twice more for snap-capable
				disks. The context saved for multipath is
				8 longwords in length, holding the values
				of IRP$L_CURCSP and of IRP$L_STKFLGS at
				entry as well as save areas for the ucb,
				media, PID, and stat fields. Thus a size
				of perhaps 50 longwords would seem
				advisable.
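
The layout above can be sketched in C as follows. This is only an
illustrative model, not the real IRP definition: the struct and member
names are hypothetical, the two spare slots are an assumption (the memo
itemizes only six of the eight longwords), and slots are uintptr_t
rather than longwords so that the model also builds on a 64-bit host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CTXSTK_SLOTS 50   /* "perhaps 50 longwords" */
#define FRAME_SLOTS   8   /* size of one multipath context */

/* One multipath context frame (hypothetical member names). */
struct ctx_frame {
    uintptr_t saved_curcsp;   /* IRP$L_CURCSP at entry  */
    uintptr_t saved_stkflgs;  /* IRP$L_STKFLGS at entry */
    uintptr_t ucb;            /* save areas             */
    uintptr_t media;
    uintptr_t pid;
    uintptr_t sts;
    uintptr_t spare[2];       /* assumption: not itemized in the memo */
};

static_assert(sizeof(struct ctx_frame) == FRAME_SLOTS * sizeof(uintptr_t),
              "a frame must occupy exactly FRAME_SLOTS slots");

/* The proposed additions at the end of the IRP (model only). */
struct irp_tail {
    uintptr_t *curcsp;                /* IRP$L_CURCSP  */
    uintptr_t  stkflgs;               /* IRP$L_STKFLGS */
    uintptr_t  ctxstk[CTXSTK_SLOTS];  /* IRP$A_CTXSTK  */
};

/* Number of whole multipath-sized frames the stack area can hold. */
static size_t frames_that_fit(void)
{
    return CTXSTK_SLOTS / FRAME_SLOTS;
}
```

With these numbers the stack area holds six multipath-sized frames,
with two slots left over.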

How To Test For A New Format IRP:

An IRP is deemed to be a valid "new" one only if all of the following
hold: the irp$v_gotstk bit is set, the IRP is long enough (irp$w_size
big enough to hold the stack area), and its IRP$L_CURCSP field points
into the IRP$A_CTXSTK area of this same IRP. Otherwise it is not.
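
A minimal sketch of this three-part test, written in C against a
simplified model of the IRP. The struct, the bit position of gotstk, the
field that holds it, and the helper names are all assumptions made for
illustration. The sketch also assumes the stack grows upward, so a
completely full stack leaves the pointer one slot past the end of the
area, which is still accepted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CTXSTK_SLOTS 50
#define IRP_M_GOTSTK 0x1u   /* hypothetical bit position for irp$v_gotstk */

struct irp_model {                    /* simplified stand-in for an IRP */
    uint16_t   size;                  /* irp$w_size                     */
    uint32_t   flags;                 /* holds gotstk (field assumed)   */
    uintptr_t *curcsp;                /* IRP$L_CURCSP                   */
    uintptr_t  stkflgs;               /* IRP$L_STKFLGS                  */
    uintptr_t  ctxstk[CTXSTK_SLOTS];  /* IRP$A_CTXSTK                   */
};

/* Set up a model IRP in the new format with an empty context stack. */
static void irp_init_new(struct irp_model *irp)
{
    irp->size    = (uint16_t)sizeof *irp;
    irp->flags   = IRP_M_GOTSTK;
    irp->curcsp  = irp->ctxstk;
    irp->stkflgs = 0;
}

/* True only if all three conditions for a "new" IRP hold. */
static bool irp_has_ctxstk(const struct irp_model *irp)
{
    assert(irp != NULL);
    if (!(irp->flags & IRP_M_GOTSTK))
        return false;                 /* no stack bit                  */
    if (irp->size < sizeof *irp)
        return false;                 /* old length: new fields absent */

    /* Compare as integers: curcsp may point into a different IRP. */
    uintptr_t p  = (uintptr_t)irp->curcsp;
    uintptr_t lo = (uintptr_t)&irp->ctxstk[0];
    uintptr_t hi = (uintptr_t)&irp->ctxstk[CTXSTK_SLOTS];
    return p >= lo && p <= hi;        /* must be inside THIS IRP       */
}
```

The size check comes before curcsp is read, so an old-length IRP is
rejected without touching fields it does not have. A byte-for-byte copy
of a new IRP keeps a pointer into the original's stack area and so fails
the last check.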



What Happens with Old IRPs:

Note that an old-length IRP will fail this test. An IRP which is
copied from a new IRP will also fail (CURCSP not in CTXSTK) unless of
course the copy updates the context stack, which would require knowledge
of the new fields. Therefore context information in the IRP would be
used only where it is valid.

The only practice which could get into trouble would be if someone
allocated an old-sized IRP using a constant size, then copied the
IRP from a new IRP including copying the 12-byte header. This would
cause the size field in the copy to be incorrect. While the copy would
still not be treated as a new IRP (the copy's stack pointer would be
invalid), the size deallocated would be different from the allocated
size.

This kind of copy is however a highly suspect practice and should be
stamped out if it exists in some third party code. I do not believe
it is in fact used. The more common practice would be to allocate
up to some constant size and copy up to that size. Such a copy would
likewise not be mistaken for a new IRP.

If IRPs created by fast-io or by sysqioreq have the new format, most
IRPs in the system will have the extra storage needed.

Usage:

When an IRP is encountered that is in the new format (per the tests
above) and there is room on the stack for the context to be saved, the
context is saved there and the curcsp field is updated to point to the
next free location. When the context is later restored, the old curcsp
value is reloaded.
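
The save and restore steps on the in-IRP stack might look like the
following C sketch. It is a model under stated assumptions, not driver
code: the struct, function names, and flag bit value are hypothetical,
the slot order within a frame is invented for illustration, and the IRP
is assumed to have already passed the new-format test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CTXSTK_SLOTS  50
#define FRAME_SLOTS   8
#define STK_M_ONSTACK 0x1u   /* hypothetical bit for irp$v_onstack */

struct irp_model {
    uintptr_t  ucb, media, pid, sts;  /* fields an intercept rewrites */
    uintptr_t *curcsp;                /* IRP$L_CURCSP                 */
    uintptr_t  stkflgs;               /* IRP$L_STKFLGS                */
    uintptr_t  ctxstk[CTXSTK_SLOTS];  /* IRP$A_CTXSTK                 */
};

static void irp_init(struct irp_model *irp)
{
    irp->ucb = irp->media = irp->pid = irp->sts = 0;
    irp->curcsp  = irp->ctxstk;       /* empty stack */
    irp->stkflgs = 0;
}

/* Save context in the IRP. Returns false if a full frame does not fit. */
static bool ctx_push(struct irp_model *irp)
{
    uintptr_t *f = irp->curcsp;
    size_t room = (size_t)(irp->ctxstk + CTXSTK_SLOTS - f);
    if (room < FRAME_SLOTS)
        return false;                 /* caller must fall back to pool */

    f[0] = (uintptr_t)irp->curcsp;    /* CURCSP at entry  */
    f[1] = irp->stkflgs;              /* STKFLGS at entry */
    f[2] = irp->ucb;
    f[3] = irp->media;
    f[4] = irp->pid;
    f[5] = irp->sts;                  /* f[6], f[7] left unused */

    irp->curcsp   = f + FRAME_SLOTS;  /* next free location */
    irp->stkflgs |= STK_M_ONSTACK;
    return true;
}

/* Restore the most recently saved context. Returns false if the top
 * context is not on the in-IRP stack. */
static bool ctx_pop(struct irp_model *irp)
{
    if (!(irp->stkflgs & STK_M_ONSTACK))
        return false;
    assert((size_t)(irp->curcsp - irp->ctxstk) >= FRAME_SLOTS);

    uintptr_t *f = irp->curcsp - FRAME_SLOTS;
    irp->ucb     = f[2];
    irp->media   = f[3];
    irp->pid     = f[4];
    irp->sts     = f[5];
    irp->stkflgs = f[1];              /* flags as they were at entry */
    irp->curcsp  = (uintptr_t *)f[0]; /* old curcsp is reloaded      */
    return true;
}
```

Because each frame records the pointer and flags seen at entry, layered
intercepts unwind in the reverse of the order in which they saved.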

Should the stack be missing, or not have enough space left to hold a
full context, the irp$v_onstack flag is cleared (if present) and the
curcsp field is left alone (if present). Another structure is allocated
in pool
to store context and linked to a queue maintained by the intercept (or
a hash block is allocated and used) and the context is stored there.
This hash block or queue element must have the IRP address so that the
correct one can be found at post processing time. The context block must
also have the prior state of the flags so that when postprocessing is done
on the IRP, the flags can be reset. That way, if the stack overflowed, the
old flags will indicate that the next context down is on the IRP context
stack.

At post processing, when the context information must be restored to
post process the IRP in the correct device context, the context
location is determined in the same way as when the information was
stored, and the context is reloaded into the IRP prior to completing its
posting. If an auxiliary structure was allocated, it must also be freed
at post processing time, once its data is reloaded.
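
The auxiliary-structure path (save on overflow, then find, reload, and
free at post processing) could be sketched in C as below. This is a
simplified model: the names are hypothetical, malloc and free stand in
for pool allocation, a singly linked list stands in for the intercept's
queue or hash table, and the locking a real intercept would need is
omitted. The model IRP always has a flags field, whereas a real
old-style IRP would not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define STK_M_ONSTACK 0x1u   /* hypothetical bit for irp$v_onstack */

struct irp_model {
    uintptr_t ucb, media, pid, sts;   /* fields an intercept rewrites */
    uintptr_t stkflgs;                /* IRP$L_STKFLGS                */
};

/* Context block kept by the intercept, keyed by IRP address. */
struct ctx_block {
    struct ctx_block       *next;
    const struct irp_model *irp;            /* which IRP this is for */
    uintptr_t               prior_stkflgs;  /* flags before the save */
    uintptr_t               ucb, media, pid, sts;
};

/* Save context outside the IRP. Returns false if allocation fails. */
static bool ctx_save_side(struct ctx_block **head, struct irp_model *irp)
{
    assert(head != NULL && irp != NULL);
    struct ctx_block *b = malloc(sizeof *b);
    if (b == NULL)
        return false;

    b->irp           = irp;
    b->prior_stkflgs = irp->stkflgs;
    b->ucb   = irp->ucb;
    b->media = irp->media;
    b->pid   = irp->pid;
    b->sts   = irp->sts;
    b->next  = *head;                 /* newest first */
    *head    = b;

    irp->stkflgs &= ~(uintptr_t)STK_M_ONSTACK;  /* top context is off-stack */
    return true;
}

/* Find this IRP's block, reload the IRP from it, unlink and free it. */
static bool ctx_restore_side(struct ctx_block **head, struct irp_model *irp)
{
    assert(head != NULL && irp != NULL);
    for (struct ctx_block **pp = head; *pp != NULL; pp = &(*pp)->next) {
        struct ctx_block *b = *pp;
        if (b->irp != irp)
            continue;
        irp->ucb     = b->ucb;
        irp->media   = b->media;
        irp->pid     = b->pid;
        irp->sts     = b->sts;
        irp->stkflgs = b->prior_stkflgs;  /* next context down may be on stack */
        *pp = b->next;
        free(b);
        return true;
    }
    return false;                     /* no block for this IRP */
}
```

Inserting at the head of the list means that if the same IRP has more
than one block, the most recent is found first, which matches the
layered order of post processing.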

With the new code being added to allow post processing to be handled at
IPL 8 by intercepts, this gives a low overhead way to process I/O
interception and add new processing without much change to any
drivers. 

Any code that produces old style IRPs will find that they also work, and
are not interfered with by this logic. Any cloned or copied IRPs will
work as before with the "auxiliary structure" path of this logic. Most
IRPs will however be able to use the IRP itself, with a few tests added
which involve no PALcode calls. Only code that corrupts the IRP header
could get into trouble, and that is a tiny risk due to a basically
broken coding practice anyway.

Conclusion:

This scheme is low in risk, and provides significant performance and
orderliness benefits to future I/O level intercepts including those
wanted by EDO. It will not break existing drivers or code, but will
allow new intercepts to run faster than otherwise possible and will
avoid the need in the future for yet more dedicated IRP cells.