WASD Hypertext Services - Technical Overview

12 - Scripting

This chapter is not a tutorial on authoring CGI scripts. There exists a plague of references in the popular computing press covering aspects of this technology, usually quite comprehensive. This chapter merely outlines the WASD implementation details, which are in general very much vanilla CGI.

Scripts are mechanisms for creating simple HTTP services, sending data to (and sometimes receiving data from) a client, extending the capabilities of the basic HTTPd. Anything that can write to SYS$OUTPUT can be used to generate script output. A DCL procedure or an executable can be the basis for a script. Simply TYPE-ing a file can provide script output. Scripts execute in processes separate from the actual HTTP server but under its control and interacting with it.

NOTE: WASD can manage a script's process environment either as a subprocess spawned by the HTTP server, or as a network process created using DECnet. By default it supports subprocess-based CGI scripts without further configuration. If DECnet-based CGI scripting or OSU (DECthreads) emulated scripting is desired see 12.8 - DECnet Scripting.

Scripts are enabled using the exec or script rules in the mapping file (see 8 - Mapping Rules). The script portion of the result must be a URL equivalent of the physical VMS procedure or executable specification.

12.1 - Caution!

Scripts are executed within unprivileged subprocesses spawned by the HTTP server. These subprocesses are owned by the HTTP server account (HTTP$SERVER). Script actions can potentially affect server behaviour. For example, it is possible for a script to issue an "HTTPD/DO=EXIT=NOW" command, or to create or modify logical name values in the JOB table (e.g. change the value of LNM$FILE_DEV, altering the logical search path). Obviously these types of actions are undesirable. In addition, scripts can access any WORLD-readable and modify any WORLD-writable resource in the system/cluster, opening a window for information leakage or mischievous/malicious actions (some might argue that anyone with important WORLD-accessible resources on their system deserves all that happens to them - but we know they're out there :^). Script authors should be aware of any potential side-effects of their scripts, and Web administrators should be vigilant against possible malicious behaviours of scripts they do not author.

As of version 4.2 it has become possible to exercise some control over the privileges of spawned subprocesses, allowing environments that require scripts to have minimum privileges (e.g. NETMBX, TMPMBX for IPC) to provide them using the server account's authorized privileges. See 6 - Server Configuration.

12.2 - Scripting Environment

WASD HTTPd scripting underwent a major redesign between v4.1 and v4.2. This was to provide a faster and more efficient scripting environment. It provided the opportunity for a much needed review of the DCL mechanism within the server. As a result two capabilities not found in earlier versions became available: persistent subprocesses (see below) and CGIplus (see 12.7 - CGIplus Scripting).

Process creation under the VMS operating system is notoriously slow and expensive. This is an inescapable overhead when scripting via child processes. An obvious strategy is to avoid, at least as much as possible, the creation of subprocesses. The only way to do this is to share subprocesses between multiple scripts/requests, addressing the attendant complications of isolating potential interactions between requests. These could occur through changes made by any script to the subprocess' environment. For VMS this involves symbol and logical name creation, and files opened at the DCL level. In reality few scripts need to make logical name changes and symbols are easily removed between uses. DCL-opened files are a little more problematic, but again, in reality most scripts doing file manipulation will be images.

A reasonable assumption is that for almost all environments scripts can quite safely share subprocesses with great benefit to response latency and system impact (see 14.2 - Subprocess-based Scripting for a table with some comparative performances). If the local environment requires absolute script isolation for some reason then this subprocess-persistence may easily be disabled, with a consequent trade-off in performance.

NOTE: With the form of subprocess management used in v4.2 and following, BYTLM can become an issue. When setting the HTTPd account BYTLM quota allow approximately 12,500 bytes per subprocess that can be concurrently active, plus a general allowance (technically, allow 1.0 x /NETBUF= plus 1.0 x + 0.5 x + 0.5 x /SUBBUF=). That is, if the subprocess hard-limit (see below) is 20 then BYTLM should be set to at least 250,000 plus 50,000. Of course in such a case PRCLM should be set to at least 20, preferably 40. These and other relevant quotas may be monitored using the HTTPDMON utility or the server administration menu.
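The quota arithmetic in the note above can be sketched in C. This is only an illustration of the rule of thumb given in the text: the function name estimate_bytlm() is hypothetical, and the per-subprocess figure and general allowance are simply the values quoted in the note, not values read from the server.

```c
/*
 * Rough BYTLM estimate following the rule of thumb in the note:
 * an amount per concurrently active subprocess, plus a general allowance.
 * The function name and parameter names are illustrative assumptions.
 */
long estimate_bytlm(long subprocess_hard_limit,
                    long bytes_per_subprocess,
                    long general_allowance)
{
    return subprocess_hard_limit * bytes_per_subprocess + general_allowance;
}
```

With a hard-limit of 20, 12,500 bytes per subprocess and a 50,000 byte allowance, this gives 250,000 plus 50,000, that is 300,000.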

Zombies

The term zombie is used to describe subprocesses when persisting between uses (the reason should be obvious, they are neither "alive" (processing a request) nor are they "dead" (deleted) :^) Zombie subprocesses have a finite time to exist (non-life-time?) before they are automatically purged from the system (see 6 - Server Configuration). This keeps process clutter on the system to a minimum.

12.3 - Script Run-Time

Scripts are merely executed or interpreted files. Although by default VMS executables and DCL procedures can be used as scripts, other run-time environments may also be configured. For example, scripts written for the Perl language may be transparently given to the Perl interpreter in a script subprocess. This type of script activation is based on a unique file type (extension following the file name), for the Perl example this is most commonly ".PL", or sometimes ".CGI". Both of these may be configured to automatically invoke the site's Perl interpreter, or any other run-time environment for that matter.

This configuration is performed using the [DclScriptRunTime] parameter, where a file type is associated with a run-time environment. This parameter takes two components, the file extension and the run-time verb. The verb may be specified as a simple, globally accessible verb (e.g. one embedded in the CLI tables), or in the format used to construct a foreign verb, providing reasonable versatility. Run-time parameters may also be appended to the verb if desired. The server ensures the verb is foreign-assigned if necessary, then uses it on a command line with the script file name as the final parameter.

The following is an example showing a Perl run-time environment being specified. The first line assumes the "Perl" verb is globally accessible on the system (e.g. perhaps provided by the DCL$PATH logical) while the second (for the sake of illustration) shows the same Perl interpreter being configured for a different file type using the foreign verb syntax.

  [DclScriptRunTime]
  .PL PERL
  .CGI $PERL_EXE:PERL

A file containing a Perl script may then be activated merely by specifying a path such as the following:

  /cgi-bin/example.pl

To add any required parameters just append them to the verb specified.

  [DclScriptRunTime]
  .XYZ XYZ_INTERPRETER -vms -verbose -etc
  .XYZ $XYZ_EXE:XYZ_INTERPRETER /vms /verbose /etc

If a more complex run-time environment is required it may be necessary to wrap the script's execution in a DCL procedure.
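As a rough sketch of the association described in this section, the following C fragment looks up a script's file type in a table mirroring [DclScriptRunTime] and builds a command line with the script file name as the final parameter. The structure and function names are hypothetical, and the sketch deliberately ignores how the server performs any foreign-verb assignment; it only illustrates the lookup and the parameter ordering.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* one [DclScriptRunTime] style entry: file type and run-time verb */
struct runtime_entry { const char *file_type; const char *verb; };

/* case-insensitive comparison of two file types, e.g. ".pl" and ".PL" */
static int same_type(const char *a, const char *b)
{
    while (*a && *b) {
        if (toupper((unsigned char)*a) != toupper((unsigned char)*b)) return 0;
        a++;
        b++;
    }
    return *a == *b;
}

/*
 * Build "<verb and parameters> <script file>" into 'command'.
 * Returns 0 on success, -1 if no entry matches or the buffer is too small.
 * Assumes a simple file name with no directory or version components.
 */
int build_runtime_command(const struct runtime_entry *table, int count,
                          const char *script_file, char *command, size_t size)
{
    const char *type = strrchr(script_file, '.');
    int i, n;

    if (type == NULL) return -1;
    for (i = 0; i < count; i++) {
        if (!same_type(type, table[i].file_type)) continue;
        n = snprintf(command, size, "%s %s", table[i].verb, script_file);
        return (n >= 0 && (size_t)n < size) ? 0 : -1;
    }
    return -1;
}
```

Given a table containing ".PL" mapped to "PERL", a script file "example.pl" would produce the command "PERL example.pl".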

Script File Extensions

The WASD server does not require a file type (extension) to be explicitly provided when activating a script. This can help hide the implementation detail of any script. If the script path does not contain a file type the server searches the script location for a file with one of the known file types: first ".COM" for a DCL procedure, then ".EXE" for an executable, then any file types specified using the script run-time configuration directive, in the order specified.

For instance, the script activated in the Perl example above could have been specified as below and (provided there was no "EXAMPLE.COM" or "EXAMPLE.EXE" in the search) the same script would have been executed.

  /cgi-bin/example
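The search order described above might be sketched as follows. This is an assumption-laden illustration, not the server's code: find_script_type() and the exists_fn callback are hypothetical names, and example_pl_only() is merely a stand-in for a real file-system check so the fragment can be exercised.

```c
#include <stdio.h>
#include <string.h>

/* hypothetical callback: returns non-zero if the named file exists */
typedef int (*exists_fn)(const char *file_name);

/*
 * Try ".COM", then ".EXE", then each configured run-time file type in
 * the order given. Returns the first type for which a file exists,
 * or NULL if none is found.
 */
const char *find_script_type(const char *script_base,
                             const char *const *runtime_types,
                             int runtime_count,
                             exists_fn exists)
{
    static const char *const built_in[] = { ".COM", ".EXE" };
    char name[256];
    int i;

    for (i = 0; i < 2; i++) {
        snprintf(name, sizeof(name), "%s%s", script_base, built_in[i]);
        if (exists(name)) return built_in[i];
    }
    for (i = 0; i < runtime_count; i++) {
        snprintf(name, sizeof(name), "%s%s", script_base, runtime_types[i]);
        if (exists(name)) return runtime_types[i];
    }
    return NULL;
}

/* stand-in for a directory containing only EXAMPLE.PL */
int example_pl_only(const char *file_name)
{
    return strcmp(file_name, "EXAMPLE.PL") == 0;
}
```

With run-time types ".PL" and ".CGI" configured, a request for "EXAMPLE" resolves to ".PL" only because neither EXAMPLE.COM nor EXAMPLE.EXE is found first.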

12.4 - CGI Compliance

The HTTPd scripting mechanism is designed to be WWW CGI (Common Gateway Interface) compliant, based in part on the INTERNET-DRAFT authored by D.Robinson (drtr@ast.cam.ac.uk), 8 January 1996.

CGI Compliant Variables

Environment variables are created in a similar way to the CERN VMS HTTPd implementation, where CGI environment variables are provided to the script via DCL global symbols. Each CGI variable symbol name is prefixed with "WWW_" by default. This can be changed using the "/CGI_PREFIX" qualifier (see 5.3 - HTTPd Command Line), but doing so is not recommended if the WASD VMS scripts are to be used, as they expect CGI variable symbols to be prefixed in this manner.

Extensions to CGI Variables

In line with other CGI implementations, additional, non-compliant variables are provided to ease CGI interfacing. These provide the various components of the query string. A keyword query string and a form query string are parsed into separate variables, named

  WWW_KEY_number
  WWW_KEY_COUNT
  WWW_FORM_form-element-name

See the example below.
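As a rough illustration of the keyword form only, the following C sketch splits a "+" separated query string into its elements, in the way the KEY_number and KEY_COUNT variables present them. The function name split_keys() and the KEY_MAX_LENGTH limit are assumptions made for this sketch, and URL-decoding of the elements is not shown.

```c
#include <string.h>

#define KEY_MAX_LENGTH 64   /* illustrative limit, not a server value */

/*
 * Split a keyword query string on '+' into keys[0..count-1].
 * Returns the number of elements stored (the KEY_COUNT equivalent).
 * Over-long elements are truncated to fit.
 */
int split_keys(const char *query, char keys[][KEY_MAX_LENGTH], int max_keys)
{
    int count = 0;

    if (query == NULL || *query == '\0') return 0;
    while (count < max_keys) {
        const char *plus = strchr(query, '+');
        size_t length = plus ? (size_t)(plus - query) : strlen(query);

        if (length >= KEY_MAX_LENGTH) length = KEY_MAX_LENGTH - 1;
        memcpy(keys[count], query, length);
        keys[count][length] = '\0';
        count++;
        if (plus == NULL) break;
        query = plus + 1;
    }
    return count;
}
```

For the query string "vms+http+server" this stores three elements, corresponding to WWW_KEY_1 through WWW_KEY_3 with WWW_KEY_COUNT of 3.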

CGI Variable Capacity

DCL symbol values are limited to approximately 1024 characters. The CGI interface will provide symbols with values up to that limit if required. This should be sufficient for most circumstances.

CGI Variable Descriptions

Remember, all variables are prefixed by "WWW_".

Variable	Description	"Standard" CGI
AUTH_GROUP	authentication group (or empty)	no
AUTH_REALM	authentication realm (or empty)	no
AUTH_TYPE	authentication type (BASIC or DIGEST)	yes
CONTENT_LENGTH	"Content-Length:" from request header	yes
CONTENT_TYPE	"Content-Type:" from request header	yes
FORM_field	query string "&" separated form elements	no
GATEWAY_INTERFACE	"CGI/1.1"	yes
HTTP_ACCEPT	any list of browser-accepted content types	optional
HTTP_ACCEPT_CHARSET	any list of browser-accepted character sets	optional
HTTP_ACCEPT_LANGUAGE	any list of browser-accepted languages	optional
HTTP_AUTHORIZATION	any from request header	optional
HTTP_COOKIE	any cookie sent by the client	optional
HTTP_FORWARDED	any proxy/gateway hosts that forwarded the request	optional
HTTP_HOST	host and port request was sent to	optional
HTTP_IF_MODIFIED_SINCE	any last modified GMT time string	optional
HTTP_PRAGMA	any pragma directive of request header	optional
HTTP_REFERER	any source document URL for this request	optional
HTTP_USER_AGENT	client/browser identification string	optional
KEY_n	query string "+" separated elements	no
KEY_COUNT	number of "+" separated elements	no
PATH_INFO	virtual path of data requested in URL	yes
PATH_TRANSLATED	VMS file path of data requested in URL	yes
QUERY_STRING	un-URL-decoded string following "?" in URL	yes
REMOTE_ADDR	IP host address of HTTP client	yes
REMOTE_HOST	IP host name of HTTP client	yes
REMOTE_USER	authenticated remote user name (or empty)	yes
REQUEST_METHOD	"GET", "PUT", etc.	yes
REQUEST_SCHEME	"http:" or "https:"	no
REQUEST_TIME_GMT	GMT time request received	no
REQUEST_TIME_LOCAL	Local time request received	no
SCRIPT_NAME	name of script being executed (e.g. "/query")	yes
SERVER_GMT	offset from GMT (e.g. "+09:30")	no
SERVER_NAME	IP host name of server system	yes
SERVER_PROTOCOL	HTTP protocol version (always "HTTP/1.0")	yes
SERVER_PORT	IP port request was received on	yes
SERVER_SOFTWARE	software ID of HTTP server	yes
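A C script typically reads these variables with getenv(). The helper below is a minimal sketch that adds the default "WWW_" prefix and supplies a fallback when a variable is absent; the name cgi_var() and the 64 character symbol buffer are assumptions for illustration, and a site using a different /CGI_PREFIX would need to adjust CGI_PREFIX accordingly.

```c
#include <stdlib.h>
#include <string.h>

#define CGI_PREFIX "WWW_"   /* the default prefix described above */

/*
 * Return the value of the CGI variable 'name' (given without prefix),
 * or 'fallback' if it is not defined or the name is too long.
 */
const char *cgi_var(const char *name, const char *fallback)
{
    char symbol[64];
    const char *value;

    /* refuse names that would not fit once the prefix is added */
    if (strlen(name) + sizeof(CGI_PREFIX) > sizeof(symbol)) return fallback;
    strcpy(symbol, CGI_PREFIX);
    strcat(symbol, name);
    value = getenv(symbol);
    return value != NULL ? value : fallback;
}
```

A script might then call cgi_var("REMOTE_HOST", "(unknown)") rather than passing a possibly NULL getenv() result directly to a formatting function.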

CGI Variable Demonstration

The basic CGI symbol names are demonstrated here with a call to a script that simply executes the following DCL code:

  $ SHOW SYMBOL WWW_*
  $ SHOW SYMBOL *

Note how the request components are represented for "ISINDEX"-style searching (third item) and a forms-based query (fourth item).

CGI Compliant Output

Script output must behave in a CGI-compliant fashion (by way of contrast, see 12.5 - Raw HTTP Output). That is, a CGI script may redirect the location of the document, using a Location: header line, or may supply a data stream beginning with a Content-Type: header line. Both must be followed by a blank line.

If the script output begins with a CGI-compliant "Content-Type: text/..." (text document) the HTTPd assumes that output will be line-oriented and will require HTTP carriage-control (each record/line terminated by a line-feed), and will thereafter ensure each record it receives is correctly terminated before passing it to the client. In this way DCL procedure output (and the VMS CLI in general) is supported transparently. Any other content-type is assumed to be binary and no carriage control is enforced.
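The record handling described above can be pictured with the following C sketch. It is not the server's implementation: both function names are hypothetical, and the exact header matching the server performs (for example with respect to case) is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* non-zero if the header line declares a text content type */
int is_text_content_type(const char *header_line)
{
    static const char prefix[] = "Content-Type: text/";
    return strncmp(header_line, prefix, sizeof(prefix) - 1) == 0;
}

/*
 * Ensure a record ends with a line-feed, appending one if there is room.
 * Returns the (possibly increased) record length.
 */
size_t terminate_record(char *record, size_t length, size_t capacity)
{
    if (length > 0 && record[length - 1] == '\n') return length;
    if (length + 1 > capacity) return length;   /* no room, leave unchanged */
    record[length++] = '\n';
    return length;
}
```

Only records following a text content type would be passed through terminate_record(); any other content type would be forwarded untouched.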

12.4.1 - Example DCL Scripts

A simple script to provide the system time might be:

  $ say = "write sys$output"
  $! the next two lines make it CGI-compliant
  $ say "Content-Type: text/plain"
  $ say ""
  $! start of plain-text script output
  $ show time

A script to provide the system time more elaborately (using HTML):

  $ say = "write sys$output"
  $! the next two lines make it CGI-compliant
  $ say "Content-Type: text/html"
  $ say ""
  $! start of HTML script output
  $ say "<HTML>"
  $ say "Hello ''WWW_REMOTE_HOST'"  !(CGI variable)
  $ say "<P>"
  $ say "System time on node ''f$getsyi("nodename")' is:"
  $ say "<H1>''f$cvtime()'</H1>"
  $ say "</HTML>"

12.5 - Raw HTTP Output

A script does not have to output a CGI-compliant data stream. If it begins with an HTTP header status line (e.g. "HTTP/1.0 200 OK"), HTTPd assumes it will supply a raw HTTP data stream, containing all the HTTP requirements. This is the equivalent of the no-parse-header, or "nph..." named scripts of some environments.

Any such script must observe the HyperText Transfer Protocol. Every line must be terminated by a carriage-return and line-feed (represented as "\r""\n"), or as a minimum by a single line-feed. In particular, the type of the data being returned by the scripts must be included in an HTTP header sent prior to the data itself. Headers for the two most common data types will be illustrated here. Note that the blank line is strictly necessary, it terminates the header.

Plain-Text

  HTTP/1.0 200 ok\r\n
  Content-Type: text/plain\r\n
  \r\n

HTML

  HTTP/1.0 200 ok\r\n
  Content-Type: text/html\r\n
  \r\n
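In C, a header of the form shown above could be assembled with a small helper such as the following. The function name format_raw_header() is hypothetical; the sketch simply reproduces the status line, content type and terminating blank line from the two examples.

```c
#include <stdio.h>

/*
 * Write a minimal raw HTTP response header into 'buffer'.
 * Returns the header length, or -1 if the buffer is too small.
 */
int format_raw_header(char *buffer, size_t size, const char *content_type)
{
    int n = snprintf(buffer, size,
                     "HTTP/1.0 200 ok\r\nContent-Type: %s\r\n\r\n",
                     content_type);
    return (n >= 0 && (size_t)n < size) ? n : -1;
}
```

The final "\r\n\r\n" provides both the end of the Content-Type line and the blank line that terminates the header.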

Raw HTTP DCL Script

The following example shows a non-CGI-compliant DCL script similar in function to the CGI-compliant one above. Note the full HTTP header and each line explicitly terminated with a carriage-return and line-feed pair.

  $ cr[0,8] = %x0d
  $ lf[0,8] = %x0a
  $ say = "write sys$output"
  $! the next line determines it is raw HTTP stream
  $ say "HTTP/1.0 200 Success''cr'''lf'"
  $ say "Content-Type: text/html''cr'''lf'"
  $! response header separating blank line
  $ say "''cr'''lf'"
  $! start of HTML script output
  $ say "<HTML>''lf'"
  $ say "Hello ''WWW_REMOTE_HOST'''lf'"
  $ say "<P>''lf'"
  $ say "Local time is ''WWW_REQUEST_TIME_LOCAL'''lf'"
  $ say "</HTML>''lf'"

Raw HTTP C Script

When scripting using the C programming language and providing a full HTTP response there can be considerable efficiencies to be gained by providing a binary output stream from the script. This may be simply provided using a code construct similar to the following to reopen <stdout> in binary mode.

  /* reopen output stream so that the '\r' and '\n' are not filtered */
  #ifdef __DECC
     if ((stdout = freopen ("SYS$OUTPUT", "w", stdout, "ctx=bin")) == NULL)
        exit (vaxc$errno);
  #endif

This is used consistently in WASD scripts. Of course after that the full HTTP header must be supplied.

     fprintf (stdout,
  "HTTP/1.0 200 Success\r\n\
  Content-Type: text/html\r\n\
  \r\n\
  <HTML>\n\
  Hello %s\n\
  <P>\n\
  System time is %s\n\
  </HTML>\n",
     getenv("WWW_REMOTE_HOST"),
     getenv("WWW_REQUEST_TIME_LOCAL"));

12.6 - Raw HTTP Input

The logical name SYS$INPUT (with a synonym of HTTP$INPUT for backward compatibility), <stdin> for C Language based scripts, defines a mailbox providing a stream containing the request body (if any). This is available for procedures and executables to explicitly open and read.

Note that this is a raw stream, and HTTP lines (carriage-return/line-feed terminated sequences of characters) may have been blocked together for network transport. These would need to be explicitly parsed by the program.
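A program reading such a stream might separate blocked-together lines along the following lines. This is a minimal sketch under stated assumptions: next_http_line() is a hypothetical helper, it accepts either <CR><LF> or a lone <LF> as the terminator, and it silently truncates a line that does not fit the caller's buffer.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the next line from a raw buffer into 'line', without its
 * terminator. Returns a pointer just past the terminator, or NULL
 * when no complete line remains between 'cursor' and 'end'.
 */
const char *next_http_line(const char *cursor, const char *end,
                           char *line, size_t line_size)
{
    const char *lf;
    size_t length;

    if (line_size == 0) return NULL;
    lf = memchr(cursor, '\n', (size_t)(end - cursor));
    if (lf == NULL) return NULL;
    length = (size_t)(lf - cursor);
    /* drop the carriage-return of a <CR><LF> pair */
    if (length > 0 && cursor[length - 1] == '\r') length--;
    /* truncate if the caller's buffer is too small */
    if (length >= line_size) length = line_size - 1;
    memcpy(line, cursor, length);
    line[length] = '\0';
    return lf + 1;
}
```

The caller loops, passing the returned pointer back in as the new cursor, until NULL indicates that any remaining bytes do not yet form a complete line.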

NOTE: Versions of the server prior to 4.3 supplied the full request (header then body) to the script. This was not fully CGI-compliant. Versions 4.3 and following supply only the body, although the previous behaviour may be explicitly selected using the configuration parameter [DclFullRequest].

12.7 - CGIplus Scripting

Common Gateway Interface ... plus lower latency,
plus greater efficiency,
plus far less system impact!

I know, I know! The term CGIplus is a bit too cute but I had to call it something!

CGIplus attempts to eliminate the overhead associated with creating the subprocess and then executing the image of a CGI script. It does this by allowing the subprocess and any associated image/application to continue executing between uses, eliminating any startup overheads. This reduces both the load on the system and the request latency. In this sense these advantages parallel those offered by commercial HTTP server-integration APIs, such as Netscape NSAPI and Microsoft ISAPI, without the disadvantages of such proprietary interfaces: the API complexity, language dependency and server process integration.

CGIplus is not as complex (and consequently nor as versatile) as another approach to improving CGI performance, Open Market's FastCGI, see http://www.fastcgi.com/

CGIplus design is generic enough to be easily implemented by other server architectures if found desirable. (For example, it is imagined Unix platforms would implement the CGIplus variable stream using named pipes one of which would be designated by the CGIPLUSIN environment variable.) The CGIplus-specific script environment and example code has been made as platform-neutral as possible, providing potential for a more wide-spread adoption. Existing CGI scripts can rapidly and elegantly be modified to additionally support CGIplus. The capability of scripts to easily differentiate between and operate in both standard CGI and CGIplus environments with a minimum of code revision offers great versatility.

CGIplus Performance

A simple performance evaluation indicates the advantage of CGIplus. See 14.2 - Subprocess-based Scripting for some test results comparing the non-persistent-process, persistent-process and CGIplus environments.

Without a doubt, the subjective difference in activating the same script within the standard CGI and CGIplus environments is quite startling!

CGIplus Programming

The script interface is still CGI, which means a new API does not need to be learned and existing CGI scripts are simple to modify.

See examples in HT_ROOT:[SRC.CGIPLUS]

Instead of having the CGI variables available from the environment (generally accessed via the C Language getenv() standard library call) a CGIplus script must read the CGI variables from CGIPLUSIN. They are supplied as a series of records (lines) containing a CGI variable name (in upper-case), an equate symbol and then the variable value. The line will never contain more than 1024 characters. The format may be easily parsed and as the value contains no encoded characters may be directly used.

Requirements when using:

The read will block between subsequent requests and so may be used to coordinate the application.
The first record read in any request can always be discarded. This is provided so that a script may be synchronized outside of the general CGIplus variable read loop (the DCL and Perl examples use this feature).
The CGIplus variable stream should ALWAYS BE COMPLETELY READ (up until the blank line, see below) before beginning any request processing.
An empty record (blank line) indicates the end of a single request's CGIplus variable stream. Reading MUST be halted at this stage. Request processing may then commence.

After processing, the CGIplus script can loop, waiting to read the details of the next request from CGIPLUSIN.
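The record format described above (variable name, equate symbol, value) can be separated with a helper such as the following. This is a sketch only: split_cgiplus_record() is a hypothetical name, and the treatment of a record with no equate symbol (such as the discardable first record, whose content is not specified here) is an assumption.

```c
#include <string.h>

/*
 * Split one CGIplus record of the form NAME=value in place.
 * Returns 1 if a name and value were found, 0 for an empty record
 * (the end of the request's variables), -1 if there is no '='.
 */
int split_cgiplus_record(char *record, char **name, char **value)
{
    char *equals;

    /* remove any trailing carriage control */
    record[strcspn(record, "\r\n")] = '\0';
    if (record[0] == '\0') return 0;
    equals = strchr(record, '=');
    if (equals == NULL) return -1;
    *equals = '\0';
    *name = record;
    *value = equals + 1;
    return 1;
}
```

Because the split is made at the first '=', a value that itself contains an equate symbol is preserved intact.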

Request output (to the client) is written to SYS$OUTPUT (<stdout>) as per normal CGI behaviour. End of output MUST be indicated by writing a special EOF record to the output stream. This is a bit of a kludge, and the least elegant part of CGIplus design, but it is also the simplest implementation. A unique EOF sequence is generated for each use of DCL via a zombie or CGIplus subprocess. A non-repeating series of bits most unlikely to occur in normal output is employed ... but there is still a very, very, very small chance of premature termination of output (one in 2^280 I think!). See CGI.c for how the value is generated.

The CGIplus EOF string is obtained by the script from the logical name CGIPLUSEOF, defined in the script subprocess' process table, using the scripting language's equivalent of F$TRNLNM(), SYS$TRNLNM(), or a getenv() call (in the C standard library). This string will always contain less than 64 characters and comprise only printable characters. It must be written at the conclusion of a request's output to the output stream as a single record (line) but may also contain a <CR><LF> or just <LF> trailing carriage-control (to allow for programming language requirements). It only has to be evaluated once, as the processing begins, remaining the same for all requests over the life-time of that instance of the script.

HTTP input (raw request stream, header and any body) is still available to a CGIplus script.

Code Examples

Of course a CGIplus script should only have a single exit point and should explicitly close files, free allocated memory, etc., after processing a request (i.e. not rely on image run-down to clean-up after itself). It is particularly important when modifying existing scripts to work in the CGIplus environment to ensure this requirement is met (who of us hasn't thought "well, this file will close when the image exits anyway"?)

It is a simple task to design a script to modify its behaviour according to the environment it is executing in. Detecting the presence or absence of the CGIPLUSEOF logical is sufficient indication. The following C code fragment shows simultaneously determining whether it is a standard or CGIplus environment (and setting an appropriate boolean), and getting the CGIplus EOF sequence (if it exists).

  int  IsCgiPlus;
  char  *CgiPlusEofPtr;

  IsCgiPlus = ((CgiPlusEofPtr = getenv("CGIPLUSEOF")) != NULL);

The following C code fragment shows a basic CGIplus request loop, reading lines from CGIPLUSIN, and some basic processing to select required CGI variables for request processing.

  if (IsCgiPlus)
  {
     char  *cptr;
     char  Line [1024],
           RemoteHost [128];
     FILE  *CgiPlusIn;

     if ((CgiPlusIn = fopen (getenv("CGIPLUSIN"), "r")) == NULL)
     {
        perror ("CGIplus: fopen");
        exit (0);
     }

     for (;;)
     {
        /* will block waiting for subsequent requests */
        for (;;)
        {
           /* should never have a problem reading CGIPLUSIN, but */
           if (fgets (Line, sizeof(Line), CgiPlusIn) == NULL)
           {
              perror ("CGIplus: fgets");
              exit (0);
           }
           /* first empty line signals the end of CGIplus variables */
           if (Line[0] == '\n') break;
           /* remove the trailing newline */
           if ((cptr = strchr(Line, '\n')) != NULL) *cptr = '\0';

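           /* process the CGI variable(s) we are interested in */
           if (!strncmp (Line, "WWW_REMOTE_HOST=", 16))
           {
              /* bounded copy, the value may be longer than the buffer */
              strncpy (RemoteHost, Line+16, sizeof(RemoteHost)-1);
              RemoteHost[sizeof(RemoteHost)-1] = '\0';
           }
        }

        /* (process request, signal end-of-output) */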
     }
  }

CGI scripts can write output in record (line-by-line) or binary mode (more efficient because of buffering by the C RTL). When in binary mode the output stream must be flushed immediately before and after writing the CGIplus EOF sequence (note that in binary a full HTTP stream must also be used). This code fragment shows placing a script output stream into binary mode and the flushing steps.

  /* reopen output stream so that the '\r' and '\n' are not filtered */
  if ((stdout = freopen ("SYS$OUTPUT", "w", stdout, "ctx=bin")) == NULL)
     exit (vaxc$errno);

  do {

     /* (read request ...) */

     /* HTTP response header */
     fprintf (stdout, "HTTP/1.0 200 ok\r\nContent-Type: text/html\r\n\r\n");

     /* (other output ...) */

     if (IsCgiPlus)
     {
        /* the CGIplus EOF must be an independent I/O record */
        fflush (stdout);
        fprintf (stdout, "%s", CgiPlusEofPtr);
        fflush (stdout);
     }

  } while (IsCgiPlus);

If the script output is not binary (using default <stdout>) it is only necessary to ensure the EOF string has a record-delimiting new-line.

  fprintf (stdout, "%s\n", CgiPlusEofPtr);

Other languages may not have this same requirement. DCL procedures are quite capable of being used as CGIplus scripts.

See examples in HT_ROOT:[SRC.CGIPLUS]

Whenever developing CGIplus scripts/applications (unlike standard CGI) don't forget that after compiling, the old image must be purged from the server before trying out the new!!! (I've been caught a number of times :^)

Scripting subprocesses may be purged or deleted using (see 5.3.2.4 - DCL/Scripting Subprocesses):

  $ HTTPD /DO=DCL=DELETE
  $ HTTPD /DO=DCL=PURGE

Other Considerations

Multiple CGIplus scripts may be executing in subprocesses at any one time. This includes multiple instances of any particular script. It is the server's task to track these, distributing appropriate requests to idle subprocesses, monitoring those currently processing requests, creating new instances if and when necessary, and deleting the least-used, idle CGIplus subprocesses when configurable thresholds are reached. Of course it is the script's job to maintain coherency if multiple instances may result in resource conflicts or race conditions, etc., between the scripts.

The CGIplus subprocess can be given a finite life-time set by configuration parameter (see 6 - Server Configuration). If this life-time is not set then the CGIplus subprocess will persist indefinitely (i.e. until purged due to soft-limits being reached, or explicitly purged/deleted). When a life-time has been set the CGIplus subprocess is automatically deleted after being idle for the specified period (i.e. not having processed a request). This can be useful in preventing sporadically used scripts from cluttering up the system indefinitely.

In addition, an idle CGIplus script can be terminated by the server at any time the subprocess soft-limit is reached (the subprocess SYS$DELPRC()ed) so resources should be largely quiescent when not actually processing. Of course a CGIplus subprocess may also be manually terminated from the command line (e.g. STOP/ID=).

Some CGIplus scripting information and management is available via the server administration menu, see 11.2 - HTTPd Server Reports.

CGIplus Rule Mapping

CGIplus scripts are differentiated from standard CGI scripts in the mapping rule configuration file using the "script+" and "exec+" directives. See 8 - Mapping Rules.

Scripts capable of operating in both standard CGI and CGIplus environments may simply be accessed in either via rules such as

  exec /cgi-bin/* /cgi-bin/*
  exec+ /cgiplus-bin/* /cgi-bin/*

while specific scripts can be individually designated as CGIplus using

  script+ /cgiplus_example* /cgi-bin/cgiplus_example*

Caution! If changing CGIplus script mapping it is advised to restart the server rather than reloading the rules. Some conflict is possible when using new rules while existing CGIplus scripts are executing.

12.8 - DECnet Scripting

"Imitation is the sincerest form of flattery" - proverb

Please Note! WASD requires no additional configuration to support subprocess-based scripting. The following information applies only if DECnet-based scripting is desired.

By default WASD executes scripts within subprocesses, but can also provide scripting using DECnet for the process management. DECnet scripting is not provided to generally supplant the subprocess-based scripting but to augment it for certain circumstances:

To provide an environment within WASD where OSU-based scripts (both CGI and OSU-specific) may be employed without modification.
To allow nodes without a full HTTP service to participate in providing resources via a well-known server, possibly resources that only they have access to.
Load-sharing amongst cluster members for high-impact scripts or particularly busy sites.
Provide user-account scripting.

DECnet Performance

Any DECnet based processing incurs some overheads.

connection establishment
NETSERVER image activation
NETSERVER maintenance (such as logs, etc.)
activation of DECnet object image or procedure
DECnet object processing
activation by object of image or procedure
DECnet object run-down
NETSERVER image reactivation on completion of object processing

As of version 5.2 WASD provides reuse of DECnet connections for both CGI and OSU scripting, in line with OSU v3.3 which provided reuse for OSU scripts. This means multiple script requests can be made for the cost of a single DECnet connection establishment and task object activation. This functionality provides substantial performance improvements as indicated by 14.3 - DECnet-based Scripting. Note that the OSU task procedure requires the definition of the logical name WWW_SCRIPT_MAX_REUSE representing the number of times a script may be reused. The WASD startup procedures can provide this.

In practice both the WASD CGI and OSU scripts seem to provide acceptable responsiveness.

Rule Mapping

DECnet-based scripts are mapped using the same rules as subprocess-based scripts, using the SCRIPT and EXEC rules (see 8 - Mapping Rules for general information on mapping rules). DECnet scripts have a DECnet node and task specification string as part of the mapping rule. There are minor variations within these to further identify it as a WASD or an OSU script. See 12.8.4 - User Scripts for information on mapping user scripting.

The specification string follows basic VMS file system syntax (RMS), preceding the file components of the specification. The following example illustrates declaring that paths beginning with FRODO will allow the execution of scripts from the CGI-BIN:[000000] directory on DECnet node FRODO.

  exec /FRODO/* /FRODO::/cgi-bin/*

In similar fashion the following example illustrates a script "frodo-showsys" that might do a "SHOW SYSTEM" on node FRODO. Note that these rules are case-insensitive.

  script /frodo-showsys /frodo::/cgi-bin/showsys.com

Both of the above examples would use the WASD CGI DECnet environment (the default if no task specification string is provided). By including task information other environments, in particular the OSU scripting environment, can be specified for the script to be executed within. The default task is named CGIWASD and can also be explicitly specified (although this behaviour would be the same as that in the first example):

  exec /frodo/* /frodo::"task=cgiwasd"/cgi-bin/*

All task specification strings may also use zero as the task abbreviation.

  exec /frodo/* /frodo::"0=cgiwasd"/cgi-bin/*

To execute a script within the OSU environment specify the standard OSU task executive WWWEXEC, as in the following example:

  exec /osu/* /FRODO::"task=wwwexec"/cgi-bin/*

This would allow any URL beginning with "/osu/" to execute a script in the OSU environment.

Local System

To specify any script to execute on the same system as the HTTP server specify the node name as zero or SYS$NODE.

  exec /decnet/* /0::"task=cgiwasd"/cgi-bin/*
  exec /osu/* /sys$node::"task=wwwexec"/cgi-bin/*

Mapping rules providing this are included in the examples (see HT_ROOT:[EXAMPLE]). After the DECnet environment has been started any CGI script may be executed on the local system via DECnet by substituting "/decnet/" for "/cgi-bin/" as the script path, and any OSU script by using "/osu/". Behaviour is indeterminate, though it shouldn't be catastrophic, if a script is invoked using the incorrect path (i.e. an OSU script using /decnet/ or a CGI script using /osu/).

12.8.1 - Script System Environment

The target system must have enough of the WASD server environment to support the required CGI script activation and activity. If the target system is the same system as the HTTP server then this already exists; if it is part of the local system's cluster then providing it should be relatively straightforward. If the target system has none of the server environment then at a minimum it must have the logical name CGI-BIN defined, representing the directory containing the required DECnet object procedure and scripts. The following fragment illustrates this:

  $ DEFINE /SYSTEM /TRANSLATION=(CONCEALED) CGI-BIN device:[dir.]

In this directory must be located the CGIWASD.COM and WWWEXEC.COM procedures required by the network task. Of course other parts of the environment may need to be provided depending on script requirements.

12.8.1.1 - Proxy Access

The local system must have proxy access to each target scripting system (even if that "target" system is the same system as the HTTP server). This involves creating a proxy entry in each target host's authorization database. The following example assumes the existence of a local HTTP$SERVER account. If it does not exist on the target node then one must be created with the same security profile as the HTTP server's.

Caution! If unsure of the security implications of this action consult the relevant VMS system management security documentation.

The zero represents the system the server is currently executing on.

  $ SET DEFAULT SYS$SYSTEM
  $ MCR AUTHORIZE
  UAF> ADD /PROXY 0::HTTP$SERVER HTTP$SERVER /DEFAULT

It is necessary to ensure the account has permission to write into its home directory. A network process creates a NETSERVER.LOG (Phase-IV) or NET$SERVER.LOG (DECnet-Plus) file in the home directory, and will fail to start if it cannot!

12.8.1.2 - DECnet Objects

To provide DECnet scripting DECnet object(s) must be specified for any system on which the scripts will be executed. The DECnet object is the program or procedure that is activated at the target system inside of a network-mode process to interact with the HTTP server.

DECnet-Plus (OSI/Phase-V)

DECnet-Plus uses the NCL utility to administer the network environment. The following NCL scripting shows the creation of a network application for the WASD CGI object:

  $ MCR NCL
  CREATE NODE 0 SESSION CONTROL APPLICATION CGIWASD
  SET NODE 0 SESSION CONTROL APPLICATION CGIWASD ADDRESSES = {NAME=CGIWASD} -
  ,CLIENT =  -
  ,INCOMING ALIAS = TRUE -
  ,INCOMING PROXY = TRUE -
  ,OUTGOING ALIAS = FALSE -
  ,OUTGOING PROXY = TRUE -
  ,NODE SYNONYM = TRUE -
  ,IMAGE NAME = CGI-BIN:[000000]CGIWASD.COM -
  ,INCOMING OSI TSEL =

To create a DECnet-Plus OSU WWWEXEC object:

  $ MCR NCL
  CREATE NODE 0 SESSION CONTROL APPLICATION WWWEXEC
  SET NODE 0 SESSION CONTROL APPLICATION WWWEXEC ADDRESSES = {NAME=WWWEXEC} -
  ,CLIENT =  -
  ,INCOMING ALIAS = TRUE -
  ,INCOMING PROXY = TRUE -
  ,OUTGOING ALIAS = FALSE -
  ,OUTGOING PROXY = TRUE -
  ,NODE SYNONYM = TRUE -
  ,IMAGE NAME = CGI-BIN:[000000]WWWEXEC.COM -
  ,INCOMING OSI TSEL =

These must be executed at each system (or server) startup, and may be executed standalone, as illustrated, or incorporated in the NCL script SYS$STARTUP:NET$APPLICATION_STARTUP.NCL for automatic creation at each system startup. Examples may be found in HT_ROOT:[EXAMPLE].

Phase-IV

DECnet Phase-IV uses the NCP utility to administer the network environment. The following NCP commands may be used each time during server startup to create the required DECnet objects. With Phase-IV the SET verb may be replaced with a DEFINE verb and the commands issued just once to permanently create the objects (a SET must also be done that first time to create working instances of the DEFINEd objects).

To create a DECnet CGI object:

  $ MCR NCP
  SET OBJECT CGIWASD NUMBER 0 FILE CGI-BIN:[000000]CGIWASD.COM

To create a DECnet OSU WWWEXEC object:

  $ MCR NCP
  SET OBJECT WWWEXEC NUMBER 0 FILE CGI-BIN:[000000]WWWEXEC.COM

Examples may be found in HT_ROOT:[EXAMPLE].

12.8.1.3 - Reducing Script Latency

Script system network process persistence may be configured using NETSERVER logical names. These can control the number and quiescent period of the server processes. These logical names must be defined in the LOGIN.COM of the HTTP server account on the target script system.

NETSERVER$SERVERS_username - This logical controls the number of network server processes that are kept available at any one time. Defining this logical results in a minimum of the specified number of quiescent server processes maintained. This can improve script response latency by circumventing the need to create a process to service the request, at the cost of cluttering the system with NETSERVER processes.
```
  DEFINE /JOB NETSERVER$SERVERS_HTTP$SERVER 5
```
NETSERVER$TIMEOUT - This logical controls the duration a quiescent network process persists before being deleted. The default period is five minutes. The first of the following examples reduces that to thirty seconds, the second increases it to one hour. Again, this can improve script response latency by circumventing the need to create a process to service the request, at least during the period a previously created process continues to exist.
```
  DEFINE /JOB NETSERVER$TIMEOUT "0 00:00:30"
  DEFINE /JOB NETSERVER$TIMEOUT "0 01:00:00"
```
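
As a quick check on values such as those above, the following Python sketch converts a delta time string into seconds. The function name is hypothetical, and it assumes only the simple "days hh:mm:ss" form used in these examples, not every delta time syntax that VMS accepts.

```python
def delta_seconds(value):
    """Convert a 'days hh:mm:ss' delta time string into whole seconds.

    Assumes the simple form shown in the NETSERVER$TIMEOUT examples;
    fractional seconds and other VMS delta time variants are not handled.
    """
    days, clock = value.split()
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return ((int(days) * 24 + hours) * 60 + minutes) * 60 + seconds


if __name__ == "__main__":
    for setting in ("0 00:00:30", "0 00:05:00", "0 01:00:00"):
        print(setting, "=", delta_seconds(setting), "seconds")
```

The middle value, "0 00:05:00", corresponds to the five minute default described above.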

12.8.1.4 - DECnet/OSU Startup

The example STARTUP.COM and STARTUP_DECNET.COM procedures found in the HT_ROOT:[EXAMPLE] directory provide the essentials for DECnet/OSU scripting. If the INSTALL.COM startup environment is used, setting the PROVIDE_DECNET symbol to 1 in STARTUP.COM will create the DECnet scripting environment during server startup.

12.8.2 - CGI

CGI scripts may be transparently executed within the DECnet scripting environment. This means that the script is executed within a network process, on the target system (which could be the local system), instead of within a subprocess on the local system. Other than that the WASD DECnet CGI environment behaves identically to the standard subprocess CGI environment. CGIplus scripting is not supported and if CGIplus-only scripts are executed the behaviour is indeterminate.

As an example, the HELP database on a system other than that hosting the HTTP server could be made available (using the CONAN script) with the mapping rules

  map /FRODO/help /FRODO/help/
  script /FRODO/help/* /FRODO::/cgi-bin/conan/*

and for the example DCL SHOW script

  script /FRODO/show* /FRODO::/cgi-bin/show*

12.8.3 - OSU (DECthreads) Emulation

The OSU, or DECthreads, server is the most widely deployed VMS HTTP server environment, authored by David Jones and copyright the Ohio State University. See http://kcgl1.eng.ohio-state.edu/www/doc/serverinfo.html for more information.

The WASD HTTP server provides an emulation of the OSU scripting environment. This is provided so that OSU-based scripts (both CGI-style and OSU-specific) may be employed by WASD with no modification. As this emulation has been designed through examining OSU code and lots of trial and error its behaviour may be incomplete or present errors. A list of OSU scripts known to work with WASD is provided at the end of this section, see Known Working Scripts.

Supported scripts include only those that depend on the OSU WWWEXEC object and dialog for all functionality. Any script that uses other OSU-specific functionality is not supported. Interactions between WASD's and OSU's authentication/authorization schemes may be expected.

Please remember this is a first-cut of reverse-engineered technology. The author would like to know of any OSU scripts the WASD emulation barfs on, and will attempt to address the associated limitation(s) and/or problem(s).

OSU Setup

Software necessary for supporting the OSU scripting environment (e.g. WWWEXEC.COM) and selected OSU scripts (mainly for testing purposes) have been extracted from the OSU v3.3a package and included in the HT_ROOT:[SRC.OSU] directory. This has been done within the express OSU licensing conditions.

  Copyright 1994,1997 The Ohio State University.  
  The Ohio State University will not assert copyright with respect
  to reproduction, distribution, performance and/or modification 
  of this program by any person or entity that ensures that all 
  copies made, controlled or distributed by or for him or it bear 
  appropriate acknowlegement of the developers of this program.

An example DECnet and OSU scripting startup may be found in HT_ROOT:[EXAMPLE]. This should be called from or used within the HTTP server startup. It includes two logical definitions required for common OSU scripts. Other tailoring may be required for specific OSU scripts.

OSU - General Comments

David Jones, the author of the DECthreads (OSU) HTTP server, outlines his reasons for basing OSU's scripting on DECnet (reproduced from a USENET NEWS reply to a comment this author made about DECnet-based scripting).

  ------------------------------------------------------------------------

  From           JONESD@er6.eng.ohio-state.edu (David Jones)
  Organization   The Ohio State University
  Date           12 Aug 1997 09:04:11 GMT
  Newsgroups     vmsnet.sysmgt,comp.os.vms,comp.infosystems.www.servers.misc
  Message-ID     <5sp8ub$brs$1@charm.magnus.acs.ohio-state.edu>

  ------------------------------------------------------------------------

  ... some text omitted

  Since I was comfortable with DECnet, I based the scripting system
  for the OSU server around it.    The key reasons to use netserver
  processes rather than spawning sub-processes:

      1. DECnet automatically caches and re-uses netserver processes,
         whereas there were well-known performance problems with spawning
         sub-processes.

      2. DECnet processes are detached processes, so you don't worry about
         the effect of scripts consuming pooled quotas (e.g. bytlm) on
         the HTTP server process.

      3. Creation/connection with the DECnet server process is asynchronous
         with respect to the server so other operations can proceed concurrently.
         Spawning is done in supervisor mode, blocking the server's operation
         until the child process is completely initialized.

      4. With DECnet, scripts can be configured to run on different nodes
         for load balancing.

      5. In addition to the standard 'WWWEXEC' object, you can create
         other 'persistent' DECnet objects that the server communicates with
         as scripts. (this was implemented years before OpenMarket's FastCGI
         proposal).

      6. CGI is not the be-all end-all of scripting.  The dialog phase of
         OSU's scripting environment allows scripts to do things CGI
         is incapable of, such as ask the server to translate an arbitrary
         path and not just what followed the script name in the URL.

  People grouse all the time about the installation difficulties caused by
  it's reliance on DECnet,  the reason shown above were cited to show that it
  wasn't made so capricously.

  ... some text omitted

  David L. Jones               |      Phone:    (614) 292-6929
  Ohio State Unviversity       |      Internet:
  2070 Neil Ave. Rm. 122       |               jonesd@kcgl1.eng.ohio-state.edu
  Columbus, OH 43210           |               vman+@osu.edu

  Disclaimer: Dogs can't tell it's not bacon.

The OSU server's DECnet scripting is not based on arbitrary considerations. This author does not disagree with any of the concerns, and as may be seen from WASD documentation the design of WASD also directly addresses points 1, 3 and 5 with the use of persistent subprocesses and CGIplus. Certainly DECnet-based scripting addresses the very legitimate point 4 (and also allows nodes with specific resources to participate without installing full HTTP server environments). For all practical purposes point 2 may be addressed by adjusting process quotas. Point 6 is only too true (possibly at least until Java servers and servlets become ubiquitous :^)

Known Working Scripts

The following is a list of OSU-specific scripts that the WASD v5.1 implementation has either been developed or tested against, and any installation notes or other WASD specifics. The author would like to know of any OSU scripts the WASD emulation has problems or works successfully with.

All of the scripts, etc. provided in the HT_ROOT:[SRC.OSU] directory. These include:
- cgi_symbols
- cgi-mailto
- html_preproc
- set_dcl_env
- testcgi
- testform
- tmail
- vmshelpgate
- webbook

helpgate

Comment out the Conan The Librarian mappings for the "/help" path and provide the following in HTTPD$MAP:

  # first make "/help" into a script specification
  map /help* /htbin/helpgate/help*
  # general rule mapping "/htbin" to OSU DECnet scripts
  exec /htbin/* /0::"0=wwwexec"/cgi-bin/*
  # map the non-script part of the path back to just "/help"
  pass /htbin/helpgate/help* /help*

It is possible to support both HELP environments (although helpgate will not work without owning the "/help" path). Merely provide another mapping for Conan with a slightly different path, for example:

  map /chelp /chelp/
  script /chelp/* /cgi-bin/conan/*

HTML pre-processor

Yes, backward compatibility can be provided for those old OSU .HTMLX files in your new WASD environment ;^) All that is needed is a file type mapping to the script in the HTTPD$CONFIG configuration file.
```
  [AddType]
  .HTMLX  text/html  /htbin/html_preproc  OSU SSI HTML
```
showtime
mgmt

12.8.4 - User Scripts

The WASD DECnet environment provides a simple mechanism for executing scripts within accounts other than the server's. This allows configured users to write and maintain scripts within their own areas and have them execute as themselves. Both standard CGI and OSU scripting may be provided for with this facility.

Of course there is always a down-side. Be careful to whom this capability is granted. User scripts are executed within a user network-mode process created by DECnet. Script actions cannot generally affect server behaviour, but they can access any WORLD-readable and modify any WORLD-writable resource in the system/cluster, opening a window for information leakage or mischievous/malicious actions. Script authors should be aware of any potential side-effects of their scripts and Web administrators vigilant against possible destructive behaviours of scripts they do not author.

User scripting is not enabled by default. To provide this facility, mapping rules into the user area must be provided in much the same way as for user directories, see 8.6 - Mapping User Directories (tilde character ("~")).

The "EXEC" rule provides a wildcard representation of users' script paths. As part of this mapping a subdirectory specifically for the web scripts should always be included. Never map users' top-level directories. For instance, if a user's account home directory was located in the area WWW_USER:[DANIEL], the following rules would potentially allow the user DANIEL to provide scripts from the home subdirectory [.WWW.CGI-BIN] (the first for CGI, the second for OSU scripts):

  exec /~*/cgi-bin/* /0""::/www_user/*/www/cgi-bin/*
  exec /~*/osu-bin/* /0""::"0=wwwexec"/www_user/*/www/cgi-bin/*

Scripts located in these directories are accessible via paths such as the following:

  /~daniel/cgi-bin/test
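
How the wildcards in such a rule carry the username and script name across can be sketched in Python. This is a simplified illustration of wildcard substitution under the assumption that each "*" in the result is filled, in order, by the text matched by the corresponding "*" in the template; it is not the server's actual mapping algorithm and the function name is hypothetical.

```python
import re


def apply_rule(template, result, path):
    """Map a request path through a wildcard rule.

    Each '*' in the template captures text from the path, and each '*'
    in the result is replaced, in order, by the captured text.
    Returns None when the path does not match the template.
    Illustrative only; not WASD's own mapping code.
    """
    literal_parts = template.split("*")
    pattern = "(.*?)".join(re.escape(part) for part in literal_parts)
    match = re.fullmatch(pattern, path, re.IGNORECASE)
    if match is None:
        return None
    captured = iter(match.groups())
    return re.sub(r"\*", lambda _: next(captured, ""), result)


if __name__ == "__main__":
    cgi_rule = ('/~*/cgi-bin/*', '/0""::/www_user/*/www/cgi-bin/*')
    print(apply_rule(*cgi_rule, '/~daniel/cgi-bin/test'))
    # -> /0""::/www_user/daniel/www/cgi-bin/test
```

Because the result always contains the "/www/cgi-bin/" subdirectory, a request can only reach scripts below that subdirectory of the user's area, which is the reason for never mapping the top-level directory.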

Explicit User Account

Using mapping rules it is possible to explicitly specify the user account within which a particular script or scripts will be executed. This may be useful if an application has quota or other resource requirements that should not be granted to the HTTP server account (i.e. it can provide a measure of isolation between the server and application accounts).

  exec /whatever-bin/* /0"WHATEVER"::/whatever_root/cgi-bin/*
  script /dowhatever/* /0"WHATEVER"::/whatever_root/cgi-bin/dowhatever/*

Proxy Access

For each user account permitted to execute local scripts, proxy access to that account must be granted to the HTTP server account.

Caution! If unsure of the security implications of this action consult the relevant VMS system management security documentation.

  $ SET DEFAULT SYS$SYSTEM
  $ MCR AUTHORIZE
  UAF> ADD /PROXY <node>::HTTP$SERVER <account>

For example, the following would allow the HTTP server to execute scripts on behalf of the username DANIEL.

  UAF> ADD /PROXY 0::HTTP$SERVER DANIEL

12.9 - Java Scripts

Java classes may be used to perform CGI scripting with WASD. They may be designed as standard CGI scripts (with the inevitable latency of the class loading) or as CGIplus scripts (with the attendant benefit of lower latency).

Note that Java scripts must always be mapped and executed using the CGIplus path, however some can behave as standard CGI scripts, exiting after responding to the request, while others can persist, responding to multiple requests (see 12.7 - CGIplus Scripting). The CGIplus path is always necessary as Java does not have direct access to a process' general environment, the traditional way of passing CGI variables, so the WASD implementation uses the CGIplus data stream to provide CGI information.

WASD provides a class to allow a relatively simple interface to the CGI environment for both GET and POST method scripts. This and a collection of demonstration scripts may be found in the HT_ROOT:[SRC.JAVA] directory.

The class and demonstration scripts were developed using the first-release JDK1.1 beta kit for OpenVMS Alpha V7.1.

Requirements

Ensure the Java class file type is mapped to the Java run-time in the HTTPD$CONFIG configuration file.
```
  [DclScriptRunTime]
  .CLASS  @CGI-BIN:[000000]JAVA.COM
```

The following content types are configured, also in HTTPD$CONFIG.

  [AddType]
  .CLASS  application/octet-stream  -  Java class
  .JAVA  text/plain  -  Java source
  .JAR  application/octet-stream  -  Java archive
  .PROPERTIES  text/plain  -  Java properties

The CGI-BIN logical includes the HT_ROOT:[JAVA] class directory in the server startup.

  $ JAVA_ROOT = F$TRNLNM("HT_ROOT") - ".]" + ".JAVA.]"
  $ DEFINE /SYSTEM /TRANSLATION=(CONCEALED) -
           CGI-BIN 'EXE_ROOT','SCRIPT_LOCAL_ROOT', -
           'SCRIPT_ROOT','JAVA_ROOT'

12.10 - HTTP Persistent-State Cookies

The WASD server is cookie-aware. That is, if the client supplies a "Cookie:" request header line it is passed to a CGI script as the "WWW_HTTP_COOKIE" CGI variable symbol. If a cookie is not part of the request this symbol does not exist. A script may use the "Set-Cookie:" response header line to set cookies.
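
The shape of the data involved can be illustrated with a short Python sketch using the standard library's http.cookies module. It assumes the raw "Cookie:" value has already been obtained as a string (how a given script language reads WWW_HTTP_COOKIE is outside its scope), and both function names are hypothetical.

```python
from http.cookies import SimpleCookie


def parse_cookie_value(value):
    """Turn a raw "Cookie:" header value into a name -> value dictionary."""
    jar = SimpleCookie()
    jar.load(value)
    return {name: morsel.value for name, morsel in jar.items()}


def set_cookie_line(name, value, path="/"):
    """Build a "Set-Cookie:" response header line for a single cookie."""
    jar = SimpleCookie()
    jar[name] = value
    jar[name]["path"] = path
    return jar.output(header="Set-Cookie:")


if __name__ == "__main__":
    # Stand-in for the value the server would supply in WWW_HTTP_COOKIE.
    raw = "session=abc123; theme=dark"
    print(parse_cookie_value(raw))
    print(set_cookie_line("visits", "3"))
```

A script would emit the "Set-Cookie:" line as part of its response header, before the blank line that ends the header.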

Here is a small demonstration of cookie processing using a DCL script.
