WASD Hypertext Services - Scripting Environment

1 - Introduction

1.1 - Scripting Processes
    1.1.1 - Detached Process Scripting
        1.1.1.1 - Non-Server Account Scripting
        1.1.1.2 - Restricting Non-Server Scripting
        1.1.1.3 - Process Priorities
    1.1.2 - Subprocess Scripting
1.2 - Script Mapping
1.3 - Script Run-Time
1.4 - Scripting Logicals
1.5 - Scripting Scratch Space
1.6 - DCL Processing of Requests
1.7 - Scripting Function Library
1.8 - HTTP Persistent-State Cookies

This document is not a general tutorial on authoring scripts, CGI or any other.  A large number of references in the popular computing press cover all aspects of this technology, usually quite comprehensively.  The information here is about the specifics of scripting in the WASD environment, which for CGI and ISAPI is generally very much like any other implementation.  (Although there are always annoying idiosyncrasies; see 1.7 - Scripting Function Library for a partial solution to smoothing out some of these wrinkles.)

Scripts are mechanisms for creating simple HTTP services, sending data to (and sometimes receiving data from) a client, extending the capabilities of the basic HTTPd.  Anything that can write to SYS$OUTPUT can be used to generate script output.  A DCL procedure or an executable can be the basis for a script.  Simply TYPE-ing a file can provide script output.  Scripts execute in processes separate from the actual HTTP server but under its control and interacting with it.

WASD manages a script's process environment either as a dependent subprocess or independent detached process created by the HTTP server, or as a network process created using DECnet.  By default it supports subprocess CGI scripts without further configuration. 

WASD scripting can be deployed in a number of environments.  Other chapters cover the specifics of these.


1.1 - Scripting Processes

Process creation under the VMS operating system is notoriously slow and expensive.  This is an inescapable overhead when scripting via child processes.  An obvious strategy is to avoid, at least as much as possible, the creation of these processes.  The only way to do this is to share processes between multiple scripts/requests, addressing the attendant complications of isolating potential interactions between requests.  These could occur through changes made by any script to the process' environment.  For VMS this involves symbol and logical name creation, and files opened at the DCL level.  In reality few scripts need to make logical name changes and symbols are easily removed between uses.  DCL-opened files are a little more problematic, but again, in reality most scripts doing file manipulation will be images.

A reasonable assumption is that for almost all environments scripts can quite safely share processes with great benefit to response latency and system impact (see "Technical Overview, Performance" for a table with some comparative performances).  If the local environment requires absolute script isolation for some reason then this process-persistence may easily be disabled with a consequent trade-off on performance.


Zombies

The term zombie is used to describe processes persisting between uses (the reason should be obvious: they are neither "alive" (processing a request) nor "dead" (deleted) :^).  Zombie processes have a finite time to exist (non-life-time?) before they are automatically purged from the system (see "Technical Overview, Configuration").  This keeps process clutter on the system to a minimum.


1.1.1 - Detached Process Scripting

With WASD it is possible to execute scripts in processes created completely independently of the server process itself.  This offers a significant number of advantages over subprocesses without too many disadvantages.

Creation of a detached process is slightly more expensive in terms of system resources and initial invocation response latency (particularly if extensive login procedures are required), but this quickly becomes negligible as most script processes are used multiple times for successive scripts and/or requests.


Enabling Detached Processes

By default the server uses subprocesses for scripting (also the historical method by which WASD executes scripts).  The HTTPD$CONFIG directive [DclDetachProcess] when enabled has the server create (almost) completely independent detached processes to execute scripts. 

  [DclDetachProcess]  enabled

When using detached processes, during shutdown the server must explicitly ensure that each scripting process is removed from the system (with subprocesses the VMS executive provides this automatically).  This is performed by the server exit handler.  With VMS it is possible to bypass the exit handler (using a $DELPRC or its equivalent $STOP/ID= for instance), making it possible for "orphaned" scripting processes to remain - and potentially accumulate on the system!

To address this possibility, during startup the server scans the system for candidate processes.  These are identified by having a terminal mailbox (SYS$COMMAND device) that carries an ACL with two entries: the first identifying it as a WASD HTTPd mailbox and the second allowing access to the account the script is being executed under.  Such a device ACL looks like the following example.

  Device MBA335:, device type local memory mailbox, is online, record-oriented
    device, shareable, mailbox device.
 
    Error count                    0    Operations completed                  0
    Owner process                 ""    Owner UIC             [WEB,HTTP$SERVER]
    Owner process ID        00000000    Dev Prot              S:RWPL,O:RWPL,G,W
    Reference count                1    Default buffer size                2048
    Device access control list:
      (IDENTIFIER=WASD_HTTPD_80,ACCESS=NONE)
      (IDENTIFIER=[WEB,HTTP$SERVER],ACCESS=READ+WRITE+PHYSICAL+LOGICAL)

This rights identifier is generated from the server process name and is therefore system-unique (so multiple autonomous servers will not accidentally clean up the script processes of others), and is created during server startup if it does not already exist.  For example, if the process name was "HTTPd:80" (the default for a standard service) the rights identifier name would be "WASD_HTTPD_80" (as shown in the example above).
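
The derivation of the identifier name can be pictured with a short C sketch.  The exact rule the server applies is not given here, so the function below is an assumption that simply reproduces the one documented example: prefix "WASD_", upper-case the letters and replace any other non-alphanumeric character with an underscore.  The function name is hypothetical and is not part of WASD.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not WASD source.  Builds an identifier name from a
   server process name: "WASD_" prefix, letters upper-cased, every other
   non-alphanumeric character replaced by '_'.  Returns 0 on success or
   -1 if the buffer is too small. */
int rights_identifier_name(const char *procName, char *buf, size_t bufSize)
{
    static const char prefix[] = "WASD_";
    const size_t prefixLen = sizeof prefix - 1;
    size_t i, nameLen;

    assert(procName != NULL && buf != NULL);
    nameLen = strlen(procName);
    if (prefixLen + nameLen + 1 > bufSize) return -1;

    memcpy(buf, prefix, prefixLen);
    for (i = 0; i < nameLen; i++) {
        unsigned char c = (unsigned char)procName[i];
        buf[prefixLen + i] = isalnum(c) ? (char)toupper(c) : '_';
    }
    buf[prefixLen + nameLen] = '\0';
    return 0;
}
```

Called with the process name "HTTPd:80" this produces the identifier name shown in the device ACL above.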


1.1.1.1 - Non-Server Account Scripting

Generally when a script executes it is within a process owned by the server account.  There are often advantages to running a script under another account.  The most obvious of these is the security isolation it offers with respect to the rest of the Web and server environment.  It also means that the server account does not need to be resourced especially for any particularly demanding application. 

Non-server account scripting requires detached processes be enabled, and the $PERSONA system services available with VMS V6.2 and later.  Non-server account scripting is not available under earlier versions of VMS. 


Enabling Non-Server Scripting

The $PERSONA functionality must be explicitly enabled at server startup using the /PERSONA qualifier (see "Technical Overview, Server Account and Environment").  The ability of the server to execute scripts under any user account is a very powerful (and potentially dangerous) capability, and so it is designed so that the site administrator must explicitly and deliberately enable the functionality.  Configuration files need to be rigorously protected against unauthorized modification.

A specific script or directory of scripts can be designated for execution under a specified account using the HTTPD$MAP configuration file SET script=as= mapping rule.  The following example illustrates the essentials. 

  # one script to be executed under the account
  SET  /cgi-bin/a_big_script*  script=as=BIG_ACCOUNT
  # all scripts in this area to be executed under this account
  SET  /database-bin/*  script=as=DBACCNT


User Account Scripting

In some situations it may be desirable to allow the average Web user to experiment with or implement scripts.  If the "script=as=" mapping rule specifies a tilde character then for a user request the mapped SYSUAF username is substituted.

The following example shows the essentials of setting up a user environment with Web access to a subdirectory of the user's home directory, [.WWW], and with scripts located in a subdirectory of that, [.WWW.CGI-BIN].

  SET   /~*/www/cgi-bin/*  script=as=~
  UXEC  /~*/cgi-bin/*  /*/www/cgi-bin/*
  USER  /~*/*  /*/www/*
  REDIRECT  /~*  /~*/
  PASS  /~*/*  /dka0/users/*/*

To enable user CGIplus scripting include something like

  UXEC+  /~*/cgiplus-bin/*  /*/www/cgi-bin/*


Authenticated User Scripting

If the "script=as=" mapping rule specifies a dollar sign then a request that has been SYSUAF authenticated has the SYSUAF username substituted.

  SET   /cgi-bin/cgi_process  script=as=$
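
The three forms of the "script=as=" value described above (an explicit account name, "~" and "$") can be summarised in a small C sketch.  This is only an illustration of the substitution, not the server's code.  The function and parameter names are hypothetical, and returning NULL when no username is available is an assumption, since the server's behaviour in that case is not described here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not WASD source.  Chooses the account a script
   would run under for a given "script=as=" value.
     "~"  -> username taken from the user (/~name/) request path
     "$"  -> username the request was SYSUAF-authenticated as
     else -> the value itself is the account name
   Returns NULL when the required username is not available (assumed). */
const char *resolve_script_as(const char *ruleValue,
                              const char *pathUser,
                              const char *authUser)
{
    assert(ruleValue != NULL);
    if (strcmp(ruleValue, "~") == 0)
        return (pathUser != NULL && *pathUser != '\0') ? pathUser : NULL;
    if (strcmp(ruleValue, "$") == 0)
        return (authUser != NULL && *authUser != '\0') ? authUser : NULL;
    return ruleValue;
}
```

For instance, resolve_script_as("DBACCNT", NULL, NULL) yields "DBACCNT", while resolve_script_as("$", NULL, "JONES") yields the (made-up) authenticated username "JONES".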


1.1.1.2 - Restricting Non-Server Scripting

By default, activating the /PERSONA server startup qualifier allows all the modes described above to be deployed using appropriate mapping rules.  Of course there may be circumstances where such broad capabilities are inappropriate or otherwise undesirable.  It is possible to control which user accounts are able to be used in this fashion with a rights identifier.  Only those accounts granted the identifier can have scripts activated under them.  This means all accounts ... including the server account! 

This is enabled by specifying the name of a rights identifier as a parameter to the /PERSONA qualifier.  This may be any identifier but the one shown in the following example is probably as good as any. 

  $ HTTPD /PERSONA=WASD_SCRIPTING

This identifier could be created using the following commands

  $ SET DEFAULT SYS$SYSTEM
  $ MCR AUTHORIZE
  UAF> ADD /IDENTIFIER WASD_SCRIPTING

and granted to accounts using

  UAF> GRANT /IDENTIFIER WASD_SCRIPTING HTTP$SERVER


1.1.1.3 - Process Priorities

When detached processes are created they can be assigned differing priorities depending on the origin and purpose.  The objective is to give the server process a slight advantage when competing with scripts for system resources.  This allows the server to respond to new requests more quickly (reducing latency) even if a script may then take some time to complete the request. 

The allocation of base process priorities is determined by the HTTPD$CONFIG [DclDetachProcessPriority] configuration directive, which takes one or two (comma-separated) integers that determine how many priorities lower than the server the scripting processes are created.  The first integer applies to server scripts.  A second, if supplied, applies to user scripts.  User scripts may never be a higher priority than server scripts.  The following provides example directives.

  [DclDetachProcessPriority]  1
  [DclDetachProcessPriority]  0,1
  [DclDetachProcessPriority]  1,2

Scripts executed under the server account, or those created using a mapped username (i.e. "script=as=username"), have a process priority set by the first/only integer. 

Scripts activated from user mappings (i.e. "script=as=~" or "script=as=$") have a process priority set by any second integer, or fall back to the priority of the first/only integer. 
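
The one-or-two integer form of the directive, with its fall-back rule, can be sketched in C as follows.  This is a hypothetical parser written for illustration only.  In particular, clamping the user value so that it is never smaller than the server value is an assumption about how the "never a higher priority" rule might be enforced.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not WASD source.  Parses "N" or "N,M" where each
   integer is the number of priorities BELOW the server's own priority.
   serverLower applies to server-account and script=as=username scripts,
   userLower to script=as=~ and script=as=$ scripts.
   Returns 0 on success, -1 if the value cannot be parsed. */
int parse_detach_priority(const char *value, int *serverLower, int *userLower)
{
    int first = 0, second = 0, count;

    assert(value != NULL && serverLower != NULL && userLower != NULL);
    count = sscanf(value, " %d , %d", &first, &second);
    if (count < 1 || first < 0) return -1;

    if (count < 2) second = first;      /* fall back to the first integer */
    if (second < first) second = first; /* assumed: user never above server */

    *serverLower = first;
    *userLower = second;
    return 0;
}
```

With the example directives above, "1" gives 1 and 1, "0,1" gives 0 and 1, and "1,2" gives 1 and 2.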


1.1.2 - Subprocess Scripting

WASD's default (and historical) scripting environment is with subprocesses created by the server. 

With persistent subprocess scripting the pooled-resource BYTLM can become a particular issue.  After the first subprocess-based script is executed the WATCH report provides some information on the BYTLM required to support both the desired number of incoming network connections and script subprocess IPC mailboxes.  When using these numbers to resource the BYTLM quota of the server account keep in mind that as well as server-subprocess IPC consumption of BYTLM there may be additional requirements for whatever processing is performed by the script.

For a standard configuration 15,000 bytes should be allowed for each possible script subprocess, 1,000 bytes for each potential client network connection, an additional 20,000 bytes overhead, plus any additional requirements for script processing, etc.  Hence for a maximum of 30 scripts and 100 network clients, these figures give a minimum BYTLM of approximately 570,000 (30 x 15,000 + 100 x 1,000 + 20,000).
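
The rule of thumb can be written as a one-line calculation.  The sketch below uses only the per-script, per-client and overhead allowances quoted in the paragraph above.  The function name and the extraBytes parameter (for a site's own script processing needs) are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper, not WASD source.  Estimates a minimum BYTLM from
   the allowances quoted in the text: 15,000 bytes per script subprocess,
   1,000 bytes per client connection and 20,000 bytes fixed overhead,
   plus whatever the site's scripts themselves require. */
long estimate_bytlm(long maxScripts, long maxClients, long extraBytes)
{
    assert(maxScripts >= 0 && maxClients >= 0 && extraBytes >= 0);
    return maxScripts * 15000L + maxClients * 1000L + 20000L + extraBytes;
}
```

The figure produced should be compared with what the WATCH report shows for the running server before settling on a quota.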


Subprocess Environment

When the subprocess is spawned by the server none of the parent's environment is propagated.  Hence the subprocess has no symbols, logical names, etc., created by the site's SYLOGIN.COM, the server account's LOGIN.COM, etc.  This is done quite deliberately to provide a pristine and standard default environment for the script's execution.  For this reason all scripts must provide all of their required environment to operate.  In particular, if a verb is made available via SYLOGIN.COM or LOGIN.COM it will not be available to the script.  Verbs available via the DCLTABLES.EXE or DCL$PATH of course will be available.

There are two basic methods for supplying a script with a required environment. 


Caution! 

When scripts are executed within unprivileged subprocesses created by the HTTP server, the processes are owned by the HTTP server account (HTTP$SERVER).  Script actions could potentially affect server behaviour.  For example it is possible for a script to issue an "HTTPD/DO=EXIT=NOW" command, or for subprocesses to create or modify logical name values in the JOB table (e.g. change the value of LNM$FILE_DEV altering the logical search path).  Obviously these types of actions are undesirable.  In addition scripts can access any WORLD-readable and modify any WORLD-writable resource in the system/cluster, opening a window for information leakage or mischievous/malicious actions (some might argue that anyone with important WORLD-accessible resources on their system deserves all that happens to them - but we know they're out there :^).  Script authors should be aware of any potential side-effects of their scripts and Web administrators vigilant against possible malicious behaviours of scripts they do not author.


1.2 - Script Mapping

Scripts are enabled using the exec/uxec or script rules in the mapping file (also see "Technical Overview, Mapping Rules").  The script portion of the result must be a URL equivalent of the physical VMS procedure or executable specification. 

All files in a directory may be mapped as scripts using the exec rule.  For instance, in the HTTPD$MAP configuration file can be found a rule

  exec /cgi-bin/* /cgi-bin/*

which results in request paths beginning "/cgi-bin/" having the following path component mapped as a script.  Hence a path "/cgi-bin/cgi_symbols.com" will result in the server attempting to execute a file named CGI-BIN:[000000]CGI_SYMBOLS.COM.
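
The relationship between the URL-style result of a mapping rule and the VMS file specification can be sketched in C.  This is an illustration under simple assumptions, not the server's own conversion: the first path component is taken as the device or logical name, any middle components as directories, and the last as the file name, with [000000] used when there are no directories.  Query strings and other special cases are ignored, and the function names are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Append one character, keeping the buffer null-terminated. */
static int put_char(char *buf, size_t bufSize, size_t *out, char c)
{
    if (*out + 1 >= bufSize) return -1;
    buf[(*out)++] = c;
    buf[*out] = '\0';
    return 0;
}

/* Hypothetical helper, not WASD source.
   "/device/dir1/dir2/file" -> "DEVICE:[DIR1.DIR2]FILE"
   "/device/file"           -> "DEVICE:[000000]FILE"
   Returns 0 on success, -1 on a malformed path or too-small buffer. */
int url_path_to_vms(const char *path, char *buf, size_t bufSize)
{
    const char *p, *firstEnd, *lastSlash;
    size_t out = 0;
    int haveDir = 0;

    assert(path != NULL && buf != NULL);
    if (path[0] != '/') return -1;
    firstEnd = strchr(path + 1, '/');
    if (firstEnd == NULL || firstEnd == path + 1) return -1;
    lastSlash = strrchr(path, '/');

    /* device or logical name */
    for (p = path + 1; p < firstEnd; p++)
        if (put_char(buf, bufSize, &out, (char)toupper((unsigned char)*p)))
            return -1;
    if (put_char(buf, bufSize, &out, ':') || put_char(buf, bufSize, &out, '['))
        return -1;

    /* directory components, '/' becoming '.' */
    for (p = firstEnd + 1; p < lastSlash; p++) {
        char c = (*p == '/') ? '.' : (char)toupper((unsigned char)*p);
        if (put_char(buf, bufSize, &out, c)) return -1;
        haveDir = 1;
    }
    if (!haveDir)
        for (p = "000000"; *p != '\0'; p++)
            if (put_char(buf, bufSize, &out, *p)) return -1;
    if (put_char(buf, bufSize, &out, ']')) return -1;

    /* file name and type */
    for (p = lastSlash + 1; *p != '\0'; p++)
        if (put_char(buf, bufSize, &out, (char)toupper((unsigned char)*p)))
            return -1;
    return 0;
}
```

Given "/cgi-bin/cgi_symbols.com" the sketch produces the file specification quoted above.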

Multiple such paths may be designated as executable, with their contents expected to be scripts, either directly executable by VMS (e.g. .EXEs and .COMs) or processable by a designated interpreter, etc., (e.g. .PLs, .CLASSes).  See 1.3 - Script Run-Time below. 

In addition individual files may be specified as scripts.  This is done using the script rule.  In the following example the request path "/help" activates the "Conan The Librarian" script. 

  script /help* /cgi-bin/conan*

Of course, multiple such rules may be used to map such abbreviated or self-explanatory script paths to the actual script providing the application. 


Mapping Local or Third-Party Scripts

It is not necessary to move/copy scripts into the server directory structure to make them accessible.  In fact there are probably good reasons for not doing so!  For instance, it keeps a package together so that at the next upgrade there is no possibility of the "server-instance" of that application being overlooked.

To make scripts provided by third party packages available for server activation two requirements must be met. 

Most packages having such an interface for Web server access would provide details on mapping into the package directory.  For illustration the following mapping rules provide access to a package's scripts (assuming it provides more than one) and also into a documentation area. 

The hypothetical "Application X" directory locations are

  APPLICATIONX_ROOT:[DOC]
  APPLICATIONX_ROOT:[CGI-BIN]

The required mapping rules would be

  pass /applicationX/* /applicationX_root/doc/*
  exec /appX-bin/* /applicationX_root/cgi-bin/*

Access to X's scripts would be using a path such as

  http://the.host.name/appX-bin/main_script?plus=some&query=string


"Wrapping" Local or Third-Party Scripts

Sometimes it may be necessary to provide a non-WASD, local, or third-party script with a particular environment in which to execute.  This can be provided by wrapping the script executable or interpreted script in a DCL procedure (of course, if the local or third-party script is already activated by a DCL procedure, then that may need to be directly modified).  Simply create a DCL procedure, in the same directory as the script executable, containing the required environmental commands.

For example, the following DCL procedure defines a scratch directory and provides the location of the configuration file.  It is assumed the script executable is APPLICATIONX_ROOT:[CGI-BIN]APPX.EXE and the script wrapper APPLICATIONX_ROOT:[CGI-BIN]APPX.COM. 

  $! wrapper for APPX CGI executable
  $ SET DEFAULT APPLICATIONX_ROOT:[000000]
  $ DEFINE /USER SYS$SCRATCH APPLICATIONX_ROOT:[SCRATCH]
  $ APPX == "$APPLICATIONX_ROOT:[CGI-BIN]APPX"
  $ APPX /CONFIG=APPLICATIONX_ROOT:[CONFIG]APPX.CONF


1.3 - Script Run-Time

A script is merely an executed or interpreted file.  Although by default VMS executables and DCL procedures can be used as scripts, other environments may also be configured.  For example, scripts written for the Perl language may be transparently given to the Perl interpreter in a script subprocess.  This type of script activation is based on a unique file type (extension following the file name), for the Perl example this is most commonly ".PL", or sometimes ".CGI".  Both of these may be configured to automatically invoke the site's Perl interpreter, or any other for that matter. 

This configuration is performed using the HTTPD$CONFIG [DclScriptRunTime] directive, where a file type is associated with a run-time interpreter.  This parameter takes two components, the file extension and the run-time verb.  The verb may be specified as a simple, globally-accessible verb (e.g. one embedded in the CLI tables), or in the format to construct a foreign-verb, providing reasonable versatility.  Run-time parameters may also be appended to the verb if desired.  The server ensures the verb is foreign-assigned if necessary, then uses it on a command line with the script file name as the final parameter.

The following is an example showing a Perl interpreter being specified.  The first line assumes the "Perl" verb is globally accessible on the system (e.g. perhaps provided by the DCL$PATH logical) while the second (for the sake of illustration) shows the same Perl interpreter being configured for a different file type using the foreign verb syntax.

  [DclScriptRunTime]
  .PL PERL
  .CGI $PERL_EXE:PERL

A file containing a Perl script may then be activated merely by specifying a path such as the following

  /cgi-bin/example.pl

To add any required parameters just append them to the verb specified. 

  [DclScriptRunTime]
  .XYZ XYZ_INTERPRETER -vms -verbose -etc
  .XYZ $XYZ_EXE:XYZ_INTERPRETER /vms /verbose /etc

If a more complex run-time interpreter is required it may be necessary to wrap the script's execution in a DCL procedure. 


Script File Extensions

The WASD server does not require a file type (extension) to be explicitly provided when activating a script.  This can help hide the implementation detail of any script.  If the script path does not contain a file type the server searches the script location for a file with one of the known file types, first ".COM" for a DCL procedure, then ".EXE" for an executable, then any file types specified using the script run-time configuration directive, in the order specified.

For instance, the script activated in the Perl example above could have been specified as below and (provided there was no "EXAMPLE.COM" or "EXAMPLE.EXE" in the search) the same script would have been executed. 

  /cgi-bin/example
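
The search order described above can be sketched in C.  This is an illustration only, not the server's implementation.  The function names are hypothetical, the file-existence check is passed in as a callback so the sketch does not depend on any particular file system interface, and demo_only_pl_exists() is merely a stand-in that pretends a single file exists.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef int (*file_exists_fn)(const char *fileSpec);

/* Hypothetical helper, not WASD source.  Tries ".COM", then ".EXE", then
   each configured run-time file type in the order given.  On success the
   matching file specification is left in buf.
   Returns 0 if found, 1 if nothing matched, -1 if buf is too small. */
int find_script_file(const char *scriptBase,
                     const char *const *runTimeTypes, size_t runTimeCount,
                     file_exists_fn exists,
                     char *buf, size_t bufSize)
{
    static const char *const builtin[] = { ".COM", ".EXE" };
    size_t i;

    assert(scriptBase != NULL && exists != NULL && buf != NULL);
    for (i = 0; i < 2 + runTimeCount; i++) {
        const char *type = (i < 2) ? builtin[i] : runTimeTypes[i - 2];
        int n = snprintf(buf, bufSize, "%s%s", scriptBase, type);
        if (n < 0 || (size_t)n >= bufSize) return -1;
        if (exists(buf)) return 0;
    }
    return 1;
}

/* Stand-in for a real existence check: only EXAMPLE.PL "exists". */
int demo_only_pl_exists(const char *fileSpec)
{
    return strcmp(fileSpec, "CGI-BIN:[000000]EXAMPLE.PL") == 0;
}
```

With the run-time types ".PL" and ".CGI" configured, searching for "CGI-BIN:[000000]EXAMPLE" checks EXAMPLE.COM and EXAMPLE.EXE first and then settles on EXAMPLE.PL.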


1.4 - Scripting Logicals

Two logicals provide some control of and input to the DCL subprocess scripting environment (which includes standard CGI, CGIplus and ISAPI, DECnet-based CGI, but excludes DECnet-based OSU). 

Note that most WASD scripts also contain logical names that can be set for debugging purposes.  These are generally in the format script_name$DBUG and, if they exist, activate debugging statements throughout the script.


1.5 - Scripting Scratch Space

Scripts often require temporary file space during execution.  Of course this can be located anywhere the scripting account (most often HTTP$SERVER) has appropriate access.  The WASD package does provide a default area for such purposes with permissions set during startup to allow the server account full access.  The default area is located in

  HT_ROOT:[SCRATCH]

and is accessed by the server and scripts using the logical name

  HT_SCRATCH:

The server provides for the routine clean-up of old files in HT_SCRATCH: left behind by aborted or misbehaving scripts (although as a matter of design all scripts should attempt to clean up after themselves).  The HTTPD$CONFIG directives

  [DclCleanupScratchMinutesMax]
  [DclCleanupScratchMinutesOld]

control how frequently the clean-up scan occurs, and how old files need to be before being deleted.  Whenever script processes are active the scratch area is scanned at the maximum period specified, or whenever the last script process is purged from the system by the server.

Of course there is always the potential for interaction between scripts using a common area for such purposes.  At the most elementary, care must be taken to ensure unique file names are generated.  At worst there is the potential for malicious interaction and information leakage.  Use such common areas with discretion.


Unique File Names - DCL

The "UNIQUE_ID" CGI variable provides a unique 19 character alpha-numeric string (see UNIQUE_ID Note) suitable for many uses including the type extension of temporary files.  The following DCL illustrates the essentials of generating a script-unique file name.  For multiple file names add further text to the type, as shown below.

  $ SCRATCH_DIR = "HT_SCRATCH:"
  $ PROC_NAME = F$PARSE(F$ENVIRONMENT("PROCEDURE"),,,"NAME")
  $ INFILE_NAME = SCRATCH_DIR + PROC_NAME + "." + WWW_UNIQUE_ID + "_IN"
  $ OUTFILE_NAME = SCRATCH_DIR + PROC_NAME + "." + WWW_UNIQUE_ID + "_OUT"


Unique File Names - C Language

A similar approach can be used for scripts coded using the C language, with the useful capacity to mark the file for delete-on-close (of course this is only really useful if it's, say, only to be written, rewound and then re-read without closing first - but I'm sure you get the idea).

  #include <errno.h>
  #include <stdio.h>
  #include <stdlib.h>
 
  #define HT_SCRATCH "HT_SCRATCH:"
  #define SCRIPT_NAME "EXAMPLE"
 
  char  *uniqueId;
  char  tmpFileName [256];
  FILE  *tmpFile;
 
  /* UNIQUE_ID is supplied to the script as WWW_UNIQUE_ID */
  if ((uniqueId = getenv("WWW_UNIQUE_ID")) == NULL)
  {
     printf ("Error: WWW_UNIQUE_ID absent!\n");
     exit (1);
  }
  /* scratch area, script name, then the unique ID as the file type */
  sprintf (tmpFileName, HT_SCRATCH SCRIPT_NAME ".%s", uniqueId);
 
  /* "fop=dlt" (a VMS C RTL option) marks the file delete-on-close */
  if ((tmpFile = fopen (tmpFileName, "w+", "fop=dlt")) == NULL)
     exit (vaxc$errno);


1.6 - DCL Processing of Requests

To assist with the processing of request content and response generation from within DCL procedures the CGIUTL utility is available in

  HT_ROOT:[SRC.MISC]

Functionality includes

Most usefully it can read the request body, decoding form-URL-encoded contents into DCL symbols and/or a scratch file, allowing a DCL procedure to easily and effectively process this form of request. 
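
For readers unfamiliar with the encoding involved, the following C sketch shows what decoding a form-URL-encoded field means: a "+" becomes a space and a "%XX" sequence becomes the byte with that hexadecimal value.  It is not taken from CGIUTL, whose source should be consulted for its actual behaviour.  The function names are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Value of one hexadecimal digit, or -1 if the character is not one. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Hypothetical helper, not CGIUTL source.  Decodes a form-URL-encoded
   string in place: '+' becomes a space and "%XX" becomes the byte 0xXX.
   A '%' not followed by two hex digits is copied unchanged.
   Returns the decoded length. */
size_t form_url_decode(char *s)
{
    char *in = s, *out = s;

    assert(s != NULL);
    while (*in != '\0') {
        if (*in == '+') {
            *out++ = ' ';
            in++;
        } else if (*in == '%' &&
                   hex_value((unsigned char)in[1]) >= 0 &&
                   hex_value((unsigned char)in[2]) >= 0) {
            *out++ = (char)(hex_value((unsigned char)in[1]) * 16 +
                            hex_value((unsigned char)in[2]));
            in += 3;
        } else {
            *out++ = *in++;
        }
    }
    *out = '\0';
    return (size_t)(out - s);
}
```

A real form body is first split into fields at "&" and into name and value at "=", with each part decoded separately.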


1.7 - Scripting Function Library

A source code collection of C language functions useful for processing the more vexing aspects of CGI and general script programming is available in CGILIB.  This and an example implementation are available in

  HT_ROOT:[SRC.MISC]

Functionality includes

The WASD scripts use this library extensively and may serve as example applications. 


1.8 - HTTP Persistent-State Cookies

The WASD server is cookie-aware.  That is, if the client supplies a "Cookie:" request header line it is passed to a CGI script as the "WWW_HTTP_COOKIE" CGI variable symbol.  If a cookie is not part of the request this symbol does not exist.  A script may use the "Set-Cookie:" response header line to set cookies.
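
For a script written in C, picking one cookie out of the header value might look like the following sketch.  It assumes the usual "name=value; name=value" layout of a "Cookie:" header.  The function name is hypothetical, and the cookie name "session" used in the note after the code is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not WASD or CGILIB source.  Searches a "Cookie:"
   header value of the form "name=value; name=value" for the named cookie
   and copies its value into buf.
   Returns 0 if found, 1 if not present, -1 if header is NULL or the
   buffer is too small. */
int cookie_value(const char *header, const char *name,
                 char *buf, size_t bufSize)
{
    size_t nameLen;
    const char *p = header;

    assert(name != NULL && buf != NULL);
    if (header == NULL || bufSize == 0) return -1;
    nameLen = strlen(name);

    while (*p != '\0') {
        const char *end;

        while (*p == ' ' || *p == ';') p++;     /* skip separators */
        end = strchr(p, ';');
        if (end == NULL) end = p + strlen(p);

        if ((size_t)(end - p) > nameLen &&
            strncmp(p, name, nameLen) == 0 && p[nameLen] == '=') {
            size_t valueLen = (size_t)(end - p) - nameLen - 1;
            if (valueLen >= bufSize) return -1;
            memcpy(buf, p + nameLen + 1, valueLen);
            buf[valueLen] = '\0';
            return 0;
        }
        p = end;
    }
    return 1;
}
```

A script could call cookie_value(getenv("WWW_HTTP_COOKIE"), "session", value, sizeof value), treating the -1 returned for a NULL header as "no cookie supplied with the request".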

Here is a small demonstration of cookie processing using a DCL script.

