The OpenVMS Frequently Asked Questions (FAQ)




15.6.1 OpenVMS Cluster Communications Protocol Details?



The following sections contain information on the OpenVMS System Communications Services (SCS) protocol. Cluster terminology is available in Section 15.6.1.2.1.

15.6.1.1 OpenVMS Cluster (SCS) over DECnet? Over IP?



The OpenVMS Cluster environment operates over various network protocols, but the core of clustering uses the System Communications Services (SCS) protocols and SCS-specific network datagrams. Direct (full) connectivity is assumed.

An OpenVMS Cluster does not operate over DECnet, nor over IP.

No SCS protocol routers are available.

Many folks have suggested operating SCS over DECnet or IP over the years, but SCS is too far down in the layers, and any such project would entail a major or complete rewrite of SCS and of the DECnet or IP drivers. Further, the current DECnet and IP implementations have large tracts of code that operate at the application level, while SCS must operate in the rather more primitive contexts of the system and particularly the bootstrap---to get SCS to operate over a DECnet or IP connection would require relocating major portions of the DECnet or IP stack into the kernel. (And it is not clear that the result would even meet the bandwidth and latency expectations.)

The usual approach for multi-site OpenVMS Cluster configurations involves FDDI, Memory Channel (MC2), or a point-to-point remote bridge, brouter, or switch. The connection must be transparent, and it must operate at 10 megabits per second or better (Ethernet speed), with latency characteristics similar to those of Ethernet or better. Various sites use FDDI, MC2, ATM, or point-to-point T3 links.

15.6.1.2 Configuring Cluster SCS for path load balancing?



This section discusses OpenVMS Cluster communications, cluster terminology, related utilities, and command and control interfaces.

15.6.1.2.1 Cluster Terminology?



SCS: System Communications Services. The protocol used to communicate between VMScluster systems and between OpenVMS systems and SCS-based storage controllers. (SCSI-based storage controllers do not use SCS.)

PORT: A communications device, such as DSSI, CI, Ethernet or FDDI. Each CI or DSSI bus is a different local port, named PAA0, PAB0, PAC0 etc. All Ethernet and FDDI buses make up a single PEA0 port.

VIRTUAL CIRCUIT: A reliable communications path established between a pair of ports. Each port in a VMScluster establishes a virtual circuit with every other port in that cluster.

All systems and storage controllers establish "Virtual Circuits" to enable communications between all available pairs of ports.

SYSAP: A "system application" that communicates using SCS. Each SYSAP communicates with a particular remote SYSAP. Example SYSAPs include:

VMS$DISK_CL_DRIVER connects to MSCP$DISK
The disk class driver is on every VMScluster system. MSCP$DISK is on all disk controllers and all VMScluster systems that have SYSGEN parameter MSCP_LOAD set to 1.

VMS$TAPE_CL_DRIVER connects to MSCP$TAPE
The tape class driver is on every VMScluster system. MSCP$TAPE is on all tape controllers and all VMScluster systems that have SYSGEN parameter TMSCP_LOAD set to 1.

VMS$VAXCLUSTER connects to VMS$VAXCLUSTER
This SYSAP contains the connection manager, which manages cluster connectivity, runs the cluster state transition algorithm, and implements the cluster quorum algorithm. This SYSAP also handles lock traffic, and various other cluster communications functions.

SCS$DIR_LOOKUP connects to SCS$DIRECTORY
This SYSAP is used to find SYSAPs on remote systems.

MSCP and TMSCP
The Mass Storage Control Protocol and the Tape MSCP servers are SYSAPs that provide access to disk and tape storage, typically operating over SCS protocols. MSCP and TMSCP SYSAPs exist within OpenVMS (for OpenVMS hosts serving disks and tapes), within CI- and DSSI-based storage controllers, and within host-based MSCP or TMSCP storage controllers. MSCP and TMSCP can be used to serve MSCP and TMSCP storage devices, and can also be used to serve SCSI and other non-MSCP/non-TMSCP storage devices.

SCS CONNECTION: A SYSAP on one node establishes an SCS connection to its counterpart on another node. This connection will be on ONE AND ONLY ONE of the available virtual circuits.

15.6.1.2.2 Cluster Communications Control?



When there are multiple virtual circuits between two OpenVMS systems, it is possible for the VMS$VAXCLUSTER to VMS$VAXCLUSTER connection to use any one of these circuits. All lock traffic between the two systems will then travel on the selected virtual circuit.

Each port has a "LOAD CLASS" associated with it. This load class helps to determine which virtual circuit a connection will use. If one port has a higher load class than all others then this port will be used. If two or more ports have equally high load classes then the connection will use the first of these that it finds. Prior to enhancements found in V7.3-1 and later, the load class is static and normally all CI and DSSI ports have a load class of 14(hex), while the Ethernet and FDDI ports will have a load class of A(hex). With V7.3-1 and later, the load class values are dynamic.

For instance, if you have multiple DSSI buses and an FDDI, the VMS$VAXCLUSTER connection will choose the DSSI bus that has the system disk, as this will always be the first DSSI bus discovered when the OpenVMS system boots.

To force all lock traffic off the DSSI and onto the FDDI, for instance, an adjustment to the load class value is required, or the DSSI SCS port must be disabled.
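The static load class rule described above can be sketched in a few lines of Python. This is only an illustration of the selection rule as stated in the text (highest load class wins, ties go to the first port found); it is not OpenVMS code, and the `Port` type, the `select_port` function, and the port ordering are assumptions made for the example.

```python
from typing import NamedTuple, Sequence


class Port(NamedTuple):
    """Hypothetical stand-in for an SCS port and its load class."""
    name: str
    load_class: int


def select_port(ports: Sequence[Port]) -> Port:
    """Pick the port with the highest load class.

    Ports are assumed to be listed in discovery order, so on a tie
    the first port found is kept, as the text describes.
    """
    if not ports:
        raise ValueError("no ports available")
    best = ports[0]
    for port in ports[1:]:
        # Strictly greater: an equal load class never displaces
        # the port that was discovered earlier.
        if port.load_class > best.load_class:
            best = port
    return best


# Pre-V7.3-1 static values from the text: CI/DSSI = 14(hex), LAN = A(hex).
static_ports = [Port("PAA0", 0x14), Port("PAB0", 0x14), Port("PEA0", 0x0A)]
chosen = select_port(static_ports)

# Raising the LAN port's load class above the DSSI value moves the choice.
# The value 20(hex) is an arbitrary illustration, not a recommended setting.
adjusted_ports = [Port("PAA0", 0x14), Port("PAB0", 0x14), Port("PEA0", 0x20)]
chosen_after_adjustment = select_port(adjusted_ports)
```

With the static values, the first DSSI port (PAA0 here) is selected; once the PEA0 load class is raised above the others, PEA0 is selected instead, which is the effect of the load class adjustment mentioned above.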

In addition to the load class mechanisms, you can also use the "preferred path" mechanisms of MSCP and TMSCP services. This allows you to control the SCS connections used for serving remote disk and tape storage. The preferred path mechanism is most commonly used to explicitly spread cluster I/O activity over hosts and/or storage controllers serving disk or tape storage in parallel. This can be particularly useful if your hosts or storage controllers individually lack the necessary I/O bandwidth for the current I/O load, and must thus aggregate bandwidth to serve the cluster I/O load.

For related tools, see various utilities including LAVC$STOP_BUS and LAVC$START_BUS, and see DCL commands including SET PREFERRED_PATH.

15.6.1.2.3 Cluster Communications Control Tools and Utilities?



In most OpenVMS versions, you can use the tools LAVC$STOP_BUS and LAVC$START_BUS.

These tools permit you to disable or enable all SCS traffic on the specified paths.

You can also use a preferred path mechanism that tells the local MSCP disk class driver (DUDRIVER) which path to a disk should be used. Generally, this is used with dual-pathed disks, forcing I/O traffic through one of the controllers instead of the other. This can be used to implement a crude form of I/O load balancing at the disk I/O level.

Prior to V7.2, the preferred path feature uses the tool:



In OpenVMS V7.2 and later, you can use the following DCL command:

$ SET PREFERRED_PATH




The preferred path mechanism neither disables nor affects SCS operations on the non-preferred path.

With OpenVMS V7.3 and later, please see the SCACP utility for control over cluster communications, SCS virtual circuit control, port selection, and related functions.

15.6.2 Cluster System Parameter Settings?



The following sections contain details of configuring cluster-related system parameters.

15.6.2.1 What is the correct value for EXPECTED_VOTES in a VMScluster?



The VMScluster connection manager uses the concept of votes and quorum to prevent disk and memory data corruptions---when sufficient votes are present for quorum, then access to resources is permitted. When sufficient votes are not present, user activity will be blocked. The act of blocking user activity is called a "quorum hang", and is better thought of as a "user data integrity interlock". This mechanism is designed to prevent a partitioned VMScluster, and the resultant massive disk data corruptions. The quorum mechanism is expressly intended to prevent your data from becoming severely corrupted.

On each OpenVMS node in a VMScluster, one sets two values in SYSGEN: VOTES and EXPECTED_VOTES. The former is how many votes the node contributes to the VMScluster. The latter is the total number of votes expected when the full VMScluster is bootstrapped.

Some sites erroneously attempt to set EXPECTED_VOTES too low, believing that this will allow operation when only a subset of voting nodes are present in a VMScluster. It does not. Further, an erroneous setting in EXPECTED_VOTES is automatically corrected once VMScluster connections to other nodes are established; user data is at risk of severe corruptions during the earliest and most vulnerable portion of the system bootstrap, before the connections have been established.

One can operate a VMScluster with one, two, or many voting nodes. With any but the two-node configuration, keeping a subset of the nodes active when some nodes fail can be easily configured. With the two-node configuration, one must use a primary-secondary configuration (where the primary has all the votes), a peer configuration (where, when either node is down, the other hangs), or (preferably) a shared quorum disk.

Use of a quorum disk does slow down VMScluster transitions somewhat---the addition of a third voting node that contributes the vote(s) that would be assigned to the quorum disk makes for faster transitions---but the use of a quorum disk does mean that either node in a two-node VMScluster configuration can operate when the other node is down.



Note

The quorum disk must not be a host-based shadowed disk, though it can be protected with controller-based RAID. Because host-based volume shadowing depends on the lock manager and the lock manager depends on the connection manager and the connection manager depends on quorum, it is not technically feasible (nor even particularly reliable) to permit host-based volume shadowing to protect the quorum disk.


If you choose to use a quorum disk, a QUORUM.DAT file will be automatically created when OpenVMS first boots with a quorum disk specified; more precisely, the QUORUM.DAT file will be created when OpenVMS is booted without also needing the votes from the quorum disk.

In a two-node VMScluster with a shared storage interconnect, typically each node has one vote, and the quorum disk also has one vote. EXPECTED_VOTES is set to three.
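The arithmetic behind these settings can be sketched as follows. The formula used here, quorum = (EXPECTED_VOTES + 2) / 2 with the result rounded down, comes from the OpenVMS Cluster documentation rather than from the text above, and the function names are made up for this illustration; treat it as a sketch of the calculation, not as the connection manager's implementation.

```python
def quorum(expected_votes: int) -> int:
    """Quorum derived from EXPECTED_VOTES, rounded down.

    Assumes the documented formula (EXPECTED_VOTES + 2) / 2.
    """
    return (expected_votes + 2) // 2


def has_quorum(votes_present: int, expected_votes: int) -> bool:
    """True when the votes currently present are enough to proceed."""
    return votes_present >= quorum(expected_votes)


# Two nodes with one vote each, plus a one-vote quorum disk.
EXPECTED_VOTES = 3
survivor_with_quorum_disk = has_quorum(1 + 1, EXPECTED_VOTES)

# Two-node peer configuration without a quorum disk: one node alone.
survivor_without_quorum_disk = has_quorum(1, 2)
```

Under these assumptions, EXPECTED_VOTES of three gives a quorum of two, so one node plus the quorum disk can continue when the other node is down. In the two-node peer configuration without a quorum disk, EXPECTED_VOTES of two also gives a quorum of two, so the surviving node alone does not have quorum and hangs.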

Using a quorum disk on a non-shared interconnect is unnecessary---the use of a quorum disk does not provide any value, and the votes assigned to the quorum disk should be assigned to the OpenVMS host serving access to the disk.

For information on quorum hangs, see the OpenVMS documentation. For information on changing the EXPECTED_VOTES value on a running system, see the SET CLUSTER/EXPECTED_VOTES command, and see the documentation for the AMDS and Availability Manager tools. Also of potential interest is the OpenVMS system console documentation for the processor-specific console commands used to trigger the IPC (Interrupt Priority Level %x0C; IPL C) handler. (IPC is not available on OpenVMS I64 V8.2.) AMDS, Availability Manager, and the IPC handler can each be used to clear a quorum hang. Use of AMDS and Availability Manager is generally recommended over IPC, particularly because IPC can cause CLUEXIT bugchecks if the system should remain halted beyond the cluster sanity timer limits, and because some Alpha consoles and most (all?) Integrity consoles do not permit a restart after a halt.

The quorum scheme is a set of "blade guards" deliberately implemented by OpenVMS Engineering to provide data integrity---remove these blade guards at your peril. OpenVMS Engineering did not implement the quorum mechanism to make a system manager's life more difficult---the quorum mechanism was specifically implemented to keep your data from getting scrambled.

15.6.2.1.1 Why no shadowing for a Quorum Disk?



Stated simply, Host-Based Volume Shadowing uses the Distributed Lock Manager (DLM) to coordinate changes to membership of a shadowset (e.g. removing a member). The DLM depends in turn on the Connection Manager enforcing the Quorum Scheme and deciding which node(s) (and quorum disk) are participating in the cluster, and telling the DLM when it needs to do things like a lock database rebuild operation. So you can't introduce a dependency of the Connection Manager on Shadowing to try to pick proper shadowset member(s) to use as the Quorum Disk when Shadowing itself is using the DLM and thus indirectly depending on the Connection Manager to keep the cluster membership straight---it's a circular dependency.

So in practice, folks simply depend on controller-based mirroring (or controller-based RAID) to protect the Quorum Disk against disk failures (and dual-redundant controllers to protect against most cases of controller and interconnect failures). Since this disk unit appears to be a single disk up at the VMS level, there's no chance of ambiguity.

15.6.2.2 Explain disk (or tape) allocation class settings?



The allocation class mechanism provides the system manager with a way to configure and resolve served and direct paths to storage devices within a cluster. Any served device that provides multiple paths should be configured using a non-zero allocation class, either at the MSCP (or TMSCP) storage controllers, at the port (for port allocation classes), or at the OpenVMS MSCP (or TMSCP) server. All controllers or servers providing a path to the same device should have the same allocation class (at the port, controller, or server level).

Each disk (or tape) unit number used within a non-zero disk (or tape) allocation class must be unique, regardless of the particular device prefix. For the purposes of multi-path device path determination, any disk (or tape) device with the same unit number and the same disk (or tape) allocation class configuration is assumed to be the same device.
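The uniqueness rule above lends itself to a simple offline check. The Python sketch below takes a list of disk device names gathered by the system manager and reports any allocation class and unit number pair that is shared by more than one distinct name. The name pattern ($class$ followed by letters and a unit number) and the function name `find_unit_conflicts` are assumptions for this illustration; the script does not query OpenVMS and should be read as a planning aid only.

```python
import re
from collections import defaultdict

# Assumed name shape: $<allocation class>$<device letters><unit>[:]
DEVICE_RE = re.compile(r"^\$(\d+)\$([A-Z]+)(\d+):?$")


def find_unit_conflicts(device_names):
    """Map (allocation class, unit) to the distinct names sharing it.

    Only pairs claimed by more than one distinct device name are
    returned; names that do not match the assumed pattern are skipped.
    """
    groups = defaultdict(set)
    for raw_name in device_names:
        name = raw_name.upper().rstrip(":")
        match = DEVICE_RE.match(name)
        if match is None:
            continue
        alloc_class = int(match.group(1))
        unit = int(match.group(3))
        if alloc_class == 0:
            continue  # the rule applies to non-zero allocation classes
        groups[(alloc_class, unit)].add(name)
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


# Hypothetical disk names: $1$DUA100 and $1$DKA100 share class 1, unit 100.
conflicts = find_unit_conflicts(
    ["$1$DUA100:", "$1$DKA100:", "$1$DKA200:", "$2$DUA100:"]
)
```

Here the device prefix is deliberately ignored when grouping, since the text states that unit numbers must be unique regardless of prefix. Run the check separately for disks and for tapes, as the disk and tape allocation classes are distinct.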

If you are reconfiguring disk device allocation classes, you will want to avoid the use of allocation class one ($1$) until/unless you have Fibre Channel storage configured. (Fibre Channel storage specifically requires the use of allocation class $1$, e.g. $1$DGA0:.)

15.6.2.2.1 How to configure allocation classes and Multi-Path SCSI?



The HSZ allocation class is applied to devices, starting with OpenVMS V7.2. It is considered a port allocation class (PAC), and all device names with a PAC have their controller letter forced to "A". (You might infer from the text in the "Guidelines for OpenVMS Cluster Configurations" that this is something you have to do, though OpenVMS will thoughtfully handle this renaming for you.)

You can force the device names back to DKB by setting the HSZ allocation class to zero, and setting the PKB PAC to -1. This will use the host allocation class, and will leave the controller letter alone (that is, the DK controller letter will be the same as the SCSI port (PK) controller). Note that this won't work if the HSZ is configured in multibus failover mode. In this case, OpenVMS requires that you use an allocation class for the HSZ.

When your configuration gets even moderately complex, you must pay careful attention to how you assign the three kinds of allocation class: node, port and HSZ/HSJ, as otherwise you could wind up with device naming conflicts that can be painful to resolve.

The displayable path information is for SCSI multi-path, and permits the multi-path software to distinguish between different paths to the same device. If you have two paths to $1$DKA100, for example by having two KZPBA controllers and two SCSI buses to the HSZ, you would have two UCBs in a multi-path set. The path information is used by the multi-path software to distinguish between these two UCBs.

The displayable path information describes the path; in this case, the SCSI port. If the port is PKB, that's the path name you get. The device name is no longer completely tied to the port name; the device name now depends on the various allocation class settings of the controller, SCSI port or node.

The reason the device name's controller letter is forced to "A" when you use PACs is because a shared SCSI bus may be configured via different ports on the various nodes connected to the bus. The port may be PKB on one node, and PKC on the other. Rather obviously, you will want to have the shared devices use the same device names on all nodes. To establish this, you will assign the same PAC on each node, and OpenVMS will force the controller letter to be the same on each node. Simply choosing "A" was easier and more deterministic than negotiating the controller letter between the nodes, and also parallels the solution used for this situation when DSSI or SDI/STI storage was used.

To enable port allocation classes, see the SYSBOOT command SET/BOOT, and see the DEVICE_NAMING system parameter.

This information is also described in the Cluster Systems and Guidelines for OpenVMS Cluster Configurations manuals.

15.6.3 Tell me about SET HOST/DUP and SET HOST/HSC



The OpenVMS DCL commands SET HOST/DUP and SET HOST/HSC are used to connect to storage controllers via the Diagnostics and Utility Protocol (DUP). These commands require that the FYDRIVER device driver be connected. This device driver connection is typically performed by adding the following command(s) into the system startup command procedure:

On OpenVMS Alpha:

$ RUN SYS$SYSTEM:SYSMAN
SYSMAN> IO CONNECT FYA0/NOADAPTER/DRIVER=SYS$FYDRIVER

On OpenVMS VAX:

$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> CONNECT FYA0/NOADAPTER




Alternatives to the DCL SET HOST/DUP command include the console SET HOST command available on various mid- to recent-vintage VAX consoles:

Access to Parameters on an Embedded DSSI controller:

SET HOST/DUP/DSSI[/BUS:{0:1}] dssi_node_number PARAMS

Access to Directory of tools on an Embedded DSSI controller:

SET HOST/DUP/DSSI[/BUS:{0:1}] dssi_node_number DIRECT

Access to Parameters on a KFQSA DSSI controller:

SHOW UQSSP ! to get port_controller_number PARAMS
SET HOST/DUP/UQSSP port_controller_number PARAMS




These console commands are available on most MicroVAX and VAXstation 3xxx series systems, and most (all?) VAX 4xxx series systems. For further information, see the system documentation and---on most VAX systems---see the console HELP text.

EK-410AB-MG, _DSSI VAXcluster Installation and Troubleshooting_, is a good resource for setting up a DSSI VMScluster on OpenVMS VAX nodes. (This manual predates coverage of OpenVMS Alpha systems, but gives good coverage to all hardware and software aspects of setting up a DSSI-based VMScluster---and most of the concepts covered are directly applicable to OpenVMS Alpha systems. This manual specifically covers the hardware, which is something not covered by the standard OpenVMS VMScluster documentation.)

Also see Section 15.3.3, and for the SCS name of the OpenVMS host see Section 5.7.

15.6.4 How do I rename a DSSI disk (or tape)?



If you want to renumber or rename DSSI disks or DSSI tapes, it's easy---if you know the secret incantation...

From OpenVMS:

$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> CONNECT FYA0/NOADAPTER
SYSGEN> ^Z
$ SET HOST/DUP/SERV=MSCP$DUP/TASK=PARAMS <DSSI-NODE-NAME>
...
PARAMS> STAT CONF
<The software version is normally near the top of the display.>
PARAMS> EXIT
...




From the console on most 3000- and 4000-class VAX system consoles... (Obviously, the system must be halted for these commands...)

Integrated DSSI:

SET HOST/DUP/DSSI[/BUS:[0:1]] dssi_node_number PARAMS

KFQSA:

SET HOST/DUP/UQSSP port_controller_number PARAMS




For information on how to get into the PARAMS subsystem, also see the HELP at the console prompt for the SET HOST syntax, or see the HELP on SET HOST/DUP (once you've connected FYDRIVER under OpenVMS).

Once you are in the PARAMS subsystem, you can use the FORCEUNI option to force the use of the UNITNUM value and then set a unique UNITNUM inside each DSSI ISE---this causes each DSSI ISE to use the specified unit number and not use the DSSI node as the unit number. Other parameters of interest are NODENAME and ALLCLASS, the node name and the (disk or tape) cluster allocation class.

Ensure that all disk unit numbers used within an OpenVMS Cluster disk allocation class are unique, and all tape unit numbers used within an OpenVMS Cluster tape allocation class are also unique. For details on the SCS name of the OpenVMS host, see Section 5.7. For details of SET HOST/DUP, see Section 15.6.3.

15.6.5 Where can I get Fibre Channel Storage (SAN) information?


15.6.6 Which files must be shared in an OpenVMS Cluster?



The following files are expected to be common across all nodes in a cluster environment. Though SYSUAF is very often common, separate copies can also be carefully coordinated---with matching UIC values and matching binary identifier values across all copies. (The most common use of multiple SYSUAF files is to allow different quotas on different nodes. In any event, the binary UIC values and the binary identifier values must be coordinated across all SYSUAF files, and must match the RIGHTSLIST file.) In addition to the list of files (and directories, in some cases) shown in Table 15-1, please review the VMScluster documentation, and the System Management documentation.

Table 15-1 Cluster Common Shared Files

Filename              Default Specification
SYSUAF                SYS$SYSTEM:.DAT
SYSUAFALT             SYS$SYSTEM:.DAT
SYSALF                SYS$SYSTEM:.DAT
RIGHTSLIST            SYS$SYSTEM:.DAT
NETPROXY              SYS$SYSTEM:.DAT
NET$PROXY             SYS$SYSTEM:.DAT
NETOBJECT             SYS$SYSTEM:.DAT
NETNODE_REMOTE        SYS$SYSTEM:.DAT
QMAN$MASTER           SYS$SYSTEM:; this is a set of related files
LMF$LICENSE           SYS$SYSTEM:.LDB
VMSMAIL_PROFILE       SYS$SYSTEM:.DATA
VMS$OBJECTS           SYS$SYSTEM:.DAT
VMS$AUDIT_SERVER      SYS$MANAGER:.DAT
VMS$PASSWORD_HISTORY  SYS$SYSTEM:.DATA
NETNODE_UPDATE        SYS$MANAGER:.COM
VMS$PASSWORD_POLICY   SYS$LIBRARY:.EXE
LAN$NODE_DATABASE     SYS$SYSTEM:.DAT
VMS$CLASS_SCHEDULE    SYS$SYSTEM:.DATA
SYS$REGISTRY          SYS$SYSTEM:; this is a set of related files


In addition to the documentation, also see the current version of the file SYS$STARTUP:SYLOGICALS.TEMPLATE. Specifically, please see the most recent version of this file available, starting on or after OpenVMS V7.2.

A failure to have common or (in the case of multiple SYSUAF files) synchronized files can cause problems with batch operations, with the SUBMIT/USER command, with general operations involving the cluster alias, and with various SYSMAN and related operations. Object protections and defaults will not necessarily be consistent, either. This can also lead to system security problems, including unintended access denials and unintended object accesses, should the files and particularly should the binary identifier values become skewed.



