
HP OpenVMS Systems Documentation

The OpenVMS Frequently Asked Questions (FAQ)



15.6.1 OpenVMS Cluster Communications Protocol Details?



The following sections contain information on the OpenVMS System Communications Services (SCS) Protocol. Cluster terminology is available in Section 15.6.1.2.1.

15.6.1.1 OpenVMS Cluster (SCS) over DECnet? Over IP?



The OpenVMS Cluster environment operates over various network protocols, but the core of clustering uses the System Communications Services (SCS) protocols, and SCS-specific network datagrams. Direct (full) connectivity is assumed.

An OpenVMS Cluster does not operate over DECnet, nor over IP.

No SCS protocol routers are available.

Many folks have suggested operating SCS over DECnet or IP over the years, but SCS is too far down in the layers, and any such project would entail a major or complete rewrite of SCS and of the DECnet or IP drivers. Further, the current DECnet and IP implementations have large tracts of code that operate at the application level, while SCS must operate in the rather more primitive contexts of the system and particularly the bootstrap---to get SCS to operate over a DECnet or IP connection would require relocating major portions of the DECnet or IP stack into the kernel. (And it is not clear that the result would even meet the bandwidth and latency expectations.)

The usual approach for multi-site OpenVMS Cluster configurations involves FDDI, Memory Channel (MC2), or a point-to-point remote bridge, brouter, or switch. The connection must be transparent, and it must operate at 10 megabits per second or better (Ethernet speed), with latency characteristics similar to that of Ethernet or better. Various sites use FDDI, MC2, ATM, or point-to-point T3 links.

15.6.1.2 Configuring Cluster SCS for path load balancing?



This section discusses OpenVMS Cluster communications, cluster terminology, related utilities, and command and control interfaces.

15.6.1.2.1 Cluster Terminology?



SCS: System Communications Services. The protocol used to communicate between VMScluster systems and between OpenVMS systems and SCS-based storage controllers. (SCSI-based storage controllers do not use SCS.)

PORT: A communications device, such as DSSI, CI, Ethernet or FDDI. Each CI or DSSI bus is a different local port, named PAA0, PAB0, PAC0 etc. All Ethernet and FDDI busses make up a single PEA0 port.

VIRTUAL CIRCUIT: A reliable communications path established between a pair of ports. Each port in a VMScluster establishes a virtual circuit with every other port in that cluster.

All systems and storage controllers establish "Virtual Circuits" to enable communications between all available pairs of ports.

SYSAP: A "system application" that communicates using SCS. Each SYSAP communicates with a particular remote SYSAP. Example SYSAPs include:

VMS$DISK_CL_DRIVER connects to MSCP$DISK
The disk class driver is on every VMScluster system. MSCP$DISK is on all disk controllers and all VMScluster systems that have SYSGEN parameter MSCP_LOAD set to 1.

VMS$TAPE_CL_DRIVER connects to MSCP$TAPE
The tape class driver is on every VMScluster system. MSCP$TAPE is on all tape controllers and all VMScluster systems that have SYSGEN parameter TMSCP_LOAD set to 1.

VMS$VAXCLUSTER connects to VMS$VAXCLUSTER
This SYSAP contains the connection manager, which manages cluster connectivity, runs the cluster state transition algorithm, and implements the cluster quorum algorithm. This SYSAP also handles lock traffic, and various other cluster communications functions.

SCS$DIR_LOOKUP connects to SCS$DIRECTORY
This SYSAP is used to find SYSAPs on remote systems.

MSCP and TMSCP
The Mass Storage Control Protocol and the Tape MSCP servers are SYSAPs that provide access to disk and tape storage, typically operating over SCS protocols. MSCP and TMSCP SYSAPs exist within OpenVMS (for OpenVMS hosts serving disks and tapes), within CI- and DSSI-based storage controllers, and within host-based MSCP- or TMSCP storage controllers. MSCP and TMSCP can be used to serve MSCP and TMSCP storage devices, and can also be used to serve SCSI and other non-MSCP/non-TMSCP storage devices.

SCS CONNECTION: A SYSAP on one node establishes an SCS connection to its counterpart on another node. This connection will be on ONE AND ONLY ONE of the available virtual circuits.

15.6.1.2.2 Cluster Communications Control?



When there are multiple virtual circuits between two OpenVMS systems, it is possible for the VMS$VAXCLUSTER to VMS$VAXCLUSTER connection to use any one of these circuits. All lock traffic between the two systems will then travel on the selected virtual circuit.

Each port has a "LOAD CLASS" associated with it. This load class helps to determine which virtual circuit a connection will use. If one port has a higher load class than all others then this port will be used. If two or more ports have equally high load classes then the connection will use the first of these that it finds. Prior to enhancements found in V7.3-1 and later, the load class is static and normally all CI and DSSI ports have a load class of 14(hex), while the Ethernet and FDDI ports will have a load class of A(hex). With V7.3-1 and later, the load class values are dynamic.
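
The static selection rule described above can be sketched in Python. This is only an illustration of the rule as stated in the text, not the actual SCS implementation: the `Port` type, the `select_port` function, and the assumption that the list order stands in for the port discovery order are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Port:
    name: str        # for example "PAA0" (CI or DSSI) or "PEA0" (Ethernet and FDDI)
    load_class: int  # static load class, as on versions prior to V7.3-1


def select_port(ports):
    """Pick the port with the highest load class; on a tie, the first one found."""
    if not ports:
        raise ValueError("no ports available")
    # max() returns the first maximal element, so list order models discovery order.
    return max(ports, key=lambda port: port.load_class)


# Two DSSI ports at load class 14 (hex) and the LAN port at A (hex).
ports = [Port("PAA0", 0x14), Port("PAB0", 0x14), Port("PEA0", 0x0A)]
print(select_port(ports).name)
```

In this sketch the two DSSI ports tie at the highest load class, so the one listed first is chosen and the LAN port is never selected.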

For instance, if you have multiple DSSI busses and an FDDI, the VMS$VAXCLUSTER connection will choose the DSSI bus that has the system disk, as this path will always be the first DSSI bus discovered when the OpenVMS system boots.

To force all lock traffic off the DSSI and on to the FDDI, for instance, an adjustment to the load class value is required, or the DSSI SCS port must be disabled.

In addition to the load class mechanisms, you can also use the "preferred path" mechanisms of MSCP and TMSCP services. This allows you to control the SCS connections used for serving remote disk and tape storage. The preferred path mechanism is most commonly used to explicitly spread cluster I/O activity over hosts and/or storage controllers serving disk or tape storage in parallel. This can be particularly useful if your hosts or storage controllers individually lack the necessary I/O bandwidth for the current I/O load, and must thus aggregate bandwidth to serve the cluster I/O load.

For related tools, see various utilities including LAVC$STOP_BUS and LAVC$START_BUS, and see DCL commands including SET PREFERRED_PATH.

15.6.1.2.3 Cluster Communications Control Tools and Utilities?



In most OpenVMS versions, you can use the tools:

  • SYS$EXAMPLES:LAVC$STOP_BUS
  • SYS$EXAMPLES:LAVC$START_BUS


These tools permit you to disable or enable all SCS traffic on the specified paths.

You can also use a preferred path mechanism that tells the local MSCP disk class driver (DUDRIVER) which path to a disk should be used. Generally, this is used with dual-pathed disks, forcing I/O traffic through one of the controllers instead of the other. This can be used to implement a crude form of I/O load balancing at the disk I/O level.

Prior to V7.2, the preferred path feature uses the tool:

    
  • SYS$EXAMPLES:PREFER.MAR


In OpenVMS V7.2 and later, you can use the following DCL command:

$ SET PREFERRED_PATH




The preferred path mechanism does not disable nor affect SCS operations on the non-preferred path.

With OpenVMS V7.3 and later, please see the SCACP utility for control over cluster communications, SCS virtual circuit control, port selection, and related functions.

15.6.2 Cluster System Parameter Settings?



The following sections contain details of configuring cluster-related system parameters.

15.6.2.1 What is the correct value for EXPECTED_VOTES in a VMScluster?



The VMScluster connection manager uses the concept of votes and quorum to prevent disk and memory data corruptions---when sufficient votes are present for quorum, then access to resources is permitted. When sufficient votes are not present, user activity will be blocked. The act of blocking user activity is called a "quorum hang", and is better thought of as a "user data integrity interlock". This mechanism is designed to prevent a partitioned VMScluster, and the resultant massive disk data corruptions. The quorum mechanism is expressly intended to prevent your data from becoming severely corrupted.

On each OpenVMS node in a VMScluster, one sets two values in SYSGEN: VOTES, and EXPECTED_VOTES. The former is how many votes the node contributes to the VMScluster. The latter is the total number of votes expected when the full VMScluster is bootstrapped.

Some sites erroneously attempt to set EXPECTED_VOTES too low, believing that this will allow operation when only a subset of voting nodes are present in a VMScluster. It does not. Further, an erroneous setting in EXPECTED_VOTES is automatically corrected once VMScluster connections to other nodes are established; the only effect of a too-low value is that user data is at risk of severe corruptions during the earliest and most vulnerable portion of the system bootstrap, before the connections have been established.

One can operate a VMScluster with one, two, or many voting nodes. With any but the two-node configuration, keeping a subset of the nodes active when some nodes fail can be easily configured. With the two-node configuration, one must use a primary-secondary configuration (where the primary has all the votes), a peer configuration (where when either node is down, the other hangs), or (preferable) a shared quorum disk.

Use of a quorum disk does slow down VMScluster transitions somewhat -- the addition of a third voting node that contributes the vote(s) that would be assigned to the quorum disk makes for faster transitions---but the use of a quorum disk does mean that either node in a two-node VMScluster configuration can operate when the other node is down.



Note

The quorum disk cannot be a host-based shadowed disk, though it can be protected with controller-based RAID. Because host-based volume shadowing depends on the lock manager and the lock manager depends on the connection manager and the connection manager depends on quorum, it is not technically feasible (nor even particularly reliable) to permit host-based volume shadowing to protect the quorum disk.


If you choose to use a quorum disk, a QUORUM.DAT file will be automatically created when OpenVMS first boots and when a quorum disk is specified -- well, the QUORUM.DAT file will be created when OpenVMS is booted without also needing the votes from the quorum disk.

In a two-node VMScluster with a shared storage interconnect, typically each node has one vote, and the quorum disk also has one vote. EXPECTED_VOTES is set to three.
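
The vote arithmetic for this two-node configuration can be sketched in Python. The quorum formula used here, (EXPECTED_VOTES + 2) / 2 with the result truncated, comes from the OpenVMS Cluster documentation and is not stated in this FAQ text; the function names are illustrative only.

```python
def quorum(expected_votes: int) -> int:
    """Votes required for quorum (assumed formula: (EXPECTED_VOTES + 2) / 2, truncated)."""
    return (expected_votes + 2) // 2


def has_quorum(votes_present: int, expected_votes: int) -> bool:
    """True when enough votes are present for cluster activity to proceed."""
    return votes_present >= quorum(expected_votes)


# Two nodes with one vote each, plus a quorum disk with one vote.
EXPECTED_VOTES = 3
print(quorum(EXPECTED_VOTES))         # votes needed for quorum
print(has_quorum(2, EXPECTED_VOTES))  # one node plus the quorum disk
print(has_quorum(1, EXPECTED_VOTES))  # one node alone, quorum disk unreachable
```

Under this formula the quorum is 2, so either node can continue together with the quorum disk while the other node is down. In a two-node peer configuration without a quorum disk (EXPECTED_VOTES of 2), the quorum is also 2, which is why a lone node hangs.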

Using a quorum disk on a non-shared interconnect is unnecessary---the use of a quorum disk does not provide any value, and the votes assigned to the quorum disk should be assigned to the OpenVMS host serving access to the disk.

For information on quorum hangs, see the OpenVMS documentation. For information on changing the EXPECTED_VOTES value on a running system, see the SET CLUSTER/EXPECTED_VOTES command, and see the documentation for the AMDS and Availability Manager tools. Also of potential interest is the OpenVMS system console documentation for the processor-specific console commands used to trigger the IPC (Interrupt Priority Level %x0C; IPL C) handler. (IPC is not available on OpenVMS I64 V8.2.) AMDS, Availability Manager, and the IPC handler can each be used to clear a quorum hang. Use of AMDS and Availability Manager is generally recommended over IPC, particularly because IPC can cause CLUEXIT bugchecks if the system should remain halted beyond the cluster sanity timer limits, and because some Alpha consoles and most (all?) Integrity consoles do not permit a restart after a halt.

The quorum scheme is a set of "blade guards" deliberately implemented by OpenVMS Engineering to provide data integrity---remove these blade guards at your peril. OpenVMS Engineering did not implement the quorum mechanism to make a system manager's life more difficult---the quorum mechanism was specifically implemented to keep your data from getting scrambled.

15.6.2.1.1 Why no shadowing for a Quorum Disk?


Stated simply, Host-Based Volume Shadowing uses the Distributed Lock Manager (DLM) to coordinate changes to membership of a shadowset (e.g. removing a member). The DLM depends in turn on the Connection Manager enforcing the Quorum Scheme and deciding which node(s) (and quorum disk) are participating in the cluster, and telling the DLM when it needs to do things like a lock database rebuild operation. So you can't introduce a dependency of the Connection Manager on Shadowing to try to pick proper shadowset member(s) to use as the Quorum Disk when Shadowing itself is using the DLM and thus indirectly depending on the Connection Manager to keep the cluster membership straight---it's a circular dependency.

So in practice, folks simply depend on controller-based mirroring (or controller-based RAID) to protect the Quorum Disk against disk failures (and dual-redundant controllers to protect against most cases of controller and interconnect failures). Since this disk unit appears to be a single disk up at the VMS level, there's no chance of ambiguity.

15.6.2.2 Explain disk (or tape) allocation class settings?


Each disk (or tape) unit number used within a non-zero disk (or tape) allocation class must be unique, regardless of the particular device prefix. For the purposes of multi-path device path determination, any disk (or tape) device with the same unit number and the same disk (or tape) allocation class configuration is assumed to be the same device.
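
The uniqueness rule above can be checked against a planned device inventory with a short Python sketch. The inventory layout, the `find_unit_conflicts` function, and the sample device names are hypothetical, and the sketch covers a single kind of allocation class (disk or tape) at a time.

```python
from collections import defaultdict


def find_unit_conflicts(devices):
    """Report unit numbers shared by distinct devices within one allocation class.

    devices: iterable of (allocation_class, device_prefix, unit_number) tuples,
    one tuple per distinct physical device.
    """
    by_class_and_unit = defaultdict(list)
    for alloc_class, prefix, unit in devices:
        if alloc_class == 0:
            continue  # the rule applies only to non-zero allocation classes
        by_class_and_unit[(alloc_class, unit)].append(f"${alloc_class}${prefix}{unit}")
    # Keep only the (class, unit) pairs claimed by more than one device.
    return {key: names for key, names in by_class_and_unit.items() if len(names) > 1}


inventory = [
    (1, "DUA", 100),
    (1, "DKA", 100),  # conflict: same class and unit, different prefix
    (1, "DKA", 200),
    (2, "DUA", 100),  # no conflict: different allocation class
    (0, "DKA", 100),  # ignored: allocation class zero
]
print(find_unit_conflicts(inventory))
```

Here the two class 1 devices with unit 100 are reported, since OpenVMS would assume they are the same device despite the differing prefixes.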

If you are reconfiguring disk device allocation classes, you will want to avoid the use of allocation class one ($1$) until/unless you have Fibre Channel storage configured. (Fibre Channel storage specifically requires the use of allocation class $1$, e.g. $1$DGA0:.)

15.6.2.2.1 How to configure allocation classes and Multi-Path SCSI?


The HSZ allocation class is applied to devices, starting with OpenVMS V7.2. It is considered a port allocation class (PAC), and all device names with a PAC have their controller letter forced to "A". (You might infer from the text in the "Guidelines for OpenVMS Cluster Configurations" that this is something you have to do, though OpenVMS will thoughtfully handle this renaming for you.)

You can force the device names back to DKB by setting the HSZ allocation class to zero, and setting the PKB PAC to -1. This will use the host allocation class, and will leave the controller letter alone (that is, the DK controller letter will be the same as the SCSI port (PK) controller). Note that this won't work if the HSZ is configured in multibus failover mode. In this case, OpenVMS requires that you use an allocation class for the HSZ.

When your configuration gets even moderately complex, you must pay careful attention to how you assign the three kinds of allocation class: node, port and HSZ/HSJ, as otherwise you could wind up with device naming conflicts that can be painful to resolve.

The displayable path information is for SCSI multi-path, and permits the multi-path software to distinguish between different paths to the same device. If you have two paths to $1$DKA100, for example by having two KZPBA controllers and two SCSI buses to the HSZ, you would have two UCBs in a multi-path set. The path information is used by the multi-path software to distinguish between these two UCBs.

The displayable path information describes the path; in this case, the SCSI port. If the port is PKB, that's the path name you get. The device name is no longer completely tied to the port name; the device name now depends on the various allocation class settings of the controller, SCSI port or node.

The reason the device name's controller letter is forced to "A" when you use PACs is because a shared SCSI bus may be configured via different ports on the various nodes connected to the bus. The port may be PKB on one node, and PKC on the other. Rather obviously, you will want to have the shared devices use the same device names on all nodes. To establish this, you will assign the same PAC on each node, and OpenVMS will force the controller letter to be the same on each node. Simply choosing "A" was easier and more deterministic than negotiating the controller letter between the nodes, and also parallels the solution used for this situation when DSSI or SDI/STI storage was used.

To enable port allocation classes, see the SYSBOOT command SET/BOOT, and see the DEVICE_NAMING system parameter.

This information is also described in the Cluster Systems and Guidelines for OpenVMS Cluster Configurations manuals.

15.6.3 Tell me about SET HOST/DUP and SET HOST/HSC


On OpenVMS Alpha:

$ RUN SYS$SYSTEM:SYSMAN
SYSMAN> IO CONNECT FYA0/NOADAPTER/DRIVER=SYS$FYDRIVER


On OpenVMS VAX:

$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> CONNECT FYA0/NOADAPTER


Alternatives to the DCL SET HOST/DUP command include the console SET HOST command available on various mid- to recent-vintage VAX consoles:

Access to Parameters on an Embedded DSSI controller:

SET HOST/DUP/DSSI[/BUS:{0:1}] dssi_node_number PARAMS



Access to Directory of tools on an Embedded DSSI controller:

SET HOST/DUP/DSSI[/BUS:{0:1}] dssi_node_number DIRECT



Access to Parameters on a KFQSA DSSI controller:

SHOW UQSSP ! to get port_controller_number PARAMS
SET HOST/DUP/UQSSP port_controller_number PARAMS


These console commands are available on most MicroVAX and VAXstation 3xxx series systems, and most (all?) VAX 4xxx series systems. For further information, see the system documentation and---on most VAX systems---see the console HELP text.

EK-410AB-MG, _DSSI VAXcluster Installation and Troubleshooting_, is a good resource for setting up a DSSI VMScluster on OpenVMS VAX nodes. (This manual predates coverage of OpenVMS Alpha systems, but gives good coverage to all hardware and software aspects of setting up a DSSI-based VMScluster---and most of the concepts covered are directly applicable to OpenVMS Alpha systems. This manual specifically covers the hardware, which is something not covered by the standard OpenVMS VMScluster documentation.)

Also see Section 15.3.3, and for the SCS name of the OpenVMS host see Section 5.7.

15.6.4 How do I rename a DSSI disk (or tape?)

 

If you want to renumber or rename DSSI disks or DSSI tapes, it's easy---if you know the secret incantation...

From OpenVMS:

$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> CONNECT FYA0/NOADAPTER
SYSGEN> ^Z
$ SET HOST/DUP/SERV=MSCP$DUP/TASK=PARAMS <DSSI-NODE-NAME>
...
PARAMS> STAT CONF
<The software version is normally near the top of the display.>
PARAMS> EXIT
...


From the console on most 3000- and 4000-class VAX system consoles... (Obviously, the system must be halted for these commands...)

Integrated DSSI:

SET HOST/DUP/DSSI[/BUS:[0:1]] dssi_node_number PARAMS


KFQSA:

SET HOST/DUP/UQSSP port_controller_number PARAMS


For information on how to get out into the PARAMS subsystem, also see the HELP at the console prompt for the SET HOST syntax, or see the HELP on SET HOST /DUP (once you've connected FYDRIVER under OpenVMS).

Once you are out into the PARAMS subsystem, you can use the FORCEUNI option to force the use of the UNITNUM value and then set a unique UNITNUM inside each DSSI ISE---this causes each DSSI ISE to use the specified unit number and not use the DSSI node as the unit number. Other parameters of interest are NODENAME and ALLCLASS, the node name and the (disk or tape) cluster allocation class.

Ensure that all disk unit numbers used within an OpenVMS Cluster disk allocation class are unique, and all tape unit numbers used within an OpenVMS Cluster tape allocation class are also unique. For details on the SCS name of the OpenVMS host, see Section 5.7. For details of SET HOST/DUP, see Section 15.6.3.

15.6.5 Where can I get Fibre Channel Storage (SAN) information?


15.6.6 Which files must be shared in an OpenVMS Cluster?



The following files are expected to be common across all nodes in a cluster environment, and though SYSUAF is very often common, it can also be carefully coordinated---with matching UIC values and matching binary identifier values across all copies. (The most common use of multiple SYSUAF files is to allow different quotas on different nodes. In any event, the binary UIC values and the binary identifier values must be coordinated across all SYSUAF files, and must match the RIGHTSLIST file.) In addition to the list of files (and directories, in some cases) shown in Table 15-1, please review the VMScluster documentation, and the System Management documentation.

Table 15-1 Cluster Common Shared Files

Filename               Default Specification
SYSUAF                 SYS$SYSTEM:.DAT
SYSUAFALT              SYS$SYSTEM:.DAT
SYSALF                 SYS$SYSTEM:.DAT
RIGHTSLIST             SYS$SYSTEM:.DAT
NETPROXY               SYS$SYSTEM:.DAT
NET$PROXY              SYS$SYSTEM:.DAT
NETOBJECT              SYS$SYSTEM:.DAT
NETNODE_REMOTE         SYS$SYSTEM:.DAT
QMAN$MASTER            SYS$SYSTEM:; this is a set of related files
LMF$LICENSE            SYS$SYSTEM:.LDB
VMSMAIL_PROFILE        SYS$SYSTEM:.DATA
VMS$OBJECTS            SYS$SYSTEM:.DAT
VMS$AUDIT_SERVER       SYS$MANAGER:.DAT
VMS$PASSWORD_HISTORY   SYS$SYSTEM:.DATA
NETNODE_UPDATE         SYS$MANAGER:.COM
VMS$PASSWORD_POLICY    SYS$LIBRARY:.EXE
LAN$NODE_DATABASE      SYS$SYSTEM:.DAT
VMS$CLASS_SCHEDULE     SYS$SYSTEM:.DATA
SYS$REGISTRY           SYS$SYSTEM:; this is a set of related files

In addition to the documentation, also see the current version of the file SYS$STARTUP:SYLOGICALS.TEMPLATE. Specifically, please see the most recent version of this file available, starting on or after OpenVMS V7.2.

A failure to have common or (in the case of multiple SYSUAF files) synchronized files can cause problems with batch operations, with the SUBMIT/USER command, with the general operations with the cluster alias, and with various SYSMAN and related operations. Object protections and defaults will not necessarily be consistent, as well. This can also lead to system security problems, including unintended access denials and unintended object accesses.
