From: CRDGW2::CRDGW2::MRGATE::"SMTP::CRVAX.SRI.COM::RELAY-INFO-VAX" 17-APR-1989 10:48
To: MRGATE::"ARISIA::EVERHART"
Subj: Thoughts On The Application Of The DNS/DFS Products...

Received: From KL.SRI.COM by CRVAX.SRI.COM with TCP; Mon, 17 APR 89 06:41:19 PDT
Received: from remote.dccs.upenn.edu by KL.SRI.COM with TCP; Mon, 17 Apr 89 06:17:53 PDT
Received: from XRT.UPENN.EDU by remote.dccs.upenn.edu id AA08278; Fri, 14 Apr 89 13:46:02 EDT
Message-Id: <8904141746.AA08278@remote.dccs.upenn.edu>
Date: Fri, 14 Apr 89 13:47 EDT
From: "Clayton, Paul D."
Subject: Thoughts On The Application Of The DNS/DFS Products...
To: INFO-VAX@KL.SRI.COM
X-Vms-To: @INFOVAX,CLAYTON

There are two products, made and sold by Digital Equipment Corporation, that I am using which have been the target of recent postings. They are both saving me considerable time and effort. They are the DNS and DFS products. The problem I am having deals with disk access from remote systems, as described below.

-----------------

The disk problem is complicated, but it will be greatly simplified here to show what is going on. There will be up to eighteen (18) MICs (Mixed Interconnect Clusters), with up to forty-two (42) nodes each, communicating with dual HSCs that have disks and tapes attached.

Add into this hardware mixture the following facts and constraints.

1. Most users have a workstation, of one type or another, which is a member of one of these MI clusters. Movement of workstations from one cluster to another will ONLY be done for those people changing work areas.

2. Considerable amounts of software, both DEC and third party, are licensed for certain systems and not all of them. In some cases, software is licensed for an 8550 boot node member, but not for the workstations running off the same MIC.

3. Some software is licensed for one MIC, but due to the 42-node limit and other considerations, the users of the product may not be on the MIC that is licensed to run it. These same users must still be able to get to the software in order to perform their job function.

4. In some cases, user groups (those doing the same type of function, be it development of a sub-system or something else) may or may not be on the same cluster, due to disk storage constraints, the 42-node limit or group breakdowns.

5. All systems are on the same Ethernet, and are therefore able to talk with one another freely.

6. Certain application packages do not allow node names to be specified in a file specification, which would have allowed 'normal' DECnet access to files on other MICs. (A sketch of the difference follows this list.)

7. 'Normal' DECnet file transfers are not very fast, due to the various layers of DECnet and associated software. This is in reference to throughput versus time versus record size.

8. Having multiple usernames on different MICs is not attractive, because the same user's work then ends up spread across different disks. The case always comes up where a user wants to take a file from one application and use it with another application, and due to the license problems those applications may be on different MICs.
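To make item 6 concrete, here is a rough sketch of the difference; the cluster alias, disk, directory and file names are made up for illustration. Plain DECnet access needs the remote node (or cluster alias) embedded in the file specification, while a DFS-mounted disk is referenced like any other local device:

   $ TYPE ACLUS::AUSER1:[SMITH.DATA]RESULTS.DAT   ! 'normal' DECnet access - node name required
   $ TYPE AUSER1:[SMITH.DATA]RESULTS.DAT          ! same file via a DFS mount - no node name

A package that builds its own file specifications and cannot cope with the '::' form can live with the second style once DFS is in place.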
With the above items known, the following list of agreed-upon issues was determined.

1. The license issue is based on which processor the program is 'executed' on. Many packages here have keys based on the system ID register (SID) or the Ethernet address of the processor that is legally allowed to run the program(s). No mention is made of where the input and output data files are located. The bottom line is that if the node name problem could be worked around, users with accounts on MIC 'C' could log into MIC 'A' and run some program there, with the files used being those of the user's account on MIC 'C'.

2. Having the same username on multiple MICs, with ALL UAF fields except the device and directory specification being the same, is allowed for. The extra management overhead will have to be absorbed with manpower. (A sketch of such an account entry follows this list.)
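To illustrate issue 2 (the username, UIC, password and device names below are hypothetical), the matching account on a second MIC would be created with AUTHORIZE, with only the default device and directory differing from the entry on the user's 'home' MIC:

   $ MCR AUTHORIZE
   UAF> ADD SMITH/UIC=[200,14]/PASSWORD=CHANGEME/DEVICE=CUSER1/DIRECTORY=[SMITH]
   UAF> EXIT

Every other UAF field (privileges, quotas, flags and so on) is kept identical to the 'home' entry, so only the location of the user's files changes from cluster to cluster.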
Considering all of the above, there are two products that work with each other to provide the needed abilities. They are DNS (Distributed Name Service) and DFS (Distributed File Service). Having DNS is a requirement of DFS. A brief description of the two products follows.

DNS: This product 'lives' in the network and provides a means to store and retrieve common information from any node in the network. The information is stored in a 'name space' and accessed through a 'clearing house'. The name space can be thought of as a group of names and associated equivalence strings. The clearing house is the processor which is executing the DNS software to maintain a specific name space. In order to survive a single node failure, DNS is set up to have another node in the network be a clearing house for the same name space, in a read-only mode. This method of surviving node failure is drastically different from the 'standard' cluster rule of having the same software and information available on another node in the same VAXcluster. It is recommended that the read-only clearing house for DNS be a processor that is not in the same cluster. It should also be pointed out that the nodes for both clearing houses do NOT have to be cluster members; it just happens that I have nothing BUT cluster members to work with. Various applications, like DFS, can query the clearing house with a certain name, and DNS will return the equivalence string.

DFS: This product 'lives' on each processor that needs access to disks other than the ones in its own cluster or directly attached. There are two versions of DFS. The first is the CLIENT kit, which allows a node to gain access to disks on another node as if they were locally attached. The second is the SERVER kit, which makes local disks available to nodes elsewhere on the network that could not gain access to them through cluster membership or local attachment.

The specific method of setting up the MICs was to ensure that each cluster followed certain rules. The following rules were defined by myself and are not a requirement of DFS or DNS. Most of the rules were defined to reduce the amount of confusion this problem creates.

1. Only the CI boot node processors will have a DECnet cluster alias, defined as 'XCLUS', where 'X' is a letter of the alphabet starting with 'A' and denotes a specific cluster.

2. The disks will have volume labels of the form 'XUSERN', where 'X' is the cluster identifier and 'N' is the logical unit number for the disk. Logical names are defined at system level to mirror the volume labels. The system disk label is left as 'VMSRL5', but a logical name of 'XUSER0' is defined during the boot process.

3. On each boot node only, an 'access point' is defined to DNS for each of the available disks on that cluster. An access point is a disk and directory specification that DFS on another node can reference in a MOUNT command. The access points are broken down by cluster in the following format (a worked example of rules 3 and 4 follows this list).

   $DFSCP ADD ACCESS_POINT XCLUS.XUSERN XUSERN:[000000]/CLUSTER

where DFSCP is the DFS application control program, much as NCP is for DECnet. The 'X' variable is the cluster designator and 'N' is the volume number. This allows DFS on other nodes to mount the access point and gain access to the disk as if the entire disk structure were directly available through MSCP. Note that there is nothing stopping you from setting an access point to be a user directory and NOT the MFD; the knowledge of what each access point refers to is assumed to be known and understood by anyone who references it. The '/CLUSTER' switch allows for node failure of the boot nodes: any disks mounted through DFS will enter mount verification and continue on a surviving node, if one with the same cluster alias is available. Only the boot nodes have the cluster alias defined, to prevent satellites from getting DFS connections from other nodes.

4. On any node (this includes boot nodes and satellites) wanting access to another cluster's disks, the following DFS command 'mounts' the other node's disk as a local disk.

   $DFSCP MOUNT/SYSTEM XCLUS.XUSERN XUSERN/VOLUME=XUSERN

The result is a disk for which a 'SHOW DEVICE/MOUNTED XUSERN' command would yield the following output.

   Device                  Device   Error  Volume   Free    Trans  Mnt
    Name                   Status   Count  Label    Blocks  Count  Cnt
   DFSC1001: (node_name)   Mounted      0  XUSERN   ******      2    1

If a 'SHOW DEVICE/FULL DFSC1001' command is done, the listing shows it to be an RA-82 device, on line and mounted, shareable and served to the cluster via MSCP. The specific line-item information is that for a 'normal' RA-82 disk. It is my belief that the device information is that associated with the particular device being served, and would therefore change depending on the device type; I have not proven this, as I have nothing but RA-82 disks. The misleading entry here is that the device is NOT actually served to the cluster via MSCP, and in fact cannot be 'SET SERVED'. The result is a need to 'mount' the same access point on every node that needs access to a particular remote disk. The mount can be done at SYSTEM level, and is therefore available to anyone on that specific processor, or at PROCESS or GROUP level for restricted access. Since the access points, for my purposes, are made at the MFD level, 'normal' directory specifications such as XUSERN:[USER.FOOBAR] work.
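As a worked example of rules 3 and 4 (the cluster letters and the unit number are hypothetical), making user disk 1 of cluster 'A' available to a node in cluster 'B' would look roughly like this.

   $! On an 'A' cluster boot node (the DFS server), define the access point:
   $DFSCP ADD ACCESS_POINT ACLUS.AUSER1 AUSER1:[000000]/CLUSTER

   $! On any node of cluster 'B' (a DFS client), mount it and use it like a local disk:
   $DFSCP MOUNT/SYSTEM ACLUS.AUSER1 AUSER1/VOLUME=AUSER1
   $ DIRECTORY AUSER1:[USER.FOOBAR]

From then on, AUSER1 on the 'B' node refers to a DFSCn: device like the one shown above, subject to the I/O restrictions covered in the notes below.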
Items to note.

1. The speed of the access is VERY good when compared to normal DECnet transfers. Looking at VS3100 screens, it can be difficult to tell the difference between normal LAVc disk speed and DFS, even though DECnet links are used by DFS to pass data. It should be noted that the DFS installation kit patches a number of system images and requires a reboot in order to function; one can only assume that the patches provide backdoor, and much faster, paths through DECnet and the rest of the I/O path. The speed does depend on the load and size of the processor serving as the DFS server node. If a maxed-out 8250 is the DFS server node, then the transfers are not going to be that good, but then neither is the LAVc responsiveness.

2. The ability to have a SINGLE, shared SYSUAF file was attempted, to help eliminate the management headaches. It did not, and will not, work.

3. Control of file access over DFS is handled through the DECnet proxy mechanism. This means that appropriate entries have to be made in the proxy database, using the AUTHORIZE command, to allow a user on one system to access files on another using DFS. This can also add to the management headache, depending on the network environment.

4. There are restrictions on the types of I/O operations that can be performed over DFS. It boils down to no write sharing of files, and only virtual I/O operations; logical and physical I/O operations are not allowed. You can use DFS with the BACKUP command to store save sets on remote disks, as well as use a DFS device in a BACKUP output specification in order to copy directory structures from one node to another. Without DFS, you had to write a save set to the remote system and then 'unpack' it there. You can also create directories, given the proper protections and privileges.

5. There is a single DECnet link between the client system, which has the disk mounted, and the server system, which has the disk directly attached by HSC or direct connect. If multiple disks are being served by the same server to the same client, only one DECnet link is used. Even so, these links can add up to substantial numbers. It is incumbent on the system manager to manage the DECnet executor database for the number of allowable inbound links to the server nodes. Of particular interest is ensuring that sufficient resources are made available for both the cluster alias and the processor-specific connections. This is done through the following two commands.

   $MCR NCP (DEFINE/SET) EXECUTOR ALIAS MAXIMUM LINKS nnn
   $MCR NCP (DEFINE/SET) EXECUTOR MAXIMUM LINKS yyy

The number used for the cluster alias command should allow for all of the links to reside on a single processor. This allows for node failure in the cluster without link rejects occurring.

6. There are control mechanisms in the DFS control program (DFSCP) for such things as the number of buffers to make available, how many requests can be outstanding at any one time, how many 'connections' will be served at one time, etc. These values will require monitoring and modification as the user environment changes.

Hope this helps give some insight into how the product set works.

pdc

Paul D. Clayton
Address - CLAYTON%XRT@RELAY.UPENN.EDU

Disclaimer: All thoughts and statements here are my own and NOT those of my employer, and they are also not based on, nor do they contain, restricted information.