8    Troubleshooting RIS

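This chapter contains information to help you troubleshoot problems with your RIS system. These problems are grouped into the following categories: RIS lock files (Section 8.1), client password expiration (Section 8.2), root file system mounting (Section 8.3), RIS client registration (Section 8.4), and RIS server response (Section 8.5).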

8.1    RIS Lock Files

To prevent multiple users from performing simultaneous operations on RIS areas, the ris utility creates two lock files in the /tmp directory, rislock and ris.tty.lock, when you are installing or deleting software in a RIS area. If another user (or the same user on a different terminal) runs the ris utility and attempts to install or delete software from the RIS Utility Main Menu, that user sees a message similar to the following:

The ris utility is currently locked while j_smith on /dev/ttyp3
is installing software.  Try again later.

If the ris utility is stopped prematurely, these lock files might not be removed, and you see this message even though no other user is using RIS. In that case, you must delete the lock files from the /tmp directory.

Caution

Before deleting the lock files, ensure that no other user is using the ris utility.
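
The cleanup can be sketched as a pair of shell functions. This is a minimal sketch, not part of the ris utility: the function names list_ris_locks and remove_ris_locks are hypothetical, and the only facts taken from this section are the two lock file names and the /tmp location. The optional directory argument is an assumption added so that the functions can be tried against a scratch directory first.

```shell
#!/bin/sh
# Hypothetical helpers; only the lock file names and /tmp come from this section.
RIS_LOCK_NAMES="rislock ris.tty.lock"

# Print the path of each RIS lock file that exists in the directory
# given as $1 (default: /tmp).
list_ris_locks() {
    dir=${1:-/tmp}
    for name in $RIS_LOCK_NAMES; do
        if [ -e "$dir/$name" ]; then
            echo "$dir/$name"
        fi
    done
}

# Remove whichever RIS lock files exist in the directory and report each one.
# Nothing else in the directory is touched.
remove_ris_locks() {
    dir=${1:-/tmp}
    for name in $RIS_LOCK_NAMES; do
        if [ -e "$dir/$name" ]; then
            rm -f "$dir/$name" && echo "removed $dir/$name"
        fi
    done
}
```

Run list_ris_locks first, confirm that no other user is running the ris utility, and only then run remove_ris_locks.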

8.2    Client Password Expiration

If the RIS server is using C2 security and the RIS password has been set to allow expiration, it is possible for the RIS clients to be denied service. If the RIS client receives a message similar to the following, the RIS password on the server probably has expired:

Cannot find the name for client using bin/getname. Check with the system
manager of your RIS server

To fix this problem, see Section 3.7.

8.3    Root File System Mounting

RIS uses NFS to mount the root ( / ) file system on the client when booting the client from the RIS server. If you see a message on the RIS client indicating that the root file system cannot be mounted, use the ps -aef | grep mountd command to see if the NFS mount daemon, mountd, is running on the server. If mountd is running, you see output similar to the following:

# ps -aef | grep mountd
root   308      1 0.0 17:24:28 ??       0:00.02 /usr/sbin/mountd -i -n -n
root  3154  1053  0.0 12:52:55 ttyp3    0:00.00 grep mountd
#

If the mountd daemon is not running, use the SysMan Menu to restart the NFS daemons. If you are running an earlier version of the operating system, use the nfssetup command. See sysman(8) and nfssetup(8) for more information.
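
When this check is made from a script, the grep process itself must not be mistaken for the daemon. The following is a minimal sketch under that assumption; the function name daemon_listed is hypothetical, and it only inspects ps output that you pipe into it.

```shell
#!/bin/sh
# Succeeds (exit status 0) if the ps output read from standard input lists
# the daemon named in $1. A field counts as a match when its last path
# component equals the name and the preceding field is not "grep", so the
# "grep mountd" line in the example output is ignored.
daemon_listed() {
    awk -v name="$1" '
        {
            for (i = 1; i <= NF; i++) {
                n = split($i, parts, "/")
                if (parts[n] == name && $(i - 1) != "grep") { found = 1 }
            }
        }
        END { exit(found ? 0 : 1) }'
}

# Example use on the server:
#   ps -aef | daemon_listed mountd && echo "mountd is running"
```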

The installation media is mounted as the root file system for both CD-ROM and RIS installations, so it is important that the installation media is mounted locally on the server. Due to NFS limitations, RIS cannot provide client access to files that are mounted remotely from another system. The distribution media or extracted RIS area must be available through a local mount point on the RIS server.

8.4    RIS Client Registration

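The following sections discuss these problems with RIS client registration: no prompt for the client hardware address, duplicate client hardware addresses, cloned client registration, a client registered on multiple RIS servers, and a client that is not in the RIS database.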

8.4.1    No Prompt for Client Hardware Address

The server requires a client's hardware address in order to boot the client over the network. The ris utility prompts you for the client's address during the registration process. If it does not, check the following:

8.4.2    Duplicate Client Hardware Addresses

RIS checks to ensure that no other client has the same hardware address. A duplicate can occur if a client's name has changed but the client's old registration has not been removed from the server. If a duplicate hardware address is found, a message similar to the following is displayed:

The hardware address provided, nn-nn-nn-nn-nn-nn, has already been 
specified for another client, albany. Please check the hardware address
to ensure it is correct. If it is correct, then you will need to
deregister the client albany before continuing. If this client is not
currently registered, please contact your RIS system administrator.

If you see this message, follow the instructions provided and verify the new hardware address that you entered.
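
To look for duplicates outside the ris utility, you can scan the server's /etc/bootptab file. The following is a minimal sketch, not the check that RIS itself performs. It assumes that each client entry occupies one line, that fields are separated by colons, and that the hardware address is given in an ha= tag without colon separators. The function name find_duplicate_hw_addrs is hypothetical.

```shell
#!/bin/sh
# Reads bootptab-style text on standard input and prints one line per
# duplicate found: "<address> <first host> <later host>".
# Assumptions: one entry per line, ":" between fields, address in an ha= tag.
find_duplicate_hw_addrs() {
    awk -F: '
        /^[ \t]*#/ || NF < 2 { next }             # skip comments and blank lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^ha=/) {
                    addr = tolower(substr($i, 4))
                    gsub(/[^0-9a-f]/, "", addr)   # ignore "-" or "." separators
                    if (addr in seen) {
                        print addr, seen[addr], $1
                    } else {
                        seen[addr] = $1
                    }
                }
            }
        }'
}

# Example use on the server:
#   find_duplicate_hw_addrs < /etc/bootptab
```

Each line of output names the two entries that share an address, which tells you which client to deregister before you register the new one.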

8.4.3    Cloned Client Registration

A CDF is created during a Full Installation. To use the CDF for Installation Cloning, the hardware configuration and the software subsets to load must be substantially similar. Before accepting a CDF for client Installation Cloning, RIS attempts to verify that the subsets specified in the CDF exist in the RIS area that the user has selected. If they do not match, the CDF is rejected. This error can occur if the version numbers of the subsets do not match (for example, OSFBASE400 and OSFBASE505).

In the event that a CDF is specified that contains the name of a software subset that is not present in the selected RIS area, you see output similar to the following example:

Enter a set name or press <Return> to exit set selection: rz26.cdf
 
The selected CDF, rz26.cdf, specifies software subsets that are not
present in the selected RIS environment. The missing software subsets are:
OSFSERPC505
 
Please select a different set.

8.4.4    Client Registered on Multiple RIS Servers

If the system will not boot or the system boots but is not able to mount the root file system, you should check to ensure that the RIS client is not registered for BOOTP service on multiple RIS or DMS servers. In order for the BOOTP protocol to work properly, it is important that the client be registered for BOOTP service on only one server. The client is registered for BOOTP service when it is registered for an operating system base product or when it is registered as a DMS client.

A RIS client can be registered on two RIS servers at the same time, provided that it is not registered for the operating system base product on both servers and then booted using BOOTP.

8.4.5    Client Not in RIS Database

If a message appears on the client's console while you are performing a RIS installation that states that the client is not in the RIS database, look at the following on the server:

8.5    RIS Server Response

Problems with RIS server response fall into several categories, which the following sections describe.

Boot failures often occur because the RIS server has invalid information. The risdb and bootptab files are involved in handling RIS clients, and you should check them in the order listed:

Caution

A RIS server should run either the bootpd or the joind daemon. A RIS server running both of these daemons is not supported, and results are unpredictable.

8.5.1    Servers Using the bootpd Daemon

A server can respond to BOOTP requests from clients. If the server's information is correct for the client but the server still fails to respond, enable BOOTP message logging on the server:

  1. Edit the server's /etc/inetd.conf file.

  2. Modify the line for bootps to include the -d option as a bootpd command argument. For example:

    bootps  dgram  udp  wait  root  /usr/sbin/bootpd  bootpd -d
     
    

  3. Use the following command to find the process IDs for the Internet daemons. You see output similar to the following:

    # ps x | grep -E "inetd|bootpd"
      228 ??  I      0:00.93 /usr/sbin/inetd
      243 ??  I      0:00.91 /usr/sbin/bootpd
     9134 p2  S      0:00.23 grep -E inetd|bootpd
     
     
    

  4. Send a HUP (hangup) signal to the inetd daemon so that it rereads the /etc/inetd.conf configuration file, and then kill the bootpd daemon. You must signal the inetd daemon before you kill the bootpd daemon. Using the process IDs you identified in the previous step, issue the following kill commands:

    # kill -HUP 228
    # kill -KILL 243
    

It is not necessary to restart the bootpd daemon manually; the inetd daemon starts it automatically.

To track boot requests as they occur, run the tail -f command on the /var/adm/syslog.dated/today's-date/daemon.log file and boot the client. Many daemons other than the bootpd daemon log information to the daemon.log file; however, the bootpd entries for the client show a hardware address that matches the address in the /etc/bootptab file for the client.
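
Because many daemons share the daemon.log file, it can help to filter the stream down to bootpd entries. The following is a minimal sketch; the function name bootpd_entries is hypothetical, and it assumes that each entry is tagged bootpd[pid] as in Example 8-1.

```shell
#!/bin/sh
# Keep only the bootpd entries from daemon.log text on standard input.
# With an argument (for example, a hardware address), keep only the bootpd
# entries that contain that string, ignoring case.
bootpd_entries() {
    awk -v want="$1" '
        index($0, "bootpd[") == 0 { next }
        want == "" || index(tolower($0), tolower(want)) > 0 { print }'
}

# Example use, with LOG set to the path of the current daemon.log file:
#   tail -f "$LOG" | bootpd_entries
```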

If the client's boot requests are not logged, you can enable additional logging by editing the /etc/inetd.conf file and adding a second -d option to the bootpd command. Each additional instance (up to three) of the -d option increases reporting; the second instance enables the server to report all boot requests, even for client systems it does not recognize. This level of reporting should help you determine where in the system the request is being lost.
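
The edit can be sketched as a filter that appends one more -d to the bootps line and leaves every other line alone. This is a minimal sketch; the function name add_bootpd_debug is hypothetical, and it assumes a sed that supports POSIX character classes such as [[:space:]].

```shell
#!/bin/sh
# Read inetd.conf text on standard input and write it to standard output
# with one more -d appended to the bootps line. Trailing white space on that
# line is replaced; all other lines pass through unchanged.
add_bootpd_debug() {
    sed '/^bootps[[:space:]]/s/[[:space:]]*$/ -d/'
}

# Example use: write the result to a scratch file and review it before
# copying it over /etc/inetd.conf.
#   add_bootpd_debug < /etc/inetd.conf > /tmp/inetd.conf.new
```

Applied to the line shown in step 2 of the earlier procedure, the filter yields a bootpd command that ends in -d -d.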

If you modify the /etc/inetd.conf file, restart the inetd daemon by sending it a HUP signal. Example 8-1 shows a section of a daemon.log file. It shows the data logged by various system daemons, including the bootpd daemon when run with two -d flags set.

Example 8-1:  Sample daemon.log File

Jul 28 14:56:36 stlouis mountd[191]: startup
Jul 28 14:56:38 stlouis xntpd[235]: xntpd version 1.3 [1]
Jul 28 14:56:43 stlouis mold[269]: mold (V1.10) initialization complete
Jul 28 14:56:44 stlouis evd[272]: E003-evd (V1.10) initialization complete
Jul 28 14:56:45 stlouis internet_mom[275]: internet_mom - Initialization
                complete...
Jul 28 14:56:45 stlouis snmp_pe[278]: M004 - snmp_pe (V1.10) initialization
                complete
Jul 28 16:34:55 stlouis inetd[282]: /usr/sbin/bootpd: exit status 0x9 [2]
Jul 28 16:35:47 stlouis bootpd[1228]: bootpd 2.1a #0: \ [3]
                Fri Feb 04 00:32:28 EST 2000
Jul 28 16:35:47 stlouis bootpd[1228]: reading "/etc/bootptab"
Jul 28 16:35:47 stlouis bootpd[1228]: read 3 entries from "/etc/bootptab"
Jul 28 16:35:47 stlouis bootpd[1228]: request from hardware address \ [4]
                nnnnnnnnnnnn
Jul 28 16:36:08 stlouis bootpd[1228]: request from hardware address \ [5]
                nnnnnnnnnnnn
Jul 28 16:36:08 stlouis bootpd[1228]: found: host1.xsamplex.com (nnnnnnnnnnnn)
                at (nn.nn.nnn.nnn)
Jul 28 16:36:08 stlouis bootpd[1228]: file /var/adm/ris/ris0.alpha/\
                vmunix.host1.xsamplex.com
Jul 28 16:36:08 stlouis bootpd[1228]: vendor magic field is 0.0.0.0
Jul 28 16:36:08 stlouis bootpd[1228]: sending RFC1048-style reply

  1. Many daemons log information to this file.

  2. Result of sending a HUP signal to the inetd daemon and killing the bootpd daemon.

  3. A new bootpd daemon starts up in response to a boot request. The bootpd daemon reads the /etc/bootptab file as a part of its startup.

  4. A bootpd request by a system with hardware address nnnnnnnnnnnn. Because the system is not a client of this RIS server, its hardware address is not in the server's /etc/bootptab file.

  5. A bootpd request by a system with hardware address nnnnnnnnnnnn. The system is a client of this RIS server.

8.5.2    Servers Using the joind Daemon

To serve BOOTP requests from clients, the joind daemon, which also services Dynamic Host Configuration Protocol (DHCP) requests, should be running. DHCP enables the automatic assignment of IP addresses to clients on a network from a pool of addresses. The IP address assignment and configuration occur automatically whenever client systems (workstations and portable computers) attach to a network. The current implementation of DHCP is based on the JOIN product by Competitive Automation.

Ensure that the server's information about the client is correct, namely the information contained in the server's bootptab file as shown in Section 5.1.3. If the server still fails to respond, enable logging of BOOTP messages on the server by using the following procedure:

  1. Enter the following command to check that the joind daemon is servicing your bootp request:

    # ps -x | grep -E "joind"
    393 ??       I        0:05.82 /usr/sbin/joind
    26446 ttyp0     S +     0:00.01 grep -e joind
    

  2. Enter the following command to determine the current setting of JOIND_FLAGS:

    # rcmgr get JOIND_FLAGS
    

  3. Enter the following command to stop the joind daemon:

    # /sbin/init.d/dhcp stop
    

  4. Enter the following commands to restart the daemon with debugging turned on. Use the JOIND_FLAGS argument to indicate debugging is turned on.

    # rcmgr set JOIND_FLAGS y -dx
    # /sbin/init.d/dhcp start -dx

    In these commands, y is the previously determined setting of JOIND_FLAGS and x is the level of debugging. A value from 0 to 9 is valid.
    

    Example 8-1 shows a section of a daemon.log file with the data logged by various system daemons, including the joind daemon.

  5. Enter the following commands to turn off debugging:

    # /sbin/init.d/dhcp stop
    # rcmgr set JOIND_FLAGS y
    # /sbin/init.d/dhcp start

    In the rcmgr command, y is the previously determined setting of JOIND_FLAGS.
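
Because the same saved value is needed to turn debugging on and off again, it helps to build the debugging value from it in one place. The following is a minimal sketch; the function name joind_debug_flags is hypothetical, and the quoting in the commented rcmgr commands is an assumption rather than documented usage.

```shell
#!/bin/sh
# Print the JOIND_FLAGS value to use while debugging: the saved flags in $1
# (which may be empty) followed by -d<level>, where the level in $2 must be
# a single digit from 0 to 9. Returns 1 for any other level.
joind_debug_flags() {
    case $2 in
        [0-9]) ;;
        *) echo "debug level must be 0-9" >&2; return 1 ;;
    esac
    if [ -n "$1" ]; then
        printf '%s\n' "$1 -d$2"
    else
        printf '%s\n' "-d$2"
    fi
}

# Example use (quoting is an assumption; verify it on your system):
#   saved=$(rcmgr get JOIND_FLAGS)
#   rcmgr set JOIND_FLAGS "$(joind_debug_flags "$saved" 3)"
#   ... reproduce the problem, then restore the saved value:
#   rcmgr set JOIND_FLAGS "$saved"
```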
    

8.5.3    Loading an Incorrect Kernel File

If the server responds but an incorrect kernel is loaded, it is possible that the server's RIS area is configured incorrectly. You can observe the loading process by editing the /etc/inetd.conf file and restarting the Internet daemon as described in Section 8.5.1. To do this, add the -d option to the line containing the tftpd command:

tftp	dgram	udp	wait	root	/usr/sbin/tftpd	tftpd -d /tmp /var/adm/ris
 

Logging the server's tftp traffic shows you the file being transferred and the time that the transfer starts and finishes. Ensure that the proper vmunix file is being loaded and that the loading operations are completed correctly.
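
Before you boot the client, you can confirm that the edited entry is active and carries the -d option. The following is a minimal sketch; the function name service_has_debug is hypothetical, and it assumes the usual inetd.conf layout in which the sixth field is the server path and the seventh is the program name.

```shell
#!/bin/sh
# Succeeds (exit status 0) if the inetd.conf text on standard input has an
# active entry for the service named in $1 whose arguments include -d.
# A commented-out entry does not match because its first field starts with "#".
service_has_debug() {
    awk -v svc="$1" '
        $1 == svc {
            # Field 6 is the server path and field 7 is argv[0]; options follow.
            for (i = 8; i <= NF; i++) {
                if ($i == "-d") { found = 1 }
            }
        }
        END { exit(found ? 0 : 1) }'
}

# Example use:
#   service_has_debug tftp < /etc/inetd.conf && echo "tftpd logging is enabled"
```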

8.5.4    RIS Server Using the sshd2 Daemon

When you attempt a setld -l operation from a RIS client to the RIS server, you might see a "permission denied" message. If the RIS server and client are using Secure Shell (sshd2) and the client is configured to use secure rutils (rcp, rlogin, rsh, and so on), the RIS server and client must exchange keys before you attempt an installation. You can do this from the ris utility by using the operation documented in Section 6.9, which synchronizes the keys between the server and all registered clients.

See the Security Administration manual, ssh2(1), and ssh2_config(4) for further information.