Solving Queue Manager Problems
Topic | For More Information |
---|---|
Avoiding common problems: a troubleshooting checklist | Avoiding Common Problems: A Troubleshooting Checklist |
If the queue manager does not start | If the Queue Manager Does Not Start |
If the queuing system stops or the queue manager does not run on specific nodes | If the Queuing System Stops or the Queue Manager Does Not Run on Specific Nodes |
If the queue manager becomes unavailable | If the Queue Manager Becomes Unavailable |
If the queuing system does not work on a specific OpenVMS Cluster node | If the Queuing System Does Not Work on a Specific OpenVMS Cluster Node |
If you see inconsistent queuing behavior on different OpenVMS Cluster nodes | If You See Inconsistent Queuing Behavior on Different OpenVMS Cluster Nodes |
Reporting a queuing system problem to HP support representatives | Reporting a Queuing System Problem to HP |
Avoiding Common Problems: A Troubleshooting Checklist
To avoid the most common queuing system problems, make sure
you have met the following requirements:
Requirement | For More Information |
---|---|
QMAN$MASTER is identically defined on all nodes in the cluster. | Specifying the Location of the Queue Database |
The queue database is in the specified location. | Specifying the Location of the Queue Database |
The queue database disk is mounted and available. | Specifying the Location of the Queue Database |
The node list specified with the /ON qualifier contains a sufficient number of nodes. If you specify a node list, HP recommends that you include an asterisk (*) at the end of the node list. | If the Queue Manager Becomes Unavailable |
The system address parameters SCSNODE and SCSSYSTEMID match the DECnet for OpenVMS node name and node ID. | If the Queuing System Does Not Work on a Specific OpenVMS Cluster Node |
If the Queue Manager Does Not Start
If the queue manager
does not start when you enter the START/QUEUE/MANAGER command, the
system displays the following message:
%JBC-E-QMANNOTSTARTED, queue manager could not be started
Investigating the Problem
Search the operator log file SYS$MANAGER:OPERATOR.LOG (or look on the operator console) for messages from the queue manager and job controller that describe the problem, as follows:
$ SEARCH SYS$MANAGER:OPERATOR.LOG/WINDOW=5 QUEUE_MANAGE,-
_$ JOB_CONTROL,BATCH_MANAGE
Use the information provided with these messages to further investigate the problem, making sure you have met the requirements listed in Avoiding Common Problems: A Troubleshooting Checklist.
Cause
The usual cause of the problem is that the system cannot find
the queue master file. Often the logical name QMAN$MASTER is not defined
correctly, or the disk is not available. For example, the following messages
indicate that the queue master file does not exist in the expected
location:
%%%%%%%%%%% OPCOM 13-MAR-2000 15:53:52.84 %%%%%%%%%%%
Message from user SYSTEM on ABDCEF
%JBC-E-OPENERR, error opening SYS$COMMON:[SYSEXE]QMAN$MASTER.DAT
%%%%%%%%%%% OPCOM 13-MAR-2000 15:53:53.04 %%%%%%%%%%%
Message from user SYSTEM on ABDCEF
-SYSTEM-W-NOSUCHFILE, no such file
Correcting the Problem
On systems with multiple queue managers, search for messages
displayed by additional queue managers by including their process
names in the search string. To display information about queue managers
running on your system, use the SHOW QUEUE/MANAGERS command, as explained
in Displaying Information About Queue Managers. Correct
any problem indicated in the displayed information.
In the following example, the queue manager does not start because the device named in the first command (DUA55) does not exist. The operator log shows the error, and the command is reentered with the device name DUA5:
$ START/QUEUE/MANAGER DUA55:[SYSQUE]
%JBC-E-QMANNOTSTARTED, queue manager could not be started
$ SEARCH SYS$MANAGER:OPERATOR.LOG /WINDOW=5 QUEUE_MANAGE,JOB_CONTROL
%%%%%%%%%%% OPCOM 14-APR-2000 18:55:18.23 %%%%%%%%%%%
Message from user SYSTEM on CATNIP
%QMAN-E-OPENERR, error opening DUA55:[SYSQUE]SYS$QUEUE_MANAGER.QMAN$QUEUES;
%%%%%%%%%%% OPCOM 14-APR-2000 18:55:18.29 %%%%%%%%%%%
Message from user SYSTEM on CATNIP
-RMS-F-DEV, error in device name or inappropriate device type for operation
%%%%%%%%%%% OPCOM 14-APR-2000 18:55:18.31 %%%%%%%%%%%
Message from user SYSTEM on CATNIP
-SYSTEM-W-NOSUCHDEV, no such device available
$ START/QUEUE/MANAGER DUA5:[SYSQUE]
For more information about multiple queue managers and their process names, see Understanding Multiple Queue Managers.
If the Queuing System Stops or the Queue Manager Does Not Run on Specific Nodes
Use this section if the queue manager does not run on a specific
node in the cluster, or if the queuing system stops, especially
after one of the following actions:
Investigating the Problem
Check the operator log that was current at the time the queue
manager started up or failed over. Search the log for operator messages
from the queue manager.
On systems with multiple queue managers, also search for messages displayed by additional queue managers by including their process names in the search string. To display information about queue managers running on your system, use the SHOW QUEUE/MANAGERS command, as explained in Displaying Information About Queue Managers.
For more information about multiple queue managers and their process names, see Understanding Multiple Queue Managers.
The following messages indicate that the queue database is not in the specified location:
%%%%%%%%%%% OPCOM 4-FEB-2000 15:06:25.21 %%%%%%%%%%%
Message from user SYSTEM on MANGLR
%QMAN-E-OPENERR, error opening CLU$COMMON:[SYSEXE]SYS$QUEUE_MANAGER.QMAN$QUEUES;
%%%%%%%%%%% OPCOM 4-FEB-2000 15:06:27.29 %%%%%%%%%%%
Message from user SYSTEM on MANGLR
-RMS-E-FNF, file not found
%%%%%%%%%%% OPCOM 4-FEB-2000 15:06:27.45 %%%%%%%%%%%
Message from user SYSTEM on MANGLR
-SYSTEM-W-NOSUCHFILE, no such file
The following messages indicate that the queue database disk is not mounted:
%%%%%%%%%%% OPCOM 4-FEB-2000 15:36:49.15 %%%%%%%%%%%
Message from user SYSTEM on MANGLR
%QMAN-E-OPENERR, error opening DISK888:[QUEUE_DATABASE]SYS$QUEUE_MANAGER.QMAN$QUEUES;
%%%%%%%%%%% OPCOM 4-FEB-2000 15:36:51.69 %%%%%%%%%%%
Message from user SYSTEM on MANGLR
-RMS-F-DEV, error in device name or inappropriate device type for operation
%%%%%%%%%%% OPCOM 4-FEB-2000 15:36:52.20 %%%%%%%%%%%
Message from user SYSTEM on MANGLR
-SYSTEM-W-NOSUCHDEV, no such device available
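The two message patterns in this section can be told apart mechanically. The following Python sketch is an illustration only and is not part of OpenVMS. It assumes the operator log text has been copied to a system where Python is available and that each message sits on its own line. The function name diagnose and the LIKELY_CAUSE table are hypothetical, and the table covers only the status codes shown in this section.

```python
import re

# Secondary status codes mapped to likely causes. The mapping is taken only
# from the two message examples in this section; it is not exhaustive.
LIKELY_CAUSE = {
    "FNF": "queue database is not in the specified location",
    "NOSUCHFILE": "queue database is not in the specified location",
    "DEV": "queue database disk is not mounted",
    "NOSUCHDEV": "queue database disk is not mounted",
}

# Primary message from the queue manager (QMAN) or job controller (JBC).
OPENERR = re.compile(r"%(?:QMAN|JBC)-E-OPENERR, error opening (\S+)")
# Secondary messages start with "-", for example "-RMS-E-FNF, file not found".
SECONDARY = re.compile(r"^-[A-Z]+-[A-Z]-([A-Z]+),", re.MULTILINE)


def diagnose(log_text):
    """Return (file_spec, likely_cause) for the first OPENERR, or None."""
    match = OPENERR.search(log_text)
    if match is None:
        return None
    file_spec = match.group(1)
    # Only the text after the OPENERR line can hold its secondary messages.
    for code in SECONDARY.findall(log_text[match.end():]):
        if code in LIKELY_CAUSE:
            return file_spec, LIKELY_CAUSE[code]
    return file_spec, "unknown; read the secondary messages"
```

A log that contains several unrelated errors would need the scan limited to the messages that immediately follow each OPENERR; this sketch reports only the first one.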
Cause
The queuing system does not work correctly under the following
circumstances:
In general, the queuing system shuts down completely if the queue manager encounters a serious error and forces a crash or failover twice in a row within two minutes on the same node. As a result, the queuing system might have stopped, or it might still be running if the queue manager moved to another node from which it can access the database after the original failed startup.
Correcting the Problem
Perform the following steps:
If the Queue Manager Becomes Unavailable
The queue manager becomes unavailable if it does not start
or has stopped running.
Investigating the Problem
To investigate the problem, enter SHOW CLUSTER to see whether the
nodes in the queue manager's failover node list are available.
Cause
An insufficient failover node list might have been specified
for the queue manager, so that none of the nodes in the failover
list is available to run the queue manager.
Correcting the Problem
Make sure the queue manager list contains a sufficient number
of nodes by entering START/QUEUE/MANAGER with the /ON qualifier
to specify a node list appropriate for your configuration.
If you are in doubt about what nodes to specify, HP recommends that you specify an asterisk (*) wildcard character as the last node in the list; the asterisk indicates that any remaining node in the cluster can run the queue manager. Specifying the asterisk prevents your queue manager from becoming unavailable because of an insufficient node list.
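The effect of the trailing asterisk can be illustrated with a small model. The following Python sketch is a simplified illustration of the rule described above, not a description of how the queue manager is implemented. The function name can_fail_over and the node names in the usage note are hypothetical.

```python
def can_fail_over(node_list, available_nodes):
    """Return True if at least one available node may run the queue manager.

    node_list: node names given with /ON, optionally ending in "*".
    available_nodes: names of the cluster nodes that are currently up.
    """
    available = {name.upper() for name in available_nodes}
    for entry in node_list:
        if entry == "*":
            # The asterisk admits any remaining node in the cluster.
            return bool(available)
        if entry.upper() in available:
            return True
    return False
```

In this model, a list of NODEA and NODEB leaves the queue manager with nowhere to run when only NODEC is up, whereas the same list followed by an asterisk still allows NODEC to run it.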
If the Queuing System Does Not Work on a Specific OpenVMS Cluster Node
Use this section if the queuing system does not work on a
specific node when it starts up.
Investigating the Problem
Perform the following steps:
%%%%%%%%%%% OPCOM 4-FEB-2000 15:36:49.15 %%%%%%%%%%%
Message from user SYSTEM on ZNFNDL
%QMAN-E-COMMERROR, unexpected error #5 in communicating with node CSID 000000
%%%%%%%%%%% OPCOM 4-FEB-2000 15:36:49.15 %%%%%%%%%%%
Message from user SYSTEM on ZNFNDL
-SYSTEM-F-WRONGACP, wrong ACP for device
$ RUN SYS$SYSTEM:SYSMAN
SYSMAN> PARAMETERS SHOW SCSSYSTEMID
Parameter Name Current Default Min. Max. Unit Dynamic
-------------- ------- ------- ------- ------- ---- -------
SCSSYSTEMID 19941 0 -1 -1 Pure-numbe
SYSMAN> PARAMETERS SHOW SCSNODE
Parameter Name Current Default Min. Max. Unit Dynamic
-------------- ------- ------- ------- ------- ---- -------
SCSNODE "RANDY " " " " " "ZZZZ" Ascii
SYSMAN> EXIT
$ RUN SYS$SYSTEM:NCP
NCP> SHOW EXECUTOR SUMMARY
Node Volatile Summary as of 5-FEB-2000 15:50:36
Executor node = 19.45 (DREAMR)
State = on
Identification = DECnet for OpenVMS V7.2
NCP> EXIT
$ WRITE SYS$OUTPUT 19*1024+45
19501
Cause
If the DECnet node name and node ID do not match the SCSNODE
and SCSSYSTEMID system address parameters, IPC (interprocess communication,
an operating system internal mechanism) cannot work properly and
the affected node will not be able to participate in the queuing
system.
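The comparison in the sample session can be worked through as a short calculation. The following Python sketch is an illustration only. It uses the same area*1024+node arithmetic as the WRITE SYS$OUTPUT command in the session and the values displayed by SYSMAN and NCP there. The function names decnet_address_to_id and check_node are hypothetical.

```python
def decnet_address_to_id(area, node):
    """Convert a DECnet area.node address to its single-number form."""
    return area * 1024 + node


def check_node(scsnode, scssystemid, decnet_name, decnet_address):
    """Return a list of mismatches between SCS parameters and DECnet values."""
    area, node = (int(part) for part in decnet_address.split("."))
    expected_id = decnet_address_to_id(area, node)
    problems = []
    # SYSMAN pads SCSNODE with blanks, so compare the stripped names.
    if scsnode.strip().upper() != decnet_name.strip().upper():
        problems.append(
            f"SCSNODE {scsnode.strip()} does not match DECnet name {decnet_name}"
        )
    if scssystemid != expected_id:
        problems.append(
            f"SCSSYSTEMID {scssystemid} does not match DECnet ID {expected_id}"
        )
    return problems


# Values from the sample session: both the name and the ID disagree.
for problem in check_node("RANDY ", 19941, "DREAMR", "19.45"):
    print(problem)
```

With the sample values, the DECnet address 19.45 converts to 19501, which differs from the SCSSYSTEMID value of 19941, and the SCSNODE name RANDY differs from the DECnet node name DREAMR, so both checks report a mismatch.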
Correcting the Problem
Perform the following steps:
If You See Inconsistent Queuing Behavior on Different OpenVMS Cluster Nodes
Use this section if you see the following symptoms:
Investigating the Problem
Perform the following steps:
%%%%%%%%%%% OPCOM 4-FEB-2000 14:41:20.88 %%%%%%%%%%%
Message from user SYSTEM on MANGLR
%JBC-E-OPENERR, error opening BOGUS:[QUEUE_DIR]QMAN$MASTER.DAT;
%%%%%%%%%%% OPCOM 4-FEB-2000 14:41:21.12 %%%%%%%%%%%
Message from user SYSTEM on MANGLR
-RMS-E-FNF, file not found
Cause
This problem may be caused by different definitions for the
logical name QMAN$MASTER on different nodes in the cluster, causing
multiple queuing environments. You typically find this problem in
OpenVMS Cluster environments when you have just added a system disk
or moved the queuing database.
Correcting the Problem
Perform the following steps:
1 This manual has been archived.