This chapter provides information on operating system features that are designed to support specific models and classes of AlphaServer processor. It also describes configuration restrictions that are permanent, and specific to certain models of processor. The following information is provided:
Processor-specific information for older platforms (Section 6.1)
Processor-specific information for the AlphaServer TS202c (Section 6.2)
Configuring logical partitions on an AlphaServer GS140 (Section 6.3)
AlphaServer 1000 and 1000A configuration information (Section 6.4)
AlphaServer GS-series configuration information (Section 6.5)
Personal Workstation 433au, 500au, and 600au systems (Section 6.6)
The following notes apply to older systems:
For information about configuring the operating system on Alpha VME single-board computers (SBCs) and PCI/ISA EBMnn modular SBCs, see the System Configuration Supplement: OEM Platforms manual. (The PCI/ISA modular systems and components product family was formerly known as DIGITAL Modular Computing Components, or DMCC.)
Support for the VME bus will be retired in a future release of the operating system. This includes retirement of systems and options that use this bus technology.
For Tru64 UNIX and its software supplements, the supported version of the EISA Configuration Utility (ECU) is Version 1.10 or higher. If your system is configured with an EISA bus, update the ECU to this supported version.
The AlphaServer TS202c system is a dual-processor system that you can configure with up to 16 GB of memory. The system contains two cPCI slots, but is configured without disk storage devices. The system is intended to run a network-bootable standalone system (SAS) kernel.
This section contains information that is specific to the AlphaServer TS202c system for Tru64 UNIX Version 5.1B. This information was first published in May 2001 as the Release Notes and Installation Instructions for AlphaServer TS202c Systems. In this release, the installation information is integrated into the Tru64 UNIX Installation Guide.
The following information is provided:
Specific features of the operating system that you can configure on the AlphaServer TS202c (Section 6.2.1)
Current restrictions on using certain operating system features (Section 6.2.2)
Using the mksas command to create network-bootable kernels for diskless systems (Section 6.2.3)
Configuring exempt memory regions (Section 6.2.4)
Configuring the AlphaServer TS202c (Section 6.2.5)
6.2.1 Operating System Features
The following features were first added to the Tru64 UNIX Version 5.1 base operating system to support the release of the AlphaServer TS202c:
Network-bootable standalone kernel that runs on an AlphaServer TS202c system in a diskless environment. See Section 6.2.3 and mksas(8).
Exempted memory regions. See Section 6.2.4 and the dump_exmem* attributes documented in sys_attrs_generic(5).
Saving system core dump files to memory and writing them to a remote host. See savecore(8) and vmzcore(7).
Specifying the size and the range of a memory segment that is allocated to a shared object when it is loaded into memory by the run-time loader. See loader(5).
New hardware error codes that support new operating system features and new hardware features. See binlogd(8).
The system has detected a memory error. The system has detected a second fatal error during the processing of a fatal error. An error detected by the memory controller is a correctable error, a noncorrectable error, or a double-bit error.
120 - Reporting of correctable errors is disabled because the single-bit error reporting threshold has been reached.
6.2.2 Restrictions on Operating System Features
The following usage restrictions apply only to Tru64 UNIX Version 5.1B:
The AlphaServer TS202c system firmware must be at revision level Version X6.0-0 or later to run a network-booted kernel.
This release of the operating system does not support installation from a RIS server. This software can only be installed from the distribution CD-ROM.
The Event Manager (EVM) can fail if the rc.config file contains blank lines. The failures that may occur include the EVM daemon dumping core or the system displaying the following errors during the boot process:
evmwatch: Failed to create EVM listening connection
evmwatch: Error: Connection error
Failed to create EVM listening connection
evmwatch: Error: Connection error
S97evm: Communication with syslogd is not functioning
evmpost: Failed to create EVM posting connection
evmpost: Error: Connection error
evmpost: Failed to create EVM posting connection
evmpost: Error: Connection error
evmpost: Failed to create EVM posting connection
evmpost: Error: Connection error
SysMan authentication server started
SysMan Station Server (smsd) started
The system is ready.
When you change the ENVMON_HIGH_THRESH attribute value using the /usr/sbin/envconfig command line utility, the following error message is displayed:
env_high_temp_thresh: attribute does not allow this operation
Despite the message, the attribute value is changed and the daemon is restarted.
To view the correct attribute values, use the following command:
# /usr/sbin/envconfig -q
The system incorrectly logs Correctable Environmental Machine Check errors (Error 686) in the binary error log as Uncorrectable Environmental Machine Check errors (Error 682). This problem is specific to the AlphaServer TS202c systems and will be corrected in a future release of the operating system.
A binary error log (binlog) event with type 113 is reported as a Double Error Halt event when reported by the Event Manager (EVM), but is reported as a Console Data Log event by the analysis utility. EVM might report this event by mailing it to the root user and by displaying it on the system console.
The event is actually a Console Data Log event. This event type is posted when any of several different errors occur, including double error halts, uncorrectable environmental errors, and platform-specific system faults. Refer to the event's translation data for information about its cause.
When the system boots, the hwautoconfig utility configures subsystems but does not create device special files for the subsystems. To create the device special files for a subsystem, unconfigure the subsystem by using the sysconfig -u command, and then reconfigure the subsystem by using the sysconfig -c command.
Alternatively, you can work around this problem by adding the subsystem names to the environment variable SUBSYSTEM_LIST in the autosysconfig file before running the hwautoconfig utility.
An idle AlphaServer TS202c system might panic if kmem_debug is enabled and the kmem_debug_leak flag is set. To avoid this problem, do not set the kmem_debug_leak flag.
When 670 and OS Panic events are logged on an AlphaServer TS202c system, the SEL event information is not logged correctly. These events are logged with byte 3 set to DF, instead of C0 for 660/670 entries and C1 for OS Panics.
The following events overlap because of the incorrect logging:
x40 PAL detected Bugcheck Error
x43 EV68 detected Dcache Tag Parity Error
x45 EV68 detected Duplicate Dcache Tag Parity Error
x46 EV68 detected Bcache Tag Parity Error
x4A EV68 detected Double Bit ECC Memory Fill Error
x5C EV68 detected Second Dcache Store Data ECC Error

x40 ku_recvfrom - not SO_NAME
x45 m_copym offset
x46 m_copym sanity
x43 nfs3_bio: write count
x4A nfs_dgreceive 3
x5C rtinithead
These events are logged correctly in the binary error log.
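The overlap can be illustrated with a short lookup sketch. It is not part of the operating system; it assumes that the first list above holds the 660/670 entries (expected byte 3 value C0) and the second list holds the OS Panic entries (expected byte 3 value C1), and that the xNN values are hexadecimal event codes. The names HARDWARE_EVENTS, OS_PANIC_EVENTS, and candidate_events are hypothetical.

```python
# Tables transcribed from the two lists above. Which list belongs to which
# byte 3 value is an assumption made for this sketch.
HARDWARE_EVENTS = {  # expected byte 3 = 0xC0 (660/670 entries)
    0x40: "PAL detected Bugcheck Error",
    0x43: "EV68 detected Dcache Tag Parity Error",
    0x45: "EV68 detected Duplicate Dcache Tag Parity Error",
    0x46: "EV68 detected Bcache Tag Parity Error",
    0x4A: "EV68 detected Double Bit ECC Memory Fill Error",
    0x5C: "EV68 detected Second Dcache Store Data ECC Error",
}
OS_PANIC_EVENTS = {  # expected byte 3 = 0xC1 (OS Panics)
    0x40: "ku_recvfrom - not SO_NAME",
    0x43: "nfs3_bio: write count",
    0x45: "m_copym offset",
    0x46: "m_copym sanity",
    0x4A: "nfs_dgreceive 3",
    0x5C: "rtinithead",
}


def candidate_events(byte3, code):
    """Return every description that a SEL entry with this code could mean."""
    if byte3 == 0xC0:
        tables = [HARDWARE_EVENTS]
    elif byte3 == 0xC1:
        tables = [OS_PANIC_EVENTS]
    else:
        # Any other value (such as the incorrectly logged 0xDF) does not
        # identify the table, so both meanings remain possible.
        tables = [HARDWARE_EVENTS, OS_PANIC_EVENTS]
    return [table[code] for table in tables if code in table]
```

With byte 3 set to DF, candidate_events(0xDF, 0x43) returns two descriptions, which is why the binary error log should be consulted to tell the events apart.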
The following usage notes apply only to the mksas command when used on the AlphaServer TS202c system.
6.2.3.1 Configuration File Requirement
You use the mksas utility to generate the bootable image and in-memory file system. This image lets you boot over the network and run diskless. Using mksas requires a configuration file for the target system. For example, if the target host is TS2ONE, a configuration file for TS2ONE must be located on the host system, TS2TWO, for the host system to generate the bootable image.
6.2.3.2 Restriction on Using the su Command
When you run a kernel that you created by using the mksas utility, you cannot use the su command to substitute a user ID to any user ID other than the root user. (This problem is caused by an incorrect umask setting in the mksas utility.)
6.2.3.3 Default Configuration File Does Not Include ftp
To configure the system so that it writes core dump files to a remote system, you must edit the default configuration file and include the ftp command as an option. For more information on editing the default configuration file, see Section 6.2.3 and mksas(8).
6.2.3.4 The mksas Command Generates Warning Messages
If the configuration file that you specify with the mksas command contains duplicate entries, warning messages similar to the following are displayed:
Entry no: 160 -> /usr/lib/sabt/etc/mksas.inittab /etc/inittab
WARNING Entry no: 160. Duplicated entry. The file /etc/inittab in the second field is already given earlier.
Entry no: 161 -> /etc/rc.config /etc/rc.config
WARNING Entry no: 161. Duplicated entry. The file /etc/rc.config in the second field is already given earlier.
Entry no: 162 -> /etc/rc.config.common /etc
WARNING Entry no: 162. Duplicated entry. The file/etc/rc.config.common in the second field is already given earlier.
Entry no: 163 -> /etc/hosts /etc/
WARNING Entry no: 163. Duplicated entry. The file /etc/hosts in the second field is already given earlier.
You can ignore these messages. The kernel will build without errors.
6.2.3.5 Example addfile Available
An example /usr/lib/sabt/etc/addlist_file is supplied with the system. You use the addlist_file to include additional files in the miniroot file system. The example is located in the /usr/lib/sabt/etc directory and is named mksas.inv.
6.2.3.6 The -t Option Deletes the Workspace Directory When mksas Completes
You use the -t directory_name option with the mksas command to specify a temporary work space that is used during the kernel build. When the mksas utility completes, it deletes both the temporary work space and the directory containing the work space. If you use the -t option, do not specify as the directory_name any location containing files that you do not want to delete.
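One way to guard against accidental deletion is to check the directory before passing it to the -t option. The following is a minimal sketch run on the host system, not a feature of mksas; the function name is_safe_workspace is hypothetical, and the check assumes that a directory that does not exist yet, or that is empty, is safe to hand over.

```python
import os


def is_safe_workspace(path):
    """Return True if path does not exist yet or is an empty directory."""
    if not os.path.exists(path):
        return True
    # An existing file, or a directory that already holds entries, would
    # put existing data at risk when the work space is deleted.
    return os.path.isdir(path) and not os.listdir(path)
```

A wrapper script might call is_safe_workspace() on the intended directory_name and refuse to start the build when it returns False.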
6.2.4 Allocating Exempt Memory
Exempt memory is memory managed by UNIX, but not included in testing and core dumps. You can allocate an exempt region of memory by using contig_malloc(). This can be done in a pseudodevice driver in the postconfig_callback routine.
The process of allocating exempt memory is similar to a normal contiguous memory allocation (malloc), with the following exceptions:
You can track the memory usage by type by using the vmstat -M command. Find the section of display output that is formatted as shown in the following example:
Memory usage by Type and Number of bytes being used
ADVFS      = 7088016    KERN       = 1317456    SONAME     = 11264
AIO        = 8192       KERNEL TBL = 456912     STREAMS    = 1856
ANON       = 111200     KEVM       = 65536      STRHEAD    = 2560
. . .
The addrlimit field is overloaded to allow you to specify a starting address. If you specify an address that is not available, the call returns a failure. (The alignment parameter is ignored in this case.) A call will appear similar to the following example:
addr = contig_malloc(size, alignment, start_addr, M_EXEMPT, flags);
                      |       |           |
                      |       |           Base address to allocate from or
                      |       |           zero if no preference.
                      |       Allocation alignment if no start_addr or zero to
                      |       default to base page size.
                      Allocation size in bytes, must be specified.
You can also configure exempt memory by defining the cma_dd subsystem, using a sysconfigtab entry similar to the following:
CMA_option = size - 0x40000, alignment - 0, Addrlimit - 0, Type - 150, Flag - 1
The arguments are the same as in the previous call. The value 150 corresponds to the contents of malloc.h. The returned addr has the low bit set if there is a bad page in the allocated region.
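The field layout of the CMA_option entry and the low-bit convention on the returned address can be illustrated outside the kernel. The following sketch only models the text format and the bit test; it does not call any kernel interface. The function names are hypothetical, and clearing the low bit to recover the base address is an assumption that the source does not state.

```python
def parse_cma_option(line):
    """Parse 'CMA_option = name - value, ...' into a dict of integers."""
    _, _, rhs = line.partition("=")
    fields = {}
    for item in rhs.split(","):
        name, _, value = item.partition("-")
        # Base 0 accepts both decimal (150) and hexadecimal (0x40000) text.
        fields[name.strip().lower()] = int(value.strip(), 0)
    return fields


def split_exempt_addr(addr):
    """Separate the bad-page flag (low bit) from the returned address.

    Masking off the low bit to obtain the base address is an assumption.
    """
    return addr & ~1, bool(addr & 1)
```

For the entry shown above, parse_cma_option() yields a size of 0x40000 bytes, a type of 150, and a flag value of 1.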
6.2.5 Configuring the AlphaServer TS202c
This section provides the following configuration instructions for the AlphaServer TS202c:
How to create a network-bootable kernel (Section 6.2.5.1)
How to configure the server and clients (Section 6.2.5.2)
6.2.5.1 Creating a Network-Bootable Kernel
You use the mksas utility to create the network-bootable (SAS) kernel. By default, the SAS kernel is created with the minimum number of files and directories required to boot the system to single-user mode, is contained in a memory file system, and is intended to run as an in-memory file system. You can also build a disk-bootable version of the SAS kernel for testing and debugging. The mksas utility uses the kernel configuration file of the system that the kernel will run on to build the kernel; it then adds additional files.
The following steps provide an example of how to build the SAS kernel and boot it across the network. Debugging steps are also provided.
You configure a host system so that it is running a version of the Tru64 UNIX operating system that supports the mksas utility. The host system must also contain a copy of the AlphaServer TS202c kernel configuration file.
You create the network-bootable SAS kernel and miniroot file system on the host system. For debugging purposes, you also create a disk-bootable copy of the same SAS kernel and miniroot file system.
You configure the host system to run BOOTP and tftp and identify client systems that will boot the SAS kernel across the network.
On the client system, you issue a boot command specifying the network interconnect as the boot device, for example, >>>boot eia0.
To add features and files to the miniroot file system, you use the -a option and specify an addfile_list, or list of additional files, with the mksas command. See mksas(8) for a complete list of the options available with this command.
When you add files to the default configuration, you must include any files that are dependencies. For example, if you intend to write crash-related files to a remote host using the savecore command, you must include /usr/sbin/ftp and /usr/sbin/ftpd in your configuration.
While you add files, you can determine the size of the file system using the -C option of the mksas command. This option determines the size of the miniroot file system, but does not include the size of the kernel. Normally, a kernel is approximately 13 MB. You must add this amount to the size that is returned by the -C option to determine the total size of the file system. AlphaServer systems support booting an image up to 92 MB.
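The size calculation described above amounts to a simple sum and comparison, sketched below. The 13 MB and 92 MB figures come from the text; the function names are hypothetical, and the sketch assumes that the size reported by the -C option has already been converted to megabytes.

```python
KERNEL_MB = 13      # approximate kernel size given in the text
MAX_IMAGE_MB = 92   # largest image that AlphaServer systems can boot


def total_image_mb(miniroot_mb, kernel_mb=KERNEL_MB):
    """Add the kernel size to the miniroot size reported by -C."""
    return miniroot_mb + kernel_mb


def fits_boot_limit(miniroot_mb, kernel_mb=KERNEL_MB):
    """Return True if the combined image stays within the boot limit."""
    return total_image_mb(miniroot_mb, kernel_mb) <= MAX_IMAGE_MB
```

For example, a 40 MB miniroot gives a 53 MB image, which is within the limit, whereas an 80 MB miniroot gives 93 MB and is not. Because the kernel size is only approximate, leave some margin.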
6.2.5.2 Configuring the Server and Clients
After you have created the network-bootable SAS kernel, you must configure a server system that the SAS kernel can be booted from and configure client systems to boot and run the SAS kernel. The server system requires BOOTP support and the tftp program, including support for RFCs 1782 and 1783.
To configure the server, perform the following steps:
Edit the /etc/inetd.conf file so that joind and tftp service boot requests from the client system, and identify the directory that the SAS kernel image will be booted from. To do this, uncomment the following lines and append the correct directory to the end of the tftp line:
#tftp dgram udp wait root /usr/sbin/tftpd tftpd /tmp
#bootps dgram udp wait root /usr/sbin/joind joind
Remove the reference to the /tmp directory and replace it with the directory in which you will store the SAS kernel.
Use the rcmgr command to set the run-time configuration variables for joind:
# rcmgr set JOIND yes
# rcmgr set JOIND_FLAGS ""
Create or edit the /etc/bootptab file. This file is a text file containing information that the server needs to boot a remote client. Use the xjoin GUI to edit or create the /etc/bootptab file. See xjoin(8) and bootptab(4) for more information. Add the following lines to this file:
.ris.dec:hn:vm=rfc1048:sm=255.255.255.0:
.ris0.alpha:tc=.ris.dec:bf=/usr/mksas.kernel:
anchor:tc=.ris0.alpha:ht=ethernet:gw=16.142.160.1:ha=00508B6B32CC:
ip=16.142.160.55:
The .ris.dec entry defines characteristics common to all clients. The fields specify the following:
hn: - Informs the boot server to send the name of the client system to the client when it makes a boot request.
vm: - Defines vendor-specific information.
The .ris0.alpha entry defines characteristics common to all clients using the bootp server. The fields specify the following:
tc: - Lets you follow pointers back to common entries. For example, the tc entry for .ris0.alpha points to the .ris.dec entry. The .ris.dec entry contains the common hardware type (ht) and vendor-specific (vm) information. The .ris0.alpha entry itself contains common information about the boot file location.
bf: - Specifies the name of the boot file.
The host name entry in this example defines characteristics for a specific client. The fields specify the following:
tc: - Points to .ris0.alpha, which contains its boot file information. The .ris0.alpha entry in turn points back to .ris.dec, which contains relevant hardware type and vendor-specific information.
ht: - Defines the client's hardware type as ethernet, fddi, or ieee802 (for Token Ring).
ha: - Specifies the client's network hardware address.
gw: - Defines the gateway.
Start the joind server process using the following command:
/sbin/init.d/dhcp start
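The way the tc field chains the /etc/bootptab entries in this procedure can be illustrated with a small parser. This is only a sketch of the lookup logic and not how joind is implemented. It assumes that each entry sits on a single line (the anchor entry from the example is joined into one line here) and that a field set in a more specific entry overrides the same field inherited through tc. The function names are hypothetical.

```python
SAMPLE = """\
.ris.dec:hn:vm=rfc1048:sm=255.255.255.0:
.ris0.alpha:tc=.ris.dec:bf=/usr/mksas.kernel:
anchor:tc=.ris0.alpha:ht=ethernet:gw=16.142.160.1:ha=00508B6B32CC:ip=16.142.160.55:
"""


def parse_bootptab(text):
    """Map each entry name to a dict of its fields."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, *fields = [part for part in line.split(":") if part]
        tags = {}
        for field in fields:
            key, sep, value = field.partition("=")
            # Fields without a value, such as hn, are treated as flags.
            tags[key] = value if sep else True
        entries[name] = tags
    return entries


def resolve(entries, name):
    """Follow tc pointers and merge the inherited fields."""
    tags = dict(entries[name])
    parent = tags.pop("tc", None)
    if parent is None:
        return tags
    merged = resolve(entries, parent)
    merged.update(tags)  # the more specific entry wins
    return merged
```

Resolving the anchor entry with resolve(parse_bootptab(SAMPLE), "anchor") collects the boot file from .ris0.alpha and the hn, vm, and sm fields from .ris.dec, together with the client's own ht, gw, ha, and ip fields.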
To boot the client, you use the following command:
>>> boot eia0
The device name eia0 is the name of the network interconnect. This device name may be different on some AlphaServer systems. You can use the following console command to determine the name:
>>> show dev
If you are booting a SAS kernel that is larger than 64 MB, you need to set the eia0_TFTP_blocksize console variable to 1450:
>>> set eia0_TFTP_blocksize 1450
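The relationship between block size and the largest transferable image can be estimated with the following arithmetic sketch. It assumes the 16-bit block numbers defined for TFTP in RFC 1350, with no block-number rollover, and it is an inference rather than a statement from the source that this is the reason for the 64 MB threshold; under these assumptions, a 1024-byte block size would correspond to roughly that figure. The function names are hypothetical.

```python
MAX_BLOCKS = 0xFFFF  # largest 16-bit block number; no rollover assumed


def max_transfer_bytes(block_size):
    """Largest file that fits, given that the final block must be short."""
    return MAX_BLOCKS * block_size - 1


def needs_larger_blocks(image_bytes, block_size):
    """Return True if the image cannot be sent with this block size."""
    return image_bytes > max_transfer_bytes(block_size)
```

Under these assumptions, a 70 MB image does not fit with 1024-byte blocks but does fit with 1450-byte blocks.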
You can write crash dump files to exempt memory or to a remote host. Exempt memory dumps are performed by setting the dump_to_memory system attribute and then by identifying the portion of exempt memory or all of the memory that the kernel dump subsystem will use in the event of a memory core dump. The system attribute dump_exmem_addr identifies the start of the dumpable region (as either a virtual or physical address). The dump_exmem_size system attribute specifies the number of bytes available. By default, the content of exempt memory is not included in the dump. You can change this by setting the system attribute dump_exmem_include. See sys_attrs_generic(5) for more information on these system attributes.
To write crash-related files to a remote host, you use the -r option of the savecore command. When using this option, you specify a host, a username, a password, and the file that is to be written to the remote host. You can use a configuration file to specify the ftp information and the name of the file that will be written. Using a configuration file allows you to keep the account name and password private. See savecore(8) for more information.
6.3 AlphaServer GS140 Logical Partitions
A single AlphaServer GS140 system can be divided into a maximum of three logical partitions. Each partition is allocated its own dedicated set of hardware resources. A partition is viewed by the operating system and applications software as a single AlphaServer GS140 system.
Logical partitions employ a share nothing model. That is, all hardware resources (processors, memory, and I/O) allocated to a partition are isolated to that partition. Only the instance of an operating system that is running on a partition can access that partition's hardware resources.
You can use logical partitions to reduce floor space requirements and power consumption, or to improve heat dissipation (by reducing computer room cooling requirements). For example, two departments in an enterprise with different computing requirements might run different applications and require different configuration and tuning of the operating system. Logical partitioning allows you to configure a single AlphaServer GS140 computer to meet the computing needs of both departments.
6.3.1 Hardware Requirements
The hardware requirements for a partition are:
An AlphaServer GS140 with a minimum of six center plane slots
Only the AlphaServer GS140 6-525 is supported. See the Systems and Options Catalog for information on newly supported systems. The logical partitions feature is supported on the AlphaServer GS140 system. An AlphaServer 8400 (upgraded to a GS140 by replacing the processor modules) is also supported.
A console device
This console device can be a character cell video terminal or serial line connection to another system or terminal concentrator. Supported graphics devices can be used by the operating system's windowing software, but not as the console device.
The restriction of a graphics device to the windowing software (which cannot be the console device) applies only to secondary partitions. A supported graphics device can be the console for the primary partition (partition 0). To use a graphics console, set the value of the console environment variable to BOTH before initializing partitions. For example:
P00>>> set console BOTH
The AlphaServer GS140 includes one console serial port. This port becomes the console for the first partition (partition 0). Each additional partition requires the installation of a KFE72 option. This option includes two serial ports (port 0 is the console port). See the hardware documentation for the KFE72 option information and installation instructions.
One dual processor CPU module
One I/O Port (IOP) module
The minimum requirement for a partition is one IOP module. A partition might include a second IOP module. The maximum number of IOP modules for the entire system (the sum of all partitions) is three.
XMI hardware might be used with logical partitions. However, XMI controllers and devices must be configured into partition zero (0). This is a console firmware restriction.
One memory module
The minimum memory size supported for a partition is 512 MB. However, applications running in a partition might require more than the 512 MB minimum memory.
A software load source device (CD-ROM drive or network adapter)
A minimum AlphaServer GS140 console firmware revision level of Version 5.4-19
When installing and configuring logical partitions on a system, see the Release Notes for the operating system release that you are installing, and update the firmware revision if required. See the Installation Guide for information on updating the firmware.
The remainder of this section describes the tasks you perform to configure partitions, and provides information about managing a partitioned AlphaServer GS140 system. The topics covered describe the following activities:
Preparing to install and operate a partitioned system
Verifying system hardware is properly configured for partitions
Verifying the revision level of your system's console firmware and upgrading the firmware if necessary
Configuring partitions for your system by creating the logical partitioning console firmware environment variables (EVs)
Initializing partitions and bootstrapping secondary partitions to console mode (the P##>>> prompt)
Installing UNIX and applications software to each partition
Operating and managing a partitioned system
6.3.2 Preparing to Install and Operate Logical Partitions
Read the hardware documentation supplied with your system to become familiar with the operation of your system. Of particular interest for partitioning are the operation of the system's OFF/SECURE/ENABLE/RESET switch and several console commands (such as boot, create, init, set, and show).
Before setting up your partitions, make sure the system hardware is fully installed and passes all self-test diagnostics.
Note
Before installing the operating system software to any partition, read all of the partitioning procedure. There are certain aspects of managing a partitioned system you must be aware of prior to making the system operational. Precautions must be taken to prevent actions by the console on a partition from interfering with operation of another partition.
The next section describes logical partitioning terms used throughout the rest of this document. After reviewing these terms, proceed to Section 6.3.3.
6.3.2.1 Definition of Commonly Used Terms
Familiarize yourself with the following terms before configuring your partitions.
logical partition
A logical grouping of hardware resources (CPU, I/O, MEMORY, and console) within a single system for exclusive use by an instance of the operating system. A single physical system might have several logical partitions, each running a separate instance of the operating system.
primary partition
Partition number zero. The partition with the active console terminal if partitioning is disabled (that is, all hardware resources are in one partition).
secondary partition
Partition with a number greater than zero. One of the partitions that displays the console prompt after the lpinit command is executed on the primary partition's console.
primary console
The console terminal connected to the primary partition. The only active console terminal if partitions are disabled.
secondary console
The console terminal connected to a secondary partition. Active only if partitions are enabled.
power switch
The four-position switch located on the AlphaServer GS140 control panel. The four positions perform the following functions:
OFF - System power (all partitions) is off.
SECURE - Power is applied to the system (all partitions). The primary console's ctrl/p halt function is disabled.
ENABLE - Power is applied to the system (all partitions). The primary console's ctrl/p halt function is enabled.
RESET - This is a momentary position. Moving the switch to RESET and then releasing it causes a complete initialization of the system. All secondary partitions are immediately terminated. The primary partition displays the normal power-on self-test messages and enters console mode.
console prompt
The prompt displayed on the console terminal of a partition to indicate the console firmware is ready to accept commands:
P##>>>
Where ## is the processor number on which the console firmware is currently executing. This is normally the primary processor of the current partition as shown in the following examples:
For partition 0 with CPU 0:
P00>>>
For partition 1 with CPU 4:
P14>>>
ctrl/p halt
Holding down the control key and typing the letter p causes the primary processor for partition 0 to halt and enter console mode (P00>>> prompt). This is possible only on the primary console. The halt operation can be disabled by setting the power switch to the SECURE position. The halt operation is ignored on secondary partitions.
P##>>>stop N
Typing stop N at the console prompt (P##>>>) causes processor N to halt and enter console mode. Issuing this command on the primary console terminal can stop any processor in any partition. For example, if the primary processor for partition 1 is processor 4, the following command causes processor 4 to enter console mode:
P00>>>stop 4
P##>>>continue N
If processor N entered console mode as the result of a ctrl/p halt or stop N command, typing continue N at the P##>>> prompt causes the processor to resume program execution. For example:
P##>>>continue 4
If you halt a single processor, you can omit the processor number (N).
P##>>>init
Typing init at the console (P##>>>) prompt of any partition causes a complete reinitialization of the entire system. All active partitions are immediately terminated and the system is reset (as if the power switch is momentarily moved to the RESET position). If partitions are enabled, the console requests verification of the init command by displaying the following prompt:
Do you really want to reset ALL partitions? (Y/<N>)
Type Y to complete the init command or N to cancel it.
Each of the following sections describes a task you perform to partition your AlphaServer GS140 system. Each task is performed in the order presented, although some tasks might be skipped in certain cases.
If you have read this section previously, and require only a summary of the normal sequence of startup commands, they are:
P00>>> set lp_count n
(Set the count of n logical partitions)
P00>>> init
(Initialize the primary partition)
P00>>> lpinit
(Start the secondary partitions)
P00>>> boot
(Boot the primary partition)
P##>>> boot
(Boot the secondary partitions)
Improper operation results if the lpinit command is omitted. The console firmware prevents this by automatically executing the lpinit command if the lp_count is nonzero and a boot command is issued on the primary partition's console terminal.
On startup, each secondary partition displays configuration information. It is possible for this message to be preceded by a series of Y characters as described in Section 6.3.3.8. This is not an error and can be ignored.
6.3.3.1 Verifying Your System's Hardware Configuration
You need to verify that your hardware is properly configured for logical partitioning. You also need to record certain information about your hardware configuration for later use (when you configure partitions). Follow these steps to verify your hardware configuration:
Power on your system by setting the power OFF/ENABLE switch to the ENABLE position.
Note
A newly installed system (with factory installed software) or an existing system with the auto_action console EV set to BOOT or RESTART automatically boots the operating system disk after the hardware's self-test is completed. In this case, you need to interrupt the automatic boot by typing ctrl/c at the console terminal. If you cannot interrupt the automatic boot, allow the operating system to boot completely, then shut it down (do not type ctrl/p to halt the automatic boot). See the Installation Guide for information on factory installed software before you attempt to set up logical partitions. The factory installed software disk might be used as the system disk for one of the partitions. See Section 6.3.6 for information on installing the operating system.
After a short delay (about 15 seconds) configuration information similar to the following example is displayed on the primary console screen:
F E D C B A 9 8 7 6 5 4 3 2 1 0 NODE #
A A M . M P P P P TYP
o o + . + ++ ++ ++ ++ ST1
. . . . . EE EE EE EB BPD
o o + . + ++ ++ ++ ++ ST2
. . . . . EE EE EE EB BPD
+ + + . + ++ ++ ++ ++ ST3
. . . . . EE EE EE EB BPD
. + + + . + + + C0 PCI +
. . . . . . + . . + . . + + C1 XMI +
. . . . . . . . . . . . . . . . C4
+ . + + . . . . + . . + C5 PCI +
. . . . . . . . . . . . . . . . C6
+ . + + . + + + . . . + C7 PCI +
. . . + . . . . EISA +
. . A1 . A0 . . . . ILV
. . 1GB . 1GB . . . . 2GB
Compaq AlphaServer GS140 8-6/525, Console V5.4 15-MAR-99 10:07:33
SROM V1.1, OpenVMS PALcode V1.48-3, Tru64 UNIX PALcode V1.45-3
System Serial = , OS = UNIX, 12:58:49 March 15, 1999
Configuring I/O adapters...
isp0, slot 0, bus 0, hose0
isp1, slot 1, bus 0, hose0
tulip0, slot 2, bus 0, hose0
isp2, slot 4, bus 0, hose0
isp3, slot 5, bus 0, hose0
tulip1, slot 6, bus 0, hose0
demna0, slot 1, bus 0, xmi0
kzmsa0, slot 2, bus 0, xmi0
kzmsa2, slot 5, bus 0, xmi0
kzpsa0, slot 3, bus 0, hose5
tulip2, slot 8, bus 0, hose5
tulip3, slot 9, bus 0, hose5
pfi0, slot 11, bus 0, hose5
tulip4, slot 12, bus 0, hose7
floppy0, slot 0, bus 1, hose7
kzpsa1, slot 4, bus 0, hose7
tulip5, slot 4, bus 2, hose7
tulip6, slot 5, bus 2, hose7
tulip7, slot 6, bus 2, hose7
tulip8, slot 7, bus 2, hose7
pfi1, slot 6, bus 0, hose7
pfi2, slot 8, bus 0, hose7
kzpsa2, slot 9, bus 0, hose7
P00>>>
The line ending with NODE # indicates the slot number (referred to later in the configuration process). Your system provides up to nine slots, each of which is labeled with its slot number. The next line (ending with TYP) indicates the type of module in each slot. Record the type of module in each slot:
P = CPU (dual processor CPU module)
M = MEM (memory module)
A = IOP (IO port module)

  8   7   6   5   4   3   2   1   0
+---+---+---+---+---+---+---+---+---+
|   |   |   |   |   |   |   |   |   |
|   |   |   |   |   |   |   |   |   |
+---+---+---+---+---+---+---+---+---+
Divide your system into logical partitions by assigning slots (and therefore modules) to each partition. Each partition must be assigned at least one dual CPU module, one MEM module, and one IOP module. With a total of nine slots, the AlphaServer GS140 can be configured for a maximum of three partitions.
Note
Each
CPU module includes two processors, both of which must be assigned to the same partition.
If your system meets the minimum requirements, proceed to the next section. Otherwise, you need to take corrective action (such as installing additional hardware), then proceed to the next section.
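The minimum-requirements check can be sketched in a few lines of Python. This is only an illustration: the function name and the one-letter-per-slot string are assumptions based on the P/M/A legend above, and the check reflects the rule that each partition needs at least one module of each type.

```python
def max_partitions(typ_line: str) -> int:
    """Return how many partitions the recorded module types can support.

    typ_line holds one letter per slot, as recorded from the TYP line:
    P = CPU module, M = memory module, A = IOP module. Any other
    character (for example '.') is treated as an empty slot.
    """
    cpus = typ_line.count("P")
    mems = typ_line.count("M")
    iops = typ_line.count("A")
    # Each partition needs at least one module of every type.
    return min(cpus, mems, iops)


# Hypothetical nine-slot layout, listed from slot 8 down to slot 0.
print(max_partitions("AAM.MPPPP"))
```

A result of 0 or 1 means the system cannot be divided into multiple partitions without additional hardware.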
6.3.3.2 Verifying the Firmware Revision Level
Logical partitions require console firmware support. See the Release Notes for changes to the minimum revision. To verify that your system's firmware includes support for logical partitions, use the following command at the primary console to display the firmware revision level:
P00>>>show version
The console displays a message similar to the following:
version V5.4, 15-MAR-1999 10:07:33
Verify that the revision
of your firmware is Version 5.4 or later.
If you need to upgrade your system's
firmware, see the firmware upgrade instructions in the hardware documentation.
The firmware CD-ROM is shipped with the software kit, or you can download
the firmware from the World Wide Web or using
ftp.
The
information on finding and updating the firmware is in the
Installation Guide.
6.3.3.3 Configuring Logical Partitions
You configure and enable (or disable) logical partitions using a set of console environment variables (EVs). Two console EVs take the form of hexadecimal numbers, which are bit masks in which a bit position in the mask corresponds to a module or processor number. Hardware configuration rules require modules to be installed in specific slot numbers, based on the module type, according to the following criteria:
IO
port (IOP) modules are installed in
slots 8, 7, and 6 in descending order with a maximum of three
IOP
modules allowed.
CPU
(dual processor) modules are installed
in slots
0
through
N
in ascending order
(N
depends on the number of
CPU
modules
installed).
The value of
N
is limited by the number of
IOP
and
MEM
modules.
MEM
(memory) modules are installed in any
available slot between the highest numbered
CPU
module
and the lowest numbered
IOP
module.
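The slot placement rules above can be checked mechanically. The following Python sketch is an illustration only; the function name and the dictionary representation of the slots are assumptions, not part of the console firmware or the operating system.

```python
from typing import Dict, List


def check_layout(slots: Dict[int, str]) -> List[str]:
    """Return a list of rule violations for a slot-to-module map.

    slots maps a slot number (0-8) to 'P' (CPU), 'M' (MEM), or 'A' (IOP).
    """
    problems = []
    cpu = sorted(s for s, t in slots.items() if t == "P")
    mem = sorted(s for s, t in slots.items() if t == "M")
    iop = sorted((s for s, t in slots.items() if t == "A"), reverse=True)

    # IOP modules fill slots 8, 7, 6 in descending order (three at most).
    if iop != [8, 7, 6][:len(iop)]:
        problems.append("IOP modules must occupy slots 8, 7, 6 in that order")
    # CPU modules fill slots 0..N in ascending order, with no gaps.
    if cpu != list(range(len(cpu))):
        problems.append("CPU modules must occupy slots 0..N without gaps")
    # MEM modules sit between the highest CPU slot and the lowest IOP slot.
    low = cpu[-1] if cpu else -1
    high = iop[-1] if iop else 9
    if any(not (low < s < high) for s in mem):
        problems.append("MEM modules must sit between the CPU and IOP slots")
    return problems


# Hypothetical layout: four CPU modules, two MEM modules, two IOP modules.
sample = {0: "P", 1: "P", 2: "P", 3: "P", 4: "M", 6: "M", 7: "A", 8: "A"}
print(check_layout(sample))
```

An empty list means that the layout satisfies all three placement rules.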
Set the processor mask variable (lp_cpu_mask)
by left-shifting the number 3 by two times the slot number of the
CPU
module.
Possible
CPU
masks (in hexadecimal) for each slot are:
Processors 00 and 01 (slot 0): 3 << (2 * 0) = 003
Processors 02 and 03 (slot 1): 3 << (2 * 1) = 00c
Processors 04 and 05 (slot 2): 3 << (2 * 2) = 030
Processors 06 and 07 (slot 3): 3 << (2 * 3) = 0c0
Processors 08 and 09 (slot 4): 3 << (2 * 4) = 300
Processors 10 and 11 (slot 5): 3 << (2 * 5) = c00
Calculate the value of the
lp_cpu_mask
variable by
combining (using a logical OR operation) the masks for individual
CPU
module slots.
For example, to assign the four processors on
the
CPU
modules in slot 0 and 1 to partition 0, you assign
the
lp_cpu_mask0
variable a value of
00f.
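The processor mask arithmetic can be reproduced with a short Python sketch. The function name is hypothetical; the calculation follows the rule stated above (the value 3 shifted left by twice the slot number, with the per-slot masks combined by a logical OR).

```python
def cpu_mask(cpu_slots):
    """Combine the two-bit processor masks for the given CPU module slots."""
    mask = 0
    for slot in cpu_slots:
        # Each dual-processor module owns two adjacent bits.
        mask |= 3 << (2 * slot)
    return mask


# CPU modules in slots 0 and 1, as in the example above.
print(format(cpu_mask([0, 1]), "03x"))  # prints 00f
```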
Set the I/O port mask variable (lp_io_mask) by left
shifting the number 1 by the slot number of the IOP module.
Potential IOP
masks for each slot are:
IO Port module in slot 8: 1 << 8 = 100
IO Port module in slot 7: 1 << 7 = 080
IO Port module in slot 6: 1 << 6 = 040
If a partition consists of two IOP modules, create the value of the
lp_io_mask
variable by combining (using a logical OR operation)
the masks for individual IOP module slots.
For example, if you assign IOP
modules in slots 7 and 8 to partition 1, the value of the
lp_io_mask1
variable is 180.
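The I/O port mask can be computed the same way. In this Python sketch the function name is an assumption; the arithmetic is the one-bit shift by slot number described above.

```python
def io_mask(iop_slots):
    """Combine the one-bit masks for the given IOP module slots."""
    mask = 0
    for slot in iop_slots:
        mask |= 1 << slot
    return mask


# IOP modules in slots 7 and 8, as in the example above.
print(format(io_mask([7, 8]), "03x"))  # prints 180
```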
When assigning IOP modules to secondary partitions, it is important
to remember that one of the IOPs assigned to the partition must be connected
to a DWLPB option with a KFE72 option installed.
The KFE72 option provides
the console serial port for secondary partitions.
6.3.3.4 Determining and Setting Environment Variables
To create the console environment variables for your logical partitions,
first determine the number of partitions and which slots (that is,
CPU,
MEM, and
IOP
modules)
are assigned to each partition (using the module types and slot numbers you
recorded previously).
Then, you can create the console EVs.
A summary of console EVs and values follows:
| Console EV | Value |
| lp_count | Number of partitions |
| lp_cpu_maskN | CPU assignment mask for partition N |
| lp_io_maskN | IOP module assignment mask for partition N |
| lp_mem_mode | Memory isolation mode |
The following table shows a sample configuration of two partitions based on the configuration information in Section 6.3.3.3, with the following modules:
4
CPU
modules (in slots 0 through 3)
2
MEM
modules (in slots 4 and 6)
2
IOP
modules (in slots 7 and 8)
| Partition | Modules |
| Partition 0 | CPU modules in slots 0 and 1 (CPU 0-3, mask = 00F) |
| | IOP module in slot 8 (I/O Port, mask = 100) |
| | MEM module in slot 6 (2GB memory) |
| Partition 1 | CPU modules in slots 2 and 3 (CPU 4-7, mask = 0F0) |
| | IOP module in slot 7 (I/O Port, mask = 080) |
| | MEM module in slot 4 (1GB memory) |
There is no console EV mask for memory. The console firmware assigns memory modules to partitions. The firmware attempts to balance the amount of memory assigned to each partition.
To create or change the EVs, execute the following commands at the console prompt. The values used are for the two-partition example described at the start of this section. The actual values you enter depend on your hardware configuration and your partition layout.
Initially, set the value of the
lp_count
EV to zero. You change this value
later, when you initialize the partitions.
The following command displays the console EVs if you have created them. No output appears if the console EVs do not exist.
P00>>>show lp*
If the console EVs do not exist (were not previously created) use the following commands to create the EVs.
There is a 10-second delay after you issue each command, and the console echoes the value of each EV after you create it.
P00>>>create -nv lp_count 0
P00>>>create -nv lp_cpu_mask0 f
P00>>>create -nv lp_cpu_mask1 f0
P00>>>create -nv lp_io_mask0 100
P00>>>create -nv lp_io_mask1 80
P00>>>create -nv lp_mem_mode isolate
If the console EVs already exist (previously created), use these commands to set their values:
P00>>>set lp_count 0
P00>>>set lp_cpu_mask0 f
P00>>>set lp_cpu_mask1 f0
P00>>>set lp_io_mask0 100
P00>>>set lp_io_mask1 80
P00>>>set lp_mem_mode isolate
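If you plan several layouts, it can help to generate the console command lines from the slot assignments rather than compute each mask by hand. The following Python sketch is an illustration only; the function name and the input format are assumptions, and the generated lines simply follow the create and set command forms shown in this section.

```python
def console_commands(partitions, create=False):
    """Build console command lines for a list of partitions.

    partitions is a list of (cpu_slots, iop_slots) tuples; the list
    index is the partition number. Set create=True to emit
    'create -nv' commands instead of 'set' commands.
    """
    verb = "create -nv" if create else "set"
    cmds = [f"P00>>>{verb} lp_count 0"]
    cpu_masks, io_masks = [], []
    for cpu_slots, iop_slots in partitions:
        cpu = 0
        for slot in cpu_slots:
            cpu |= 3 << (2 * slot)   # two bits per CPU module
        io = 0
        for slot in iop_slots:
            io |= 1 << slot          # one bit per IOP module
        cpu_masks.append(cpu)
        io_masks.append(io)
    cmds += [f"P00>>>{verb} lp_cpu_mask{n} {m:x}" for n, m in enumerate(cpu_masks)]
    cmds += [f"P00>>>{verb} lp_io_mask{n} {m:x}" for n, m in enumerate(io_masks)]
    cmds.append(f"P00>>>{verb} lp_mem_mode isolate")
    return cmds


# The two-partition example: CPU slots 0-1 with IOP slot 8, CPU slots 2-3 with IOP slot 7.
for line in console_commands([([0, 1], [8]), ([2, 3], [7])]):
    print(line)
```

The output is a list of lines to type at the primary console; the sketch does not communicate with the console itself.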
Use the information in the following two sections to display (and if
necessary correct) the console EV settings.
6.3.3.5 Displaying Console Environment Variables
Display the value of a console EV on the console of any partition by
using the
show
command.
For example, to display the value
of
lp_count
enter the following:
P00>>>show lp_count
To display all the partitioning EVs, enter the following:
P00>>>show lp*
If the console
EVs are correct, ignore the next section and proceed to
Section 6.3.3.7.
Otherwise, continue with
Section 6.3.3.6
and make any necessary
corrections.
6.3.3.6 Correcting Console Environment Variables
Note
You must set console EVs with
lp_ prepended to the EV name by using only the console of the primary partition (partition 0). You must not change the value of these variables on any secondary partition.
Use the
set
command to change the value of any or
all the console EVs.
For example, to change all the EVs, enter the following:
P00>>>set lp_count 0
P00>>>set lp_cpu_mask0 f
P00>>>set lp_cpu_mask1 f0
P00>>>set lp_io_mask0 100
P00>>>set lp_io_mask1 80
P00>>>set lp_mem_mode isolate
6.3.3.7 Disabling Automatic Boot Reset
The
Installation Guide
recommends setting the
boot_reset
console environment variable to ON.
This setting is not compatible with logical
partitions for which the
boot_reset
console EV must be
set to
OFF.
This is required so booting a partition does
not interfere with the operation of other (previously booted) partitions.
If the
boot_reset
console EV is set to
ON,
then a system-wide reset happens after you execute the boot command (P00>>>boot).
This reset immediately terminates the operation of
all partitions.
Execute the following command to disable the
boot_reset
console EV:
P00>>>set boot_reset off
6.3.3.8 Setting Memory Interleave Mode
Set the value of the
interleave
console EV to
none:
P00>>>set interleave none
P00>>>init
When
setting the interleave mode to none, the console might echo a series of Y
characters to the console display screen.
(There might be several lines of
Y characters.) Ignore this output.
6.3.3.9 Setting the Operating System Type to UNIX
Set the value of the
os_type
console EV to UNIX:
P00>>>set os_type UNIX
6.3.3.10 Setting the auto_action Console Environment Variable
To halt the processor after a POWER-ON or RESET (using the RESET switch), use the following command:
P00>>>set auto_action halt
To automatically boot the operating system after a POWER-ON or RESET, use the following command:
P00>>>set auto_action boot
Before installing Tru64 UNIX to partitions, you must initialize
the partitions.
This operation assigns hardware resources (CPU,
IOP, and
MEM
modules) to each partition and spawns
a console for each secondary partition.
Use the following procedure:
Set the value of the
lp_count
EV to the
number of partitions.
For example, to enable two partitions:
P00>>>set lp_count 2
Initialize partition 0:
P00>>>init
Configuration information
(as previously described) is displayed on the primary console screen, followed
by the console prompt,
P00>>>.
Initialize all secondary partitions.
P00>>>lpinit
A series of partition configuration messages is displayed on the primary console, including the starting address of physical memory for each partition. Record these addresses so you can determine whether a kernel rebuild is needed in the event of a memory configuration change.
The following example shows a typical configuration display:
Partition 0: Primary CPU = 0
Partition 1: Primary CPU = 4
Partition 0: Memory Base = 000000000 Size = 080000000
Partition 1: Memory Base = 080000000 Size = 040000000
No Shared Memory
LP Configuration Tree = 128000
starting cpu 4 in partition 1 at address 040010001
starting cpu 5 in partition 1 at address 040010001
starting cpu 6 in partition 1 at address 040010001
starting cpu 7 in partition 1 at address 040010001
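If you capture the console output to a file, the memory base addresses can be extracted for your records. The following Python sketch is an illustration only; the function name is an assumption, and the pattern matches only the "Memory Base" line format shown in the example display.

```python
import re

# Matches lines such as:
#   Partition 1: Memory Base = 080000000 Size = 040000000
MEMORY_LINE = re.compile(
    r"Partition (\d+): Memory Base = ([0-9A-Fa-f]+)\s+Size = ([0-9A-Fa-f]+)"
)


def memory_bases(console_text):
    """Map partition number to (base address, size), both as integers."""
    return {
        int(part): (int(base, 16), int(size, 16))
        for part, base, size in MEMORY_LINE.findall(console_text)
    }


captured = """\
Partition 0: Memory Base = 000000000 Size = 080000000
Partition 1: Memory Base = 080000000 Size = 040000000
"""
for part, (base, size) in sorted(memory_bases(captured).items()):
    print(f"partition {part}: base {base:#x}, size {size:#x}")
```

Comparing the recorded base of each secondary partition with the value displayed after a later memory change shows whether that partition's starting address moved.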
For each secondary partition configured, information is displayed
on the secondary console screens, followed by a console prompt such as
P04>>>.
There is a 20-second delay after you enter the
lpinit
command before the secondary consoles display their configuration
information.
6.3.5 Correcting Interleave Mode Errors
If the
interleave
EV is incorrectly set, the console
displays the following error message:
Insufficient memory interleave sets to partition system.
Issue command "set interleave none" then reset system.
To recover from this error, enter the following commands:
P00>>>set interleave none
P00>>>set lp_count 0
P00>>>init
Then repeat the steps
in this section.
6.3.6 Installing the Operating System
After the partitions are configured and initialized, you can install the operating system to each partition. Install the operating system by following the instructions in the Installation Guide.
AlphaServer GS140 systems ship with Tru64 UNIX preinstalled on
one of the disks.
You can use this disk as the root disk for one of the partitions
(usually partition 0).
To use the preinstalled disk, boot it and follow the
instructions for completing the installation.
By default, the
bootdef_dev
console EV is set to automatically boot the preinstalled disk.
If it is not, use the
bootdef_dev
value you recorded in
Section 6.3.3.1.
Note
Depending on how you assigned IOP modules, the name of the factory installed software (FIS) disk might change and might not be assigned to partition 0. You can use the following command in each partition to locate the disk:
P##>>> show device
The operating system can also be installed from a CD-ROM or over the
network from a Remote Installation Server (RIS).
Configuring a CD-ROM drive
on all partitions might not always be practical and a RIS server might not
be available.
An alternative (assuming a local network is available) is to
install the operating system to one partition from a CD-ROM, then configure
that partition as a RIS server for the other partitions.
Refer to the
Sharing Software on a Local Area Network
manual for instructions on setting up a Remote Installation Server.
6.3.7 Managing a Partitioned System
The operating system running in each partition can be managed as if
it were running on a system that is not partitioned.
However, there are some
AlphaServer GS140-specific operational characteristics that you must be aware
of and take into account when managing a partitioned system.
These topics
are documented in the following sections.
6.3.7.1 Operational Characteristics
During the course of normal partitioned system operations you might
need to repeat some of the configuration and initialization tasks.
Some of
these tasks require special precautions to prevent interference between partitions.
The following sections describe these tasks.
6.3.7.1.1 Console init command (P##>>>init)
Typing the
init
command at the console prompt in
any partition reinitializes the entire system.
This immediately terminates
the operating system on all partitions.
Therefore, do not execute the
init
command unless you need to reinitialize the entire system.
When you execute the
init
command, the console prompts
you to confirm that you actually want to reset all partitions.
Answer
no
to abort the
init
command or
yes
to continue with the
init
command.
6.3.7.1.2 Shutting Down or Rebooting the Operating System
To shut down the operating system running in a partition and return
to console mode (P##>>>
prompt), use the
shutdown
command.
For example:
# /usr/sbin/shutdown -h +5 "Shutting down the OS"
The
shutdown
command can also shut down and reboot the operating system.
For example:
# /usr/sbin/shutdown -r +5 "Rebooting the OS"
6.3.7.2 Recovering an Interrupted Operating System Boot
An incomplete or interrupted operating system boot might leave the console boot drivers in an inconsistent state. In this case, the console displays the following message:
Inconsistent boot driver state.
System is configured with multiple partitions.
A complete INIT must be performed before rebooting.
Use the following procedure to recover from this condition:
Shut down the operating system in all running partitions.
Execute the following commands on the primary console:
P00>>>set lp_count 0
P00>>>init
P00>>>set lp_count N
(where
N
is the number of partitions)
P00>>>init
P00>>>lpinit
Boot the operating system in each partition. For example:
P00>>>boot
P04>>>boot
Under normal operating conditions, it is not necessary to manually halt processors. The processor halts and enters console mode after you shut down the operating system. You must manually halt the processor if the operating system hangs for some reason (for example, while debugging a loadable device driver).
Note
In the unlikely event that the processor cannot be halted, you must reset the system by momentarily setting the four-way OFF/ENABLE switch to the RESET position, then releasing it.
The following procedures work only if the Power OFF/ENABLE switch is
in the ENABLE position.
Primary Partition
Pressing
[Ctrl/p]
on the primary console terminal forces
the primary processor to enter console mode and display the
P##>>>
prompt.
You can use the
stop
N
command (where
N
is a processor number) to stop secondary
processors (though this is not normally necessary).
See
Section 6.3.2.1
for definitions of the console prompt and the
stop
command.
Secondary Partitions
Secondary partitions do not halt in response to a [Ctrl/p] command on the secondary console terminal. To force a secondary partition to enter console mode, use the following procedure:
Shut down the operating system on the primary partition:
# /usr/sbin/shutdown -h +5 "Shutting down the OS"
Stop the primary processor of the secondary partition.
P00>>>stop N
Where N is the CPU number of the primary processor of the secondary partition (normally the lowest numbered CPU assigned to the secondary partition). For example:
P00>>>stop 4
6.3.7.4 Power OFF/ENABLE Switch Position
During normal system operation, the Power OFF/ENABLE switch is set to
the SECURE position.
This prevents you from accidentally halting the processor
with
[Ctrl/p].
6.3.7.5 Reconfiguring Partitions by Changing Console EVs
The console EVs that control logical partitions (names begin with
lp_) must not be changed on any secondary partition.
You can change
these console EVs only by shutting down all partitions and setting new values
on the primary partition's console terminal.
After you have determined the layout of the new partition, follow these steps to reconfigure your partitions:
Shut down the operating system in each partition:
# /usr/sbin/shutdown -h +5 "Shutting down to reconfigure partitions"
Disable partitions and reset the system:
P00>>>set lp_count 0
P00>>>init
Use the console
set
command to change the
value of any or all of the console EVs.
For the two-partition example discussed
in
Section 6.3.3.4, use the following commands:
P00>>>set lp_count 2
P00>>>set lp_cpu_mask0 f
P00>>>set lp_cpu_mask1 f0
P00>>>set lp_io_mask0 100
P00>>>set lp_io_mask1 80
P00>>>set lp_mem_mode isolate
Initialize the primary partition:
P00>>>init
Initialize all secondary partitions:
P00>>>lpinit
Boot the operating system in each partition using commands similar to the following:
P00>>>boot
P04>>>boot
6.3.7.6 Checking Other Console EVs Before Booting
Before booting the operating system in each partition, use the console
show
command to verify the correct state of the console EVs:
P0##>>>show boot_reset
The
boot_reset
EV must be
off.
P0##>>>show interleave
The
interleave
EV must
be
none.
P0##>>>show auto_action
The
auto_action
EV
can be set to
HALT
or
BOOT.
P0##>>>show os_type
The
os_type
EV is set to UNIX.
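The expected settings listed above can be summarized as a small lookup table. The following Python sketch is an illustration only: the function name is hypothetical, and the current values are assumed to have been copied by hand from the output of the console show commands.

```python
# Allowed values for each console EV, compared without regard to case.
REQUIRED = {
    "boot_reset": {"off"},
    "interleave": {"none"},
    "auto_action": {"halt", "boot"},
    "os_type": {"unix"},
}


def bad_settings(current):
    """Return the names of EVs that are missing or have a disallowed value."""
    return sorted(
        name
        for name, allowed in REQUIRED.items()
        if current.get(name, "").lower() not in allowed
    )


# Values as recorded from the console of one partition (hypothetical).
recorded = {"boot_reset": "ON", "interleave": "none", "auto_action": "HALT"}
print(bad_settings(recorded))
```

Any name in the returned list identifies an EV to correct before you boot the partition.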
6.3.7.7 Logical Partitioning Informational Messages at Boot Time
If you configure and enable logical partitions, the operating system displays informational messages for each partition. These messages appear on the console terminal during the early stages of the bootstrap process. The following example shows typical messages for a two partition system:
Partition 0
-----------
LP_INFO: 2 partition(s) established via lp_count
LP_INFO: primary processor for partition 0 is CPU 0
LP_INFO: partition 0 CPU allocation mask = 0xf
LP_INFO: partition 0 IOP allocation mask = 0x100
LP_INFO: Memory partitioning mode set to isolate
LP_INFO: partition 0 memory starting address = 0x0

Partition 1
-----------
LP_INFO: 2 partition(s) established via lp_count
LP_INFO: primary processor for partition 1 is CPU 4
LP_INFO: partition 1 CPU allocation mask = 0xf0
LP_INFO: partition 1 IOP allocation mask = 0x80
LP_INFO: Memory partitioning mode set to isolate
LP_INFO: partition 1 memory starting address = 0x80000000
These messages provide the following information:
Number of active partitions
Number of the primary processor for the current partition
Which processors are allocated to the current partition
Which I/O port modules are allocated to the current partition
Memory partitioning mode (which is always set to
isolate)
Starting address of memory for the current partition
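To read an allocation mask from these messages, list the bit positions that are set. In a CPU allocation mask each set bit is a processor number; in an IOP allocation mask each set bit is a slot number. The following Python sketch uses a hypothetical helper name for illustration.

```python
def bits_set(mask):
    """Return the positions of the bits that are set in mask, lowest first."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


print(bits_set(0xF0))   # CPU allocation mask for partition 1: processors 4 to 7
print(bits_set(0x80))   # IOP allocation mask for partition 1: slot 7
```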
6.3.8 Hardware Management and Maintenance
For the AlphaServer GS140, partitions share a common physical enclosure and hardware (such as power supplies, system bus, and control panel power switch). You cannot perform the following hardware management and maintenance tasks on individual partitions. Instead, you must disable partitions and reset the system to an unpartitioned state.
Tasks that require a complete system reinitialization are:
Performing corrective or preventive maintenance on system hardware.
Installing AlphaServer GS140 firmware upgrades, including I/O controller firmware upgrades.
Adding or removing system hardware components (CPUs, memory, IOPs, PCI buses, I/O controllers, and I/O devices [except for hot swappable disks]).
Changing any partition's hardware resource assignments by
modifying any console EV with
lp_
prepended to its name.
Running the EISA Configuration Utility (ECU) or the RAID Configuration Utility (RCU) from the floppy disk drive.
6.3.8.1 Obtaining Technical Support
If you need to escalate a problem to your technical support organization, it is important that you tell the customer services representative that the system is partitioned (particularly if the service operation uses remote diagnosis). When you place the service call, state that your system is using logical partitions.
The logical partitioning software provides two methods for the customer
services representative to determine whether or not a system is partitioned.
The
LP_INFO
messages printed during operating system startup
are also entered into the binary error log as part of the Startup ASCII Message.
You can run the
sizer -P
command on any instance of the
operating system to display the partitioning status of the system:
# sizer -P
Host hostname is instance 1 of 2 partitions.
Physical memory starts at address 0x80000000.
Memory mode is isolate.
Processors assigned to instance 1: 4 5 6 7
IO Port (s) assigned to instance 1: slot 7
If the system is not partitioned, the following message is displayed, where hostname is the name of the system:
Host hostname is not partitioned.
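A script that collects support information might classify saved sizer -P output as follows. This Python sketch is an illustration only; the function name is an assumption, and it recognizes only the two message forms shown above.

```python
import re


def partition_status(sizer_output):
    """Return (instance, partition_count), or None if not partitioned."""
    match = re.search(r"is instance (\d+) of (\d+) partitions", sizer_output)
    if match:
        return int(match.group(1)), int(match.group(2))
    if "is not partitioned" in sizer_output:
        return None
    raise ValueError("unrecognized sizer -P output")


print(partition_status("Host hostname is instance 1 of 2 partitions."))
print(partition_status("Host hostname is not partitioned."))
```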
6.3.8.2 Performing Hardware Management and Maintenance Tasks
Before performing any management or maintenance tasks, you must terminate operation of all partitions and return the system to an unpartitioned state. Use the following steps to shut down partitions:
Shut down the operating system in each partition:
# /usr/sbin/shutdown -h +5 "Shutting down for maintenance"
Disable partitions by executing the following command at the primary console terminal:
P00>>>set lp_count 0
Set the
auto_action
console EV for the
primary partition to HALT:
P00>>>set auto_action halt
You might need to reset the
auto_action
EV in step 1 of the next procedure, initializing and
rebooting the partitions.
Reinitialize the system by typing this command on the primary console terminal.
P00>>>init
When the system returns to the
P00>>>
prompt you
can perform system management and maintenance tasks.
After completing system
management and maintenance tasks, use the following procedure to reinitialize
and reboot your partitions:
Verify the console EVs are set to the correct values:
P00>>>show lp*
P00>>>show boot_reset
P00>>>show interleave
P00>>>show auto_action
The
boot_reset
EV is set to
off, the
interleave
EV is set to
none, and the
auto_action
EV is set to either
HALT
or
BOOT.
Set the
lp_count
EV to the correct number
of partitions.
For example:
P00>>>set lp_count 2
Initialize the primary partition:
P00>>>init
Initialize all secondary partitions:
P00>>>lpinit
Boot the operating system on each partition. If you changed the system's hardware configuration or reassigned any hardware resources to a different partition, a kernel rebuild might be required. Use the procedure in Section 6.3.9 to determine if you need to rebuild the kernel for any partition.
If you do not require a kernel rebuild, boot the operating system:
P##>>>boot
Where
##
is the CPU number of the partition's primary processor.
6.3.9 Hardware Changes Requiring a UNIX Kernel Rebuild
If you change your system's hardware configuration you might need to rebuild the kernel. The following table defines the hardware configuration changes that require a rebuilt kernel:
| Change | Requirements |
Processors - adding, removing, or reassigning CPU modules. |
Changing the
|
I/O Processors - adding, removing, or reassigning IOP modules. |
Rebuild the kernel if you added or
removed an IOP module.
You need to rebuild the kernel only for the changed partition.
Moving an IOP module across partitions requires a kernel rebuild on both partitions.
The
|
Adding or removing I/O buses and I/O controllers requires a kernel rebuild for the affected partition. |
|
Memory Modules - changing the memory module configuration. |
For the primary partition (partition 0), changes to the memory module configuration do not require a kernel rebuild. |
The kernel for any secondary partition must be built to run at a specific memory address (that is, the physical memory starting address for the partition). Certain types of memory reconfiguration change this address and require a kernel rebuild. A partition's memory starting address changes if the memory size for any lower numbered partition increases or decreases. |
|
For example, if you replaced a 2GB memory module in partition zero with a 4GB module, the memory starting address of partition one increases by 2GB. In this example you must rebuild the kernel. |
|
Rebuild the kernel if a secondary partition's kernel fails to boot after a memory module configuration change. |
|
The memory starting address for each
partition is displayed at the primary console after each iteration of the
|
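The effect of a memory change on starting addresses can be estimated before you make the change. The following Python sketch is an illustration only: it assumes that the firmware assigns memory contiguously in partition order, starting at address zero, which is consistent with the example displays in this chapter but is not guaranteed. The function names are hypothetical.

```python
GB = 1 << 30


def start_addresses(sizes):
    """Return the assumed starting address of each partition's memory.

    sizes lists the memory size in bytes of each partition, in
    partition order.
    """
    addresses, base = [], 0
    for size in sizes:
        addresses.append(base)
        base += size
    return addresses


def needs_rebuild(old_sizes, new_sizes):
    """Return the secondary partitions whose starting address changes."""
    old = start_addresses(old_sizes)
    new = start_addresses(new_sizes)
    # Partition 0 is skipped: the table above exempts the primary partition.
    return [p for p in range(1, min(len(old), len(new))) if old[p] != new[p]]


# Replacing the 2GB module in partition 0 with a 4GB module.
print(needs_rebuild([2 * GB, 1 * GB], [4 * GB, 1 * GB]))
```

Treat the result as a hint only, and confirm the actual starting addresses on the primary console.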
6.3.9.1 How to Rebuild the UNIX Kernel for a Partition
The following steps describe how you rebuild the kernel, which is a
special case of the typical kernel build instructions documented in the
System Administration
manual.
This procedure assumes that you initialized partitions as described
in
Section 6.3.4
and the partition requiring a kernel
rebuild is halted at the
P##>>>
console prompt.
Refer to
the kernel configuration information in the
System Administration
manual for information
on:
Kernel booting and the single-user mode prompt
Saving and copying kernels
Boot the generic kernel to single-user mode:
P##>>>boot -fl s -fi genvmunix
Check and mount file systems:
# /sbin/bcheckrc
Refer to the System Administration manual for more information on mounting file systems.
Set the host name (system name) for this partition:
# hostname NAME
Rebuild the kernel:
# doconfig
Note
You must not use
doconfig with the -c option to rebuild the kernel.
Save the current kernel:
# cp /vmunix /vmunix.save
Install the new kernel, where SYSNAME is the local host name:
# cp /sys/SYSNAME/vmunix /vmunix
Unmount the file systems:
# umount -a
Halt the operating system:
# sync
# sync
# halt
Boot the new kernel:
P##>>>boot
6.3.10 Handling Nonrecoverable Hardware Error Machine Checks
There are two main classes of hardware errors:
Recoverable errors are corrected by the hardware and reported to the operating system. The operating system logs recoverable errors in the binary error log and continues normal system operation.
Nonrecoverable hardware errors require immediate termination of normal system operation and some form of corrective action (such as a system reset).
Nonrecoverable hardware errors are reported to the operating system as a machine check. The operating system crashes with a panic message, such as the following:
panic (cpu 0): tlaser: \
MACHINE CHECK Non-recoverable hardware error
The
system then writes out a crash dump, and reboots or halts (depending on the
setting of the
auto_action
console EV, which can be BOOT
or HALT).
Some hardware errors require a complete system reset before the operating system can be rebooted.
For system-wide hardware faults, the operating system forces a system
reset after writing the crash dump.
After the reset is completed, if
auto_action
is set to BOOT, the console firmware automatically reinitializes
all partitions.
Boot the operating system in each partition, using the following
commands:
P00>>>boot
P##>>>boot
Otherwise,
the system halts and enters console mode (P00>>>
prompt).
If this occurs, enter the following commands to restart partitions and reboot
the operating system (where
N
is the number of partitions):
P00>>>set lp_count N
P00>>>init
P00>>>lpinit
P00>>>boot
For each secondary partition, enter the boot command:
P##>>>boot
For local hardware faults (contained within a partition), the operating system running in the affected partition unconditionally halts after writing the crash dump. This allows other partitions to continue operating until a shutdown can be scheduled. Restarting the affected partition requires a complete system reset, using the following procedure:
Shut down the operating system in each running partition:
# /usr/sbin/shutdown -h +5 "Shutting down for error recovery"
At the primary console terminal, enter the following commands:
P00>>>set lp_count 0
P00>>>init
The console displays the following prompt:
Do you really want to reset ALL partitions? (Y/<N>)
Type
Y
to perform the reset.
After the reset is complete, and if
auto_action
is set to BOOT, the console firmware automatically reinitializes
all partitions.
Boot the operating system in each partition, using the following commands:
P00>>>boot
P##>>>boot
Otherwise, enter the following commands (where N is the number of partitions):
P00>>>set lp_count N
P00>>>init
P00>>>lpinit
P00>>>boot
For each secondary partition enter the following:
P##>>>boot
If these recovery procedures fail to restore full system operation
for all partitions, reset the system manually by momentarily moving the OFF/ENABLE
switch to the RESET position, then releasing it.
Repeat the recovery procedure
after the reset completes.
If the failure persists, contact your technical
support organization.
6.3.11 Logical Partitioning Error Messages
If an error condition occurs (such as an invalid partition configuration) the partition's console terminal displays an error message. After displaying the error message, the primary processor for the current partition halts and returns to the console prompt. To recover from any of these errors, correct the logical partitioning console EVs and reboot the partition.
The following error messages might be displayed:
LP_ERROR: invalid partition count (lp_count = #, max nodes = #)
The lp_count console EV is set incorrectly. The value is less than zero or exceeds the maximum number of partitions supported for the AlphaServer GS140.
LP_ERROR: no CPUs for partition (check lp_cpu_mask)
The value of lp_cpu_mask# (# represents the current partition number) is set incorrectly. This partition has no allocated processors.
LP_ERROR: no IOP for partition (check lp_io_mask)
The value of lp_io_mask# (# represents the current partition number) is set incorrectly. This partition has no allocated I/O Port modules.
LP_ERROR: lp_count > 1, but partitions not initialized
Please execute 'lpinit' command at >>> prompt
The message indicates that partitions were configured, but not initialized.
LP_ERROR: must set lp_mem_mode [share or isolate]
The lp_mem_mode console EV is not set or is set incorrectly. For logical partitions, lp_mem_mode must be set to isolate.
Bootstrap address collision, image loading aborted
The kernel's link address does not match the memory starting address of the partition. Refer to Section 6.3.9 for instructions on how to recover from this error.
6.3.12 Understanding Console Firmware Error or Informational Messages
The console firmware implements several safety checks during certain events (such as system reset and partition startup). These checks help prevent cross-partition interference. The partition's console displays one of the following messages if an anomaly is detected:
Do you really want to reset ALL partitions? (Y/<N>)
This message is displayed after a system reset is requested, either by the operator issuing the init command or as a result of booting with the boot_reset console EV set to ON. This message warns you that continuing with the operation will terminate all partitions and reset the system. If a reset is necessary, shut down the operating system in all operational partitions before proceeding with the reset.
Auto-Starting secondary partitions...
This message indicates that the console firmware is initializing logical partitions (by running the lpinit command automatically). An auto-starting event occurs after a system reset (or power on). The console firmware boots the operating system in all partitions provided that:
The auto_action console EV is set to BOOT
You initiate the reset by using the RESET switch or by powering on the system, and not by using the init command
Insufficient memory interleave sets to partition system.
Issue command "set interleave none" then reset system.
This message indicates that the interleave console EV is incorrectly set. Change the setting to none.
Insufficient memory modules to partition system.
Each partition requires a dedicated memory module. Reduce the number of partitions or install a memory module for each partition. This message might also indicate that the lp_count console EV is set incorrectly. For example, you might have two partitions while lp_count is set to four. In this case, set lp_count to match the actual number of partitions.
Inconsistent boot driver state.
System is configured with multiple partitions.
A complete INIT must be performed before rebooting.
An incomplete or interrupted operating system boot caused the console boot drivers to enter an inconsistent state. Refer to Section 6.3.7.2 for instructions on recovering from this state.
Do you want to attempt to boot secondary partitions anyway? (Y/<N>).
This message indicates that the console detected an inconsistency in your partition setup (probably due to incorrect settings of the lp_ console EVs). Unless you are certain that it is safe to proceed, answer no (N) to this question and correct the inconsistency.
TIOP # not configured in any partition.
Non-existent TIOP # configured in a partition.
These messages (together or separately) indicate an incorrect setting of the lp_io_mask# console EV. The mask might be set to zero or to the wrong IOP module slot number. Correct the setting and retry the lpinit command.
Secondary partitions have already been started.
This message most likely indicates that you issued a second lpinit command after starting partitions. Before booting the operating system, check the values of the lp_ console EVs.
CPU # not configured in any partition.
No valid primary processor specified for partition #.
In these messages, the CPU number (#) might be a single CPU or a list of CPUs. These messages (together or separately) indicate an incorrect setting of the lp_cpu_mask# console EV. The mask might be set to zero or to incorrect CPU numbers. Correct the setting and retry the lpinit command.
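When checking a mask value, it can help to compute the value you expect before comparing it with the console setting. The following shell function is a minimal sketch, assuming that each mask is a bit mask in which bit n selects CPU (or IOP slot) number n. The function name slots_to_mask is hypothetical and is not part of the console firmware or the operating system.

```shell
#!/bin/sh
# Hypothetical helper: build a hexadecimal bit mask from a list of
# CPU (or IOP slot) numbers.
# Assumption: bit n of the mask selects unit number n.
slots_to_mask() {
    mask=0
    for n in "$@"; do
        # Set the bit that corresponds to unit n.
        mask=$(( mask | (1 << n) ))
    done
    printf '%x\n' "$mask"
}

# Example: CPUs 0-3 in one partition, CPUs 4-7 in another.
slots_to_mask 0 1 2 3    # prints f
slots_to_mask 4 5 6 7    # prints f0
```

A computed value of 0, or two partitions whose masks share a bit, would point to the kind of incorrect setting that these messages report.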
6.4 AlphaServer 1000 and 1000A Configuration Information
The following configuration restrictions are specific to AlphaServer 1000 and 1000A systems.
6.4.1 EISA Configuration Utility Version 1.10
This note applies to users of the embedded Cirrus VGA graphics controller.
The default setting for the VGA graphics controller when running the EISA Configuration Utility (ECU) Version 1.10 is Disabled. For previous versions, the default is Enabled.
When you run the ECU Version 1.10 for the first time on a system that was previously configured with an earlier version of the ECU, the setting for the embedded VGA graphics controller is automatically set to Disabled. To change the default value, run the ECU, select Step 3: View and edit details, and set the VGA graphics controller to Enabled before exiting. If you do not set the VGA graphics controller to Enabled before booting the operating system, your X server will not start and your system will have only generic console support.
6.4.2 Graphics Resolution
The default graphics resolution for AlphaServer 1000A systems that contain built-in Cirrus video with 1 MB of video RAM is 1024x768. If the optional 512 KB of video RAM is not present, the operating system supports resolutions of 640x480 (by default) or 800x600 only.
The default graphics resolution for AlphaServer 1000 systems that contain built-in Cirrus video with 512 KB of video RAM is 640x480. This configuration also supports 800x600 resolution.
To use 800x600 resolution, edit the following line in the /usr/lib/X11/xdm/Xservers file:
:0 local /usr/bin/X11/X
Change the line to:
:0 local /usr/bin/X11/X "-screen0 800x600"
To use 800x600 resolution for the CDE Session Manager, edit the following line in the /usr/dt/config/Xservers and Xservers.conf files:
:0 Local local@console /usr/bin/X11/X :0
Change the line to:
:0 Local local@console /usr/bin/X11/X :0 -screen0 800x600
Before editing these files for XDM or CDE, be sure that your system's monitor supports 800x600 resolution.
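The XDM edit can also be scripted. The following is a minimal sketch, assuming that the Xservers file contains the default entry exactly as shown for XDM. The function name set_xdm_800x600 and the use of a separate output file are illustrative choices, not part of the operating system.

```shell
#!/bin/sh
# Hypothetical helper: write a copy of an XDM Xservers file in which the
# default server entry carries the 800x600 option.
# Usage: set_xdm_800x600 source_file output_file
set_xdm_800x600() {
    src=$1
    dst=$2
    # Rewrite only a line that consists of exactly the default entry;
    # every other line is copied unchanged.
    sed 's|^:0 local /usr/bin/X11/X$|:0 local /usr/bin/X11/X "-screen0 800x600"|' "$src" > "$dst"
}
```

For example, you might run set_xdm_800x600 /usr/lib/X11/xdm/Xservers /tmp/Xservers.new, compare the two files, and copy the new file into place only after you confirm the change. Writing to a separate file keeps the original available if the monitor turns out not to support the new resolution.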
6.5 AlphaServer GS-series Configuration Information
The following configuration restrictions are specific to AlphaServer GS systems.
6.5.1 Possible OLAR Errors on Primary CPU
Starting with Version 5.1A of the operating system, the primary CPU can be taken off line and removed on AlphaServer GS80, GS160, and GS320 systems. When you take the primary CPU off line, another CPU is automatically delegated the role of primary CPU. Due to intermittent problems with this operation, do not take the primary CPU off line at this time. The primary CPU is normally CPU0. To verify which CPU is currently the primary, use the pset_info command:
# pset_info
number of processor sets on system = 1
pset_id   # cpus   # pids   # threads   load_av   created
      0        4       89         453      0.13   07/12/2001 17:25:28
total number of processors on system = 4
cpu #   running   primary_cpu   pset_id   assigned_to_pset
    0         1             1         0   07/12/2001 17:25:28
    1         1             0         0   07/12/2001 17:25:28
    2         1             0         0   07/12/2001 17:25:28
    3         1             0         0   07/12/2001 17:25:28
This output indicates that CPU0 is the primary CPU.
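To read the primary_cpu column without scanning the table by eye, you can filter the command output. The following is a minimal sketch, assuming that the per-CPU table starts with a header line beginning with cpu #, that each CPU occupies one line, and that primary_cpu is the third column, as in the sample output. The function name primary_cpu is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read pset_info output on standard input and print
# the number of each CPU whose primary_cpu column is 1.
# Assumed row layout: cpu #, running, primary_cpu, pset_id, date, time.
primary_cpu() {
    awk '
        # The per-CPU table begins after the "cpu #" header line.
        /^[ \t]*cpu #/ { in_table = 1; next }
        # Within the table, column 3 is the primary_cpu flag.
        in_table && NF >= 4 && $3 == 1 { print $1 }
    '
}
```

You would typically run it as pset_info | primary_cpu. If the layout of the pset_info output differs on your system, adjust the column number accordingly.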
When you attempt to take the primary CPU off line, the operation will likely succeed, and a new primary CPU (for example CPU1) is automatically selected. However, when subsequently attempting to bring the previously assigned primary CPU back on line, you may encounter the following error:
Processor X failed to start console callback PARTITION, POWER-HW timed_out.
wf_hal_pwr_ctl: call to prom_power failed with status [ffffffe6]
To clear this condition, you must initialize the system using the SRM console. This action requires shutting down the operating system. This problem will be fixed in a future release of the console firmware.
6.5.2 Do Not Repetitively Power Cycle CPUs
The use of OLAR management commands in continuous test loops can degrade the reliability of the CPU. These commands remove the DC power source from the CPU module. We recommend that you do not power cycle CPUs repetitively with shell scripts.
The CPU module's DC-to-DC converter is specified for a maximum of 1000 power cycles. Do not exceed this number of power cycles for a CPU.
6.5.3 Hot Add Restriction
This release of the operating system supports GS80, GS160, and GS320 CPU hot additions with the following restriction: A Quad Building Block (QBB) booted without memory and without at least one CPU cannot have CPUs hot added to that QBB. Doing so will result in a system panic. If the target QBB is booted with memory and at least one CPU, additional CPUs can be hot added as desired. (This restriction will be lifted in a future kernel update.)
6.6 Personal Workstation 433au, 500au, and 600au Systems
The following configuration restrictions are specific to Personal Workstation class systems.
6.6.1 64-Bit PCI Option Cards
The 64-bit PCI slots, slots 4 and 5, are intended only for those cards listed in the Systems and Options Catalog as supported for slots 4 and 5. The console prevents system operation and displays the following error if an unsupported card is present in one of these slots (n):
Illegal device detected on primary bus in physical slot n
Power down the system and remove the unsupported device from slot n
6.6.2 Incorrect Default Keyboard Mappings
If you use a PCXLA-NA keyboard on a Personal Workstation 433au, 500au, or 600au class system, the keys will not map properly unless you reconfigure the keyboard driver to use the correct keymaps.
You can do this by executing the following command:
# sysconfig -r gpc_input kbd_scancode=2
If you prefer, you can use the sysconfigdb command to add the following entry to the /etc/sysconfigtab file:
gpc_input:
        kbd_scancode = 2
If you use the sysconfig command to reconfigure the driver, you must execute the command each time you reboot the system. Using the sysconfigdb utility to make the change preserves the setting across reboots, and no further user intervention is required.
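One way to prepare the persistent entry is to write it to a stanza file first and then hand that file to sysconfigdb. The following is a minimal sketch. The function name write_kbd_stanza and the file location are illustrative assumptions, and the stanza layout (subsystem name followed by an indented attribute line) is assumed to match the format of /etc/sysconfigtab.

```shell
#!/bin/sh
# Hypothetical helper: write a stanza file that sets kbd_scancode for
# the gpc_input subsystem.
# Usage: write_kbd_stanza output_file
write_kbd_stanza() {
    stanza_file=$1
    # Subsystem name on the first line, attribute indented on the next.
    printf 'gpc_input:\n\tkbd_scancode = 2\n' > "$stanza_file"
}
```

After running, for example, write_kbd_stanza /tmp/gpc_input.stanza, you could merge the file into the database with a command such as sysconfigdb -m -f /tmp/gpc_input.stanza gpc_input. Check the sysconfigdb(8) reference page to confirm the options available on your system before relying on this form.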