By default, the Tru64 UNIX operating system is installed with support for United States English. However, the system administrator can choose to install one or more Worldwide Language Support (WLS) subsets, which provide the tools and features that allow you to work in languages other than English. Language subsets can also be added after the base operating system installation. WLS installation and the language subsets available for installation are described in the operating system Installation Guide.
This chapter explains how to perform the various setup tasks and use the software features of language environments other than English. The chapter assumes that you are familiar with the operating system in its default English-language environment.
1.1 Overview of Using Internationalized Software
To enable input and display in any language other than English, you must always set the locale in which your process runs. Locales for particular languages are installed as WLS subsets, and you can use the locale -a command to display the available locales. See Section 1.3 for information on locales and how to set or change locale settings.
If you have superuser privileges, you can use the Configure International Software utility from the SysMan Menu to set the default login locale and perform many other internationalization tasks. See Section 1.2 or the Configure International Software online help for information on using this utility.
Depending on the language, you may need to perform tasks in addition to setting a locale. This chapter describes how to perform the following tasks:
Select keyboard type (Section 1.4)
Define search paths for specialized data and executable files that are language specific (Section 1.5)
Apply printer-control characters, filters, and fonts that are appropriate for local language printers (Section 1.7)
Mail text in languages other than English (Section 1.8)
Display reference pages (Section 1.9)
Convert data files from one codeset to another (Section 1.10)
Display, edit, and print text in languages other than English (Section 1.11)
This chapter discusses these topics as they apply to particular languages or groups of languages. For complete information about using the internationalization features of applications that run in the Common Desktop Environment (CDE), see Chapter 3 and the Tru64 UNIX CDE Companion manual.
1.2 Configuring International Software
This chapter describes how to set locales, keyboard mappings, and other aspects of internationalization support using system utilities and commands. However, if you are a system manager or administrator with superuser privileges, you can use the Configure International Software utility to configure Worldwide Language Support on your system.
The Configure International Software utility is a menu-oriented function available from the SysMan Menu under the Software option. As superuser, a system manager or administrator can use Configure International Software to perform the following tasks:
Configure access to Worldwide Language Support tools and libraries for individual accounts or system-wide.
Configure Asian terminal driver support and merge that configuration into the system configuration file (/usr/sys/conf/). Using this task, the system administrator can do the following:
Activate Asian codeset options, including traditional Chinese Big-5, Telecode for Taiwan, simplified Chinese to traditional Chinese conversion, UTF-8, and Thai language support.
Add UNIX Terminal Extension (UTX) support options to the Asian terminal driver. The UTX support options include on-demand font loading (ODL), Katakana conversion, and Software Phrase Input Method (SIM).
Establish dynamic or static linking of the Asian terminal driver to the kernel.
Specify the number of UTX pseudodevices that will be created on the system.
Rebuild the kernel.
Configure Wnn, the character-cell input method for Japanese.
Remove installed country support subsets (locales). If you do not have superuser privilege, you can view, but not delete, installed subsets.
Remove installed fonts. If you do not have superuser privilege, you can view, but not delete, installed fonts.
Establish a default login language, switch between dense code and Unicode locales, and choose an input method for a locale that supports multiple input methods.
View installed keyboard map files. You do not have to be superuser to view installed keymaps.
1.3 Setting Locale and Language
Locales are the method whereby the operating system implements localization. A locale establishes information within a computer system that is specific to each supported language, cultural data, and coded character set (codeset) combination. A locale provides information on the following:
Repertoire of available characters
Language-specific sorting rules
Language-specific rules and symbols for monetary and numeric data, date, and time
Path for translated message files, application resource files, help files, or some combination of these
To view the locales installed on your system, use the locale -a command or use the Manage Locales option of the Configure International Software utility. For more information, see l10n_intro(5) and locale(4).
When Worldwide Language Support is installed on your system, two types of locales are installed for localization support: Unicode locales and dense code locales. Unicode locales conform to Unicode and ISO/IEC 10646 standards and use UTF-32 as the wide-character encoding. Unicode locales whose names end in UTF-8 use file and internal processing code defined in the standards. Other, non-UTF-8 Unicode locales use traditional UNIX and proprietary codesets for the file code and use UTF-32 for internal process code. A subset of these locales have a @ucs4 modifier; they are provided for backward compatibility and are the same as the locales without @ucs4. You cannot select @ucs4 locales from the CDE login menu; you must specify the locale name in the LANG environment variable.
Dense code locales use dense code for wide-character encoding to minimize table size.
The distinction between dense code and Unicode locales is of interest to programmers and is described in the Writing Software for the International Market manual. For users of internationalized software on Tru64 UNIX, dense code locales are functionally equivalent to Unicode locales and a Unicode locale exists for each dense code locale. However, not all Unicode locales have a dense code version.
The Unicode locales are installed in /usr/i18n/lib/nls/ucsloc/. Dense code locales are installed in /usr/i18n/lib/nls/loc. The active default is determined by the symbolic link, /usr/i18n/lib/nls/dloc. If you are superuser, you can switch between Unicode and dense code locales by changing the setting of the symbolic link, as described in l10n_intro(5).
To set a locale for system use, define the LANG environment variable as one of the installed locales. For example, under the C shell:
% setenv LANG en_US.ISO8859-1
This command sets the user environment to the values defined for United States English using the ISO8859-1 codeset. If a locale is not installed, internationalized applications assume the POSIX (C) locale, which supports only English.
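For Bourne and Korn shell users, the same setting might be sketched as follows. This is an illustrative sketch, not text from the system software: the pick_locale function name is hypothetical, and the sketch assumes that the locale -a command prints one installed locale name per line. If the requested locale is not listed, the sketch falls back to the POSIX (C) locale, which mirrors the behavior described above for uninstalled locales.

```sh
# pick_locale is a hypothetical helper, not a system command.
# It prints the requested locale name if `locale -a` lists it exactly,
# and prints C (the POSIX locale) otherwise.
pick_locale() {
    if locale -a 2>/dev/null | grep -F -x -- "$1" >/dev/null 2>&1; then
        printf '%s\n' "$1"
    else
        printf '%s\n' C
    fi
}

# Bourne and Korn shell equivalent of the C shell setenv command.
LANG=$(pick_locale en_US.ISO8859-1)
export LANG
```

Checking the locale -a output first makes the fallback explicit instead of leaving it to each application.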
See the discussion of internationalization in the System Administration manual and in the Command and Shell User's Guide for more detailed information on using locales and defining the associated variables for system and user setup. See i18n_intro(5) for more information about the LANG environment variable.
For graphical applications, you need to select a language to take advantage of the text translations and local language features available with the Common Desktop Environment (CDE) and other Motif applications. For Asian languages, the correct language selection is particularly important because it enables the following features:
Support for the appropriate input method in these applications
Entry of file names and other parameters that use ideographic characters
Cursor positioning on correct character and word boundaries
Line wrapping at correct word boundaries
See the CDE Companion manual for general information about setting language in CDE.
CDE assumes that all applications that run during a session operate in the language that was set at the start of the session. On Tru64 UNIX systems, you can work around this restriction with the following actions:
1. In a dtterm window, set the LANG or LC_ALL environment variable to the locale in which you want to run the new application. For example:
% setenv LANG ko_KR.deckorean
2. If the setting is for a Japanese, Chinese, or Korean locale, use the system command line to start the appropriate input method server before invoking the application. For example:
% /usr/bin/X11/dxhangulim &
See Section 2.1 for information about Asian input method servers.
3. In the same window as step 1, use the system command line to invoke the application you want to run in the new locale. For example:
% /usr/dt/bin/dtterm &
4. If you need to change your keyboard setting to work in the new locale, do so before starting to work in the new application's window. See Section 1.4 for information about setting keyboard type.
1.4 Selecting Keyboard Type
For English language input, a standard keyboard provides a sufficient number of keys (combined with shift states) to enter all uppercase and lowercase letters, numerals, and punctuation marks. For many other languages, the default keyboard does not provide enough keys and shift states to enter all characters.
Terminal users must use a localized keyboard or, if their keyboard includes a Compose key, use Compose-key sequences to enter non-English language characters from single-byte codesets. Some terminals also provide software emulation of a number of keyboard layouts for languages that are based on single-byte codesets. The user manual for each terminal explains how you can use its keyboard to enter non-English characters. Entry of multibyte characters in Asian languages requires special terminal hardware.
If the appropriate support files are installed on the system, workstation users can set the keyboard type to be appropriate for languages for which standard keyboard types exist. You must set the keyboard type for Western and Eastern European languages, Japanese, Thai, and Hebrew. However, you are not required to set the keyboard type for Chinese and Korean languages.
In CDE, use Keyboard Options (one of the desktop applications) to change your keyboard type. See the CDE Companion manual for more information about changing keyboard type.
From the system command line, use the dxkeyboard command to invoke Keyboard Options to choose a keyboard map and change the keyboard type.
Unlike the language setting, the keyboard setting is a global attribute that applies to all windows. Therefore, if you are working in windows that were created with different language settings, you may need to change the keyboard setting as you move from one window to another.
Keep in mind that no matter what setting you make using CDE applications, that setting does not change the setting that applies when you log on to the system. The keyboard setting when you log on to the system is always the system-default keyboard. See keyboard(5).
1.4.1 Determining Keyboard Layout
You can use an xkbprint command to access a keyboard layout for your current keyboard setting. For example, the following command accesses the layout and creates a PostScript file that you can print:
% /usr/bin/X11/xkbprint -label symbols -o mykeyboard.ps :0
See xkbprint(1X) for more information about the xkbprint command.
If you change your keyboard from the one whose characters are printed on the hardware keys, you need to know how characters are mapped to keys and whether any characters must be entered by using a mode-switch key or key sequence. For some languages, such as Czech, up to four different characters can be mapped to the same key. In such cases, you use the key defined as the mode switch to toggle among different sets of characters mapped to the same key.
You can use the dxkeycaps command to display and edit keyboard mappings of the keyboard attached to your workstation. The display shows the keyboard, with keycaps drawn according to the current server keymap. Using the mouse, you can bring up a menu of options, including the option to change the key symbol generated by a particular key. See dxkeycaps(1X).
Mode switching is a character entry mechanism that is different from Compose sequences. A particular keyboard setting may support Compose sequences (which require one key to be defined as a multikey), mode switching (which requires at least one key to be defined as a mode-switch key), both, or neither of these input mechanisms.
1.4.2 Entering the Euro Currency Symbol
In 2002, the euro currency became the basic monetary unit in the European countries belonging to the Economic and Monetary Union (EMU).
To enter, display, and print the euro symbol, Worldwide Language Support (WLS) must be installed on your system, and you must perform the following steps:
Configure the system with supporting locales, keyboard mappings, and fonts.
Use the correct key sequences, codeset converters, and print filters.
This section describes these steps and provides examples of setting locales and selecting keyboard types.
For more information on using the euro currency symbol, see the Tru64 UNIX Best Practices Web page.
Support for the input, display, and printing of any symbol on the operating system requires a codeset that includes that symbol and a font set that can display the symbol. A keyboard mapping that associates keystrokes with the symbol is also useful, although Compose key and cut-and-paste alternatives also exist. The requirements of codeset, font set, and entry method apply whether the symbol is an English language letter, a Chinese language character, or the euro monetary symbol.
The Unicode (UTF-8) and ISO/IEC 8859-15 (Latin-9) codesets include the euro symbol. With WLS installed, the operating system provides these codesets by means of country-specific locales. The operating system also provides the keyboard mappings specific to each country and the Xfont library, which enables display of the euro symbol.
To enter and display the euro currency symbol, perform the following steps:
1. Run under a locale that supports the euro currency symbol. Table 1-1 lists the locales that support the euro symbol. To start one of these locales on your system, perform the following steps:
a. From the Options Menu of the CDE login screen, choose Language.
b. From the Language Options Menu, choose a locale.
2. Choose a keyboard map that is appropriate for the selected euro-enabled locale and for the keyboard type you are using. To choose a keyboard map, perform the following steps:
a. Enter /usr/dt/bin/dxkeyboard at the command line to display the dxkeyboard Menu. Alternatively, you can select Keyboard Options from the Desktop_Apps Menu of the CDE Applications Manager to display the dxkeyboard Menu.
b. From the dxkeyboard Menu, choose the keyboard map that matches the locale and keyboard type you are using. The keyboard type can usually be found on the underside of the keyboard. See also the reference page for the language you are using (for example, Italian(5)).
3. Enter the euro currency symbol using the key combination described for your current locale. If your keyboard supports a Compose sequence, press the Compose key followed by C and an equal sign (=) to generate the euro symbol. (The appropriate Compose sequence keys for the euro symbol are described for various locales in euro(5).)
The Configure International Software utility available from the SysMan Menu provides system managers and administrators with an alternative way to manage locales and keymaps on the system. You must be superuser to use Configure International Software.
Table 1-1 is organized by country and lists the locales that support the euro currency symbol and the associated PC-style and VT-style key combination that will generate a euro symbol. The key combinations in this table are supported by xkb-format keymaps, which are the default in CDE.
Table 1-1: Locale and Key Combination Summary
Country | Locale | Euro Symbol Input: VT-Style Keyboard Combination | Euro Symbol Input: PC-Style Keyboard Combination |
Catalan (Spain) | | Left Compose/E | Right Alt/E |
Chinese - PRC (simplified) | | There is no Chinese keyboard combination. Use the Qu-Wei Input Method to enter the Unicode value for euro (U+20AC), as described in the Tru64 UNIX Technical Reference for Using Chinese Features online manual. | |
Danish (Denmark) | | Left Compose/E | Right Alt/E |
Dutch (Netherlands) | | Left Compose/E | Right Alt/E |
Dutch/Flemish (Belgium) | | Left Compose/E | Right Alt/E |
English (United Kingdom and Irish Republic) | | Left Compose/4 | Right Alt/4 |
English (United States) | | Left Compose/E | Right Alt/E |
Finnish (Finland) | | Left Compose/E | Right Alt/E |
French (France) | | Left Compose/E | Right Alt/E |
French (Belgium) | | Left Compose/E | Right Alt/E |
French (Canada) | | Left Compose/E | Right Alt/E |
French (Switzerland) | | Left Compose/E | Right Alt/E |
German (Germany) | | Left Compose/E | Right Alt/E |
German (Switzerland) | | Left Compose/E | Right Alt/E |
Icelandic (Iceland) | | Left Compose/E | Left Compose/E |
Italian (Italy) | | Left Compose/E | Right Alt/E |
Japanese (Japan) | | There is no Japanese Compose table or keymap support for the euro symbol. To enter the euro symbol, use the | |
Korean (Korea) | | There is no Korean Compose table or keymap support for the euro symbol. To enter the euro symbol, use the | |
Norwegian (Norway) | | Left Compose/E | Right Alt/E |
Portuguese (Portugal) | | None | Right Alt/E |
Spanish (Spain) | | Left Compose/E | Right Alt/E |
Swedish (Sweden) | | Left Compose/E | Right Alt/E |
The Alternate function key, described as the Alt key in this table, is also described as the Gr key on some keyboards. (In both cases, the key is on the right side of the keyboard.) For more information on keyboard mappings and keyboards, see keyboard(5).
When you install WLS and languages that support the euro symbol, you receive text and PostScript print filters that are sensitive to system locale settings and that provide fonts containing the euro currency symbol. For example, the generic PostScript print filter (wwpsof) supports UTF-8 and ISO 8859-15 formats.
No additional user action is required for euro symbol printing support.
The operating system provides two locales (en_EU.UTF-8@euro and en_US.UTF-8@euro) that specifically assign the euro symbol in the LC_MONETARY section of the locale. These locales supplement the UTF-8 and ISO8859-15 locales that define the currency symbol as the euro, as described in euro(5). Because LC_MONETARY overrides the environment variable LANG, you can set LANG to a locale that does not support the euro symbol and set LC_MONETARY to en_EU.UTF-8@euro to obtain euro symbol support.
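The combination of LANG and LC_MONETARY described here might look like the following in the Bourne or Korn shell. This is a minimal sketch: it assumes that both locales are installed, and en_US.ISO8859-1 is used only as an example of a locale whose codeset does not include the euro symbol.

```sh
# Keep the session in a locale without euro support,
# but take monetary formatting from a euro-enabled locale.
LANG=en_US.ISO8859-1
LC_MONETARY=en_EU.UTF-8@euro
export LANG LC_MONETARY
```

C shell users would use two setenv commands instead of the assignments and the export.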
For more information, see euro(5) and i18n_intro(5).
1.5 Defining the Search Path for Specialized Components
European languages are supported by data and executable files installed at system-default locations. Asian language support for some commands and programming libraries requires files that are subordinate to the /usr/i18n directory. These files supplement or replace files in system-default locations.
When you install one or more of the Asian language subsets, the installation procedure makes the following adjustments to variable settings on a system-wide basis:
I18NPATH
The I18NPATH variable defines the location of files that provide Asian language support and that are not in system default locations. This variable is set to the following:
/usr/i18n
Your system administrator can choose to install files for Asian language support at a location different from /usr/i18n; however, the /usr/i18n directory must contain a link to the other location.
PATH
The PATH variable points to the location of commands and is set to the following:
$I18NPATH/usr/bin:$PATH
The /etc/i18n_profile file includes the PATH and I18NPATH variable assignments on a system-wide basis for Bourne and Korn shell users. For C shell users, the installation process includes the /etc/i18n_login file and the /etc/csh.login file to correctly set search paths for Hebrew and Asian languages. Unless specifically noted in descriptions of particular commands or utilities, individual users do not need to change process-specific search paths to find localized binaries and utilities.
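The two variable assignments described in this section can be sketched in Bourne and Korn shell syntax as follows. This is an illustration of the documented values only; it is not the literal content of the /etc/i18n_profile file.

```sh
# Location of Asian language support files that are not in
# system-default locations.
I18NPATH=/usr/i18n

# Search the localized commands before the system-default ones.
PATH=$I18NPATH/usr/bin:$PATH

export I18NPATH PATH
```

Because the localized directory is placed first in PATH, a localized command is found before a default command of the same name.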
1.6 Supporting User-Defined Characters
The national character sets for Japan, Taiwan, and China do not include some of the characters that can appear in Asian place and personal names. Such characters are defined by users and reside in site-specific databases. These databases are called user-defined character (UDC) databases. When you define ideographic characters, you must also define font glyphs, collating files, and other support files for the characters.
The Writing Software for the International Market manual provides details on how you set up and use UDC databases and how to edit cp_dirs, the UDC database configuration file.
1.7 Using Printer Interface Features That Support Local Languages
When you install the operating system and include language variant subsets, your printing subsystem is enhanced with the following features:
Two generic internationalized print filters, pcfof and wwpsof, that work with HP and third-party printers
A set of print filters that support escape sequences used by local language printers
Entries in the /etc/printcap file to support printer code conversion and on-demand loading of font files
An lprsetup command that lets you add entries for local language printers to the /etc/printcap file
lp, lpr, lpc, lpq, lprm, and lpstat commands that support additional options for printing and printer control
PostScript outline or TrueType fonts that can be used by the wwpsof filter and other software
The following sections discuss these features.
1.7.1 Generic Internationalized Print Filters
The pcfof and wwpsof print filters enable use of HP printers, particularly those for which no other printer-specific solution is described in this chapter. You also need to use these filters if your printer is from another vendor. Both of these filters rely on a printer customization file (.pcf file) to supply certain device-specific information. Operating system software includes a basic set of .pcf files. System administrators can add more .pcf files to describe the capabilities of additional printers used at your site.
1.7.1.1 pcfof Print Filter
The pcfof filter handles both PostScript printers and text printers, such as the HP PCL printer. For PostScript files, the filter requires that the appropriate local language PostScript fonts be available on the printer. This restriction limits the filter's usefulness on many printers, particularly for printing PostScript files that require Japanese fonts. This filter can be set up to do codeset conversion when the printer locale differs from the one required for a text file print job. The filter also has .pcf files that are appropriate to use for a number of third-party text printers. See pcfof(8).
1.7.1.2 wwpsof Print Filter
The wwpsof filter is used only with PostScript printers. The filter converts the single-byte and multibyte characters used in an international environment to printable PostScript output. Thus, print jobs that include local language characters can be printed on printers where local language fonts are not resident. To use this filter, the printer must support PostScript Level 2 (or higher) or PostScript Level 1 with the composite font extension.
The PostScript fonts can be outline fonts installed on the system, TrueType fonts, or low-resolution bitmap fonts. TrueType fonts and low-resolution bitmap fonts are made available to the filter through an X font server, which requires that the X font server be running. In searching for fonts, the filter first attempts to use PostScript outline fonts. If those are not available, the filter uses high-resolution, rasterized, TrueType fonts. If those are not available, the filter uses low-resolution bitmap fonts.
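The font lookup order described above can be sketched as a small shell function. This is only an illustration of the order of preference: choose_font_source is a hypothetical name, the yes/no arguments stand in for whatever availability checks the filter really performs, and none of this is part of the wwpsof filter itself.

```sh
# choose_font_source is a hypothetical illustration of the lookup order.
# Arguments: three yes/no flags that say whether outline, TrueType,
# and low-resolution bitmap fonts are available.
choose_font_source() {
    if [ "$1" = yes ]; then
        echo outline
    elif [ "$2" = yes ]; then
        echo truetype
    elif [ "$3" = yes ]; then
        echo bitmap
    else
        echo none
        return 1
    fi
}

# No outline font, but TrueType and bitmap fonts are available.
choose_font_source no yes yes    # prints: truetype
```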
The wwpsof filter is sensitive to locale setting. When processing a character, the filter determines if the character is printable in the current locale and uses the codeset part of the locale definition to find an appropriate font (outline, TrueType, or low-resolution). Except for file formats that include a Byte Order Mark (that is, UTF-16 or UTF-32 format), you must set the locale appropriately before printing files that contain characters in languages other than English.
For example, you can set up a printer configuration file for use with the wwpsof print filter to convert bitmap fonts for other locales to PostScript when printing files that use UTF-8 encoding. Unicode includes characters for almost all languages, and any given font is limited to a small subset of supported characters. Therefore, you can customize a unicode conversion preference entry in the printer configuration file to specify a codeset look-up order favoring the fonts that are available for the language of the text most frequently printed.
The wwpsof filter prints multilanguage text files by first converting each character in the text file to a matching character in a UNIX codeset for which fonts are available and then converting the file to PostScript. The filter can also print PostScript files that have been generated by some CDE applications. See wwpsof(8).
1.7.2 Print Filters for Specific Local Language Printers
A print filter processes text data for a particular model of printer. The filter handles the device dependencies of the printer and performs device accounting functions. When each print job is complete, the print filter writes an accounting record to the file specified by the af field of the printer's entry in the /etc/printcap file.
The print filters for local language text printers can handle text files that contain ASCII and local language characters, or output files created by the nroff command. When processing nroff output, the filter removes multibyte characters that extend beyond the page boundary and translates nroff control sequences for underlining, superscripting, and subscripting to control sequences appropriate for the printer. However, the filter does not support multiple nroff control sequences on the same character.
The PostScript print filters can print PostScript files in addition to text and nroff output files.
A local language print filter can be the specified filter in both the of and if fields in the /etc/printcap file. For general information on /etc/printcap entries, see the System Administration manual and printcap(4). See also i18n_printing(5) and Japanese(5).
The following print filters process text data for Asian languages:
Language | Filter | Printer |
Japanese | la84of | LA84-J |
Japanese | la86of | LA86-J |
Japanese | la90of | LA90-J |
Japanese | la280of | LA280-J |
Japanese | la380of | LA380-J |
Japanese | ln03jaof | LN03-J |
Japanese | ln05jaof | LN05-J |
Japanese | ln82rof (processes both PostScript and text data) | LN82R |
Simplified Chinese | la88cof | LA88-C |
Simplified Chinese | la380cbof | LA380-CB |
Korean | la380kof | LA380-K |
Korean | dl510kaof | DL510-KA |
Traditional Chinese | cp382dof | CP382-D |
Thai | thailpof | EP1050+ |
1.7.3 Support for Local Language Printers in /etc/printcap
The /etc/printcap file describes characteristics of each printer on the system. Printer characteristics are specified by symbol/value pairs, where each symbol is a 2-character mnemonic. Each time you submit a print job, the lpd printer daemon and printer spooling system use information in the /etc/printcap file to determine how that job is handled.
Table 1-2 describes /etc/printcap symbols that are specific to local language printer support. See printcap(4) for more information about the /etc/printcap file. See Section 1.7.4 for an example of using the lprsetup command to add several of these options to the /etc/printcap file for a local language printer.
Table 1-2: Symbols in /etc/printcap File for Local Language Printers
Symbol | Type | Default | Description |
ya | str | None | Double-quoted list of keyword value assignments. This assignment list specifies most of the printer options related to country-specific support. The option keywords are explained following this table. |
yp | str | NULL | Printer ID that conforms to the WoToTo Standard (for Thai printers) |
ys | num | NULL | Size of the SoftODL character cache |
yt | str | fifo | The SoftODL character replacement method |
The ya symbol is defined for printing languages whose characters are not included in the Latin-1 character set. The value assigned to the ya symbol is a quoted string that can include one or more of the following keywords:
flocale=locale_name
Specifies the locale for interpretation of file text. The print filter uses this locale to validate characters in the text. For an Asian language that is supported by more than one codeset, a difference between the flocale and plocale values determines whether codeset conversion is done before the file is printed. If flocale is not specified, the filter interprets the file in the current locale.
font=font_name
Specifies the name of the outline font for printing PostScript files. This font must be appropriate for the specified plocale value.
line=number_of_lines
Specifies the number of lines for each page. When used in combination with the -w flag of the lpr command, the line number can control the font size and orientation of printed output.
odldb=odl_database_path
Specifies the pathname of the SoftODL database. By default, the printer uses the system-wide database as specified in the cp_dirs file.
odlstyle=style-NxN
Specifies the SoftODL font style and size to use, for example, normal-24x24. If odlstyle is not specified, the default style and size set for the system-wide database is used.
onehalf
For the Thai language, specifies that characters be printed on one-and-one-half lines, rather than three lines, to produce more compressed and natural-looking output. The onehalf keyword is valid only for the thailpof print filter.
plocale=locale_name
Specifies the printer locale. Print filters for some printers, such as the LA380-CB printer, are country specific because the printer has built-in fonts that are encoded in a particular codeset. For these printers, you must specify plocale, and the codeset part of locale_name must match the codeset of the built-in fonts. Other printers are generic and suitable for printing files in a variety of languages. For these printers, you can use a generic print filter (such as wwpsof) and do not need to specify a plocale value unless you want to use a font that is not the default for the language being printed.
spcom
Enables space-compensation mode for languages, such as Thai, that contain nonspacing characters. These characters can combine with other characters for display and therefore do not occupy space. Many of the existing tools that align text do not handle nonspacing characters correctly. If you want to print the Thai output that these tools generate, you should specify the spcom keyword to ensure proper text alignment in the printed file. This keyword is valid only when used with a Thai print filter or with the th_TH.TACTIS plocale value.
tacdata=tac_data_path
Specifies the location of the character code tables used with the thailpof print filter. By default, tac_data_path is /usr/lbin/tac_data.
tm
Enables text morphing for printing Thai characters. Text morphing replaces some characters with others to produce better printed output. See Thai(5).
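The relationship between the flocale and plocale keywords can be illustrated with a short shell sketch. The function names codeset_of and needs_conversion are hypothetical, and the sketch assumes that the comparison is made on the codeset part of each locale name (the text after the period); the print filters may apply different rules. The zh_TW locale names in the usage note are illustrative values only.

```sh
# codeset_of prints the codeset part of a locale name: the text after
# the first period, with any @modifier removed. A name that contains
# no period is printed unchanged.
codeset_of() {
    printf '%s\n' "$1" | sed -e 's/^[^.]*\.//' -e 's/@.*$//'
}

# needs_conversion is a hypothetical check. It succeeds (exit status 0)
# when the file locale ($1) and printer locale ($2) name different
# codesets, which is the case in which codeset conversion would apply.
needs_conversion() {
    [ "$(codeset_of "$1")" != "$(codeset_of "$2")" ]
}
```

For example, needs_conversion zh_TW.big5 zh_TW.eucTW succeeds because the two codeset parts differ, while needs_conversion th_TH.TACTIS th_TH.TACTIS fails because they match.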
1.7.4 Enhancements to Printer Configuration Software
The Printing selection on the SysMan
Menu is the desktop application that helps you add, delete, or change the
characteristics of the printers on your system.
The
lprsetup
utility is an alternative way to do these operations if your system is not
running CDE.
In both cases, the software performs necessary tasks, such as
creating the printer spooling directory, linking the appropriate filter to
the printer, and writing the entry for the printer in the
/etc/printcap
file.
You must be superuser to run
lprsetup
.
See lprsetup.dat(4).
Example 1-1
demonstrates how you use the
lprsetup
command to set up a local language printer, in this case
ln03s-ja
.
Example 1-1: Setting Up a Local Language Printer with lprsetup
# /usr/sbin/lprsetup  [1]
Tru64 UNIX Printer Setup Program
Command  < add modify delete exit view quit help >: add
Adding printer entry, type '?' for help.
Enter printer name to add [lp11] :  [2]
Printer Types:
 1. Compaq Advanced Server ClientPS
 2. Compaq Advanced Server ClientText
 3. Compaq LN16
 4. Compaq LN32
 5. Digital Colormate PS
 6. Digital DEClaser 1100
 7. Digital DEClaser 1150
 8. Digital DEClaser 2100
 9. Digital DEClaser 2150
10. Digital DEClaser 2200
11. Digital DEClaser 2250
12. Digital DEClaser 2300
13. Digital DEClaser 2400
14. Digital DEClaser 3200
15. Digital DEClaser 3250
16. Digital DEClaser 3500
17. Digital DEClaser 5100
18. Digital LA100
19. Digital LA120
20. Digital LA210
21. Digital LA280
22. Digital LA30N
23. Digital LA30N A4
24. Digital LA30W
25. Digital LA30W A4
26. Digital LA324
27. Digital LA380
28. Digital LA380CB
29. Digital LA380K
30. Digital LA400
31. Digital LA424
32. Digital LA50
33. Digital LA600
34. Digital LA70
35. Digital LA75
36. Digital LA84
37. Digital LA86
38. Digital LA88
39. Digital LA88C
40. Digital LA90
41. Digital LG02
42. Digital LG04 Plus
43. Digital LG05 Plus
Press 'ENTER' to continue scrolling, type '(q)uit' to end scrolling: q
Help Types:
  ?        - General help
  printer? - Specific printer type information
Enter index number, help type, '(q)uit', or 'ENTER' [Generic Unknown type]:ln03ja  [3]
.
.
.
Enter printer synonym: draft  [4]
Enter printer synonym:
Set device pathname 'lp' [] ? /foo
Do you want to capture print job accounting data ([y]|n)? n
Set spooler directory 'sd' [/var/spool/printers/lpd11] ?
Set printer error log file 'lf' [/var/adm/printers/lp11.lperr] ?
Enter the name of the printcap symbol you wish to modify.
Other valid entries are:
  'q' to quit (no more changes)
  'p' to print the symbols you have specified so far
  'l' to list all of the possible symbols and defaults
The names of the printcap symbols are:
  af br cf ct df dn du fc ff fo fs gf if lf lo lp mc mj mx nc nf of on pl pp ps pw px py rf rm rp rs rw sb sc sd sf sh st tf tr vf xc xf xn xs ya yd yj yp ys yt
Enter symbol name: yt  [5]
Enter a new value for symbol 'yt'? [none] [Return]
Enter symbol name: ?
Enter the name of the printcap symbol you wish to modify.
Other valid entries are:
  'q' to quit (no more changes)
  'p' to print the symbols you have specified so far
  'l' to list all of the possible symbols and defaults
The names of the printcap symbols are:
  af br cf ct df dn du fc ff fo fs gf if lf lo lp mc mj mx nc nf of on pl pp ps pw px py rf rm rp rs rw sb sc sd sf sh st tf tr vf xc xf xn xs ya yd yj yp ys yt
Enter symbol name: q  [6]
Printer #11
-----------
Symbol  type  value
------  ----  -----
if      STR   /usr/lbin/ppdof +OPageSize=Letter +Ctektronix740.rpd
lf      STR   /var/adm/printers/lp11.lperr
lp      STR   /foo
mx      INT   0
of      STR   /usr/lbin/ppdof +OPageSize=Letter +Ctektronix740.rpd
pl      INT   66
pw      INT   0
rw      BOOL  on
sd      STR   /var/spool/printers/lpd11
xf      STR   /usr/lbin/xf
Are these the final values for printer 11 ? [y] [Return]
Adding comments to printcap file for new printer, type '?' for help.
Do you want to add comments to the printcap file [n] ? : [Return]
Setup activity is complete for this printer.
Verify that the printer works properly by using
the lpr(1) command to send files to the printer.
Command  < add modify delete exit view quit help >: e  [7]
[1] Invokes the lprsetup program.
[2] Displays the available printer types. See the reference pages for specific languages for information on the local language printers that the lprsetup command supports.
[3] Enters the printer type. To obtain a description of an individual printer, enter printer? at this prompt.
[4] The utility displays a series of prompts that let you specify a synonym for the printer name, the device path, whether accounting data is maintained, the spooler directory, and the error log. Enter a question mark to obtain help on any of the prompts.
[5] Prompts you to enter a printcap symbol. See Table 1-2 for the symbols and parameters of importance to internationalized systems. For example, ys sets the cache size that the SoftODL service uses. By default, this value is the appropriate cache size for the printer and is stored as the value of the ys symbol in the /etc/printcap file.
[6] Quits the lprsetup dialogue. The utility displays the values assigned and prompts for verification. After you verify the values, you are prompted to add comments to the /etc/printcap file.
[7] Quits the program to indicate that no more changes are needed to the /etc/printcap file.
The
lp
,
lpc
,
lpd
,
lpq
,
lpr
,
lprm
, and
lpstat
commands
handle the features added to the print subsystem for Asian and other languages
not in the Latin-1 group.
For example, the
lpr
command
includes the
-A
option and additional values for the
-O
option to give you access to such features.
See lpr(1).
1.7.6 Choosing PostScript Fonts for Different Locales
The fonts for the Chinese and Korean languages do not fit in the memory of most PostScript printers. Fonts for the Thai language and some European languages do fit in memory, but are large enough that they do not fit in printer memory along with fonts for other languages.
For PostScript printers in which language-specific
fonts are not printer resident, the
wwpsof
print filter
(see
Section 1.7.1.2) provides a solution.
In this case, you
specify in a printer's configuration file the names of the fonts you want
to use for different languages.
The
wwpsof
print filter
can also create PostScript output from bitmap fonts when PostScript fonts
are not available for a particular codeset.
See wwpsof(8).
The following list associates languages and codesets with the appropriate set of PostScript fonts:
Western European, Latin-1 (*.ISO8859-1)
PostScript fonts for Latin-1 languages are printer resident; they are not installed from software subsets.
Hungarian, Czech, Slovak, Slovene (*.ISO8859-2)
Arial-Bold-ISOLatin2 Arial-BoldItalic-ISOLatin2 Arial-Italic-ISOLatin2 Arial-ISOLatin2 ArialNarrow-Bold-ISOLatin2 ArialNarrow-BoldItalic-ISOLatin2 ArialNarrow-Italic-ISOLatin2 ArialNarrow-ISOLatin2 BookAntiqua-Bold-ISOLatin2 BookAntiqua-BoldItalic-ISOLatin2 BookAntiqua-Italic-ISOLatin2 BookAntiqua-ISOLatin2 BookmanOldStyle-Bold-ISOLatin2 BookmanOldStyle-BoldItalic-ISOLatin2 BookmanOldStyle-Italic-ISOLatin2 BookmanOldStyle-ISOLatin2 CenturyGothic-Bold-ISOLatin2 CenturyGothic-BoldItalic-ISOLatin2 CenturyGothic-Italic-ISOLatin2 CenturyGothic-ISOLatin2 CenturySchoolbook-Bold-ISOLatin2 CenturySchoolbook-BoldItalic-ISOLatin2 CenturySchoolbook-Italic-ISOLatin2 CenturySchoolbook-ISOLatin2 Courier-Bold-ISOLatin2 Courier-BoldItalic-ISOLatin2 Courier-Italic-ISOLatin2 Courier-ISOLatin2 MonotypeCorsiva-ISOLatin2 TimesNewRoman-Bold-ISOLatin2 TimesNewRoman-BoldItalic-ISOLatin2 TimesNewRoman-Italic-ISOLatin2 TimesNewRoman-ISOLatin2
Arial-Bold-ISOLatinCyrillic Arial-BoldInclined-ISOLatinCyrillic Arial-Inclined-ISOLatinCyrillic Arial-ISOLatinCyrillic Courier-Bold-ISOLatinCyrillic Courier-BoldInclined-ISOLatinCyrillic Courier-Inclined-ISOLatinCyrillic Courier-ISOLatinCyrillic Nimrod-Bold-ISOLatinCyrillic Nimrod-BoldInclined-ISOLatinCyrillic Nimrod-Inclined-ISOLatinCyrillic Nimrod-ISOLatinCyrillic Plantin-Bold-ISOLatinCyrillic Plantin-BoldInclined-ISOLatinCyrillic Plantin-Inclined-ISOLatinCyrillic Plantin-ISOLatinCyrillic TimesNewRoman-Bold-ISOLatinCyrillic TimesNewRoman-BoldInclined-ISOLatinCyrillic TimesNewRoman-Inclined-ISOLatinCyrillic TimesNewRoman-ISOLatinCyrillic
Arial-Bold-ISOLatinGreek Arial-BoldInclined-ISOLatinGreek Arial-Inclined-ISOLatinGreek Arial-ISOLatinGreek Courier-Bold-ISOLatinGreek Courier-BoldInclined-ISOLatinGreek Courier-Inclined-ISOLatinGreek Courier-ISOLatinGreek TimesNewRoman-Bold-ISOLatinGreek TimesNewRoman-BoldInclined-ISOLatinGreek TimesNewRoman-Inclined-ISOLatinGreek TimesNewRoman-ISOLatinGreek
David-Bold-ISOLatinHebrew David-BoldOblique-ISOLatinHebrew David-ISOLatinHebrew David-Oblique-ISOLatinHebrew FrankRuhl-Bold-ISOLatinHebrew FrankRuhl-BoldOblique-ISOLatinHebrew FrankRuhl-ISOLatinHebrew FrankRuhl-Oblique-ISOLatinHebrew Miriam-Bold-ISOLatinHebrew Miriam-BoldOblique-ISOLatinHebrew Miriam-ISOLatinHebrew Miriam-Oblique-ISOLatinHebrew MiriamFixed-Bold-ISOLatinHebrew MiriamFixed-BoldOblique-ISOLatinHebrew MiriamFixed-ISOLatinHebrew MiriamFixed-Oblique-ISOLatinHebrew NarkissTam-Bold-ISOLatinHebrew NarkissTam-BoldOblique-ISOLatinHebrew NarkissTam-ISOLatinHebrew NarkissTam-Oblique-ISOLatinHebrew
Arial-Bold-ISOLatin5 Arial-BoldItalic-ISOLatin5 Arial-Italic-ISOLatin5 Arial-ISOLatin5 ArialNarrow-Bold-ISOLatin5 ArialNarrow-BoldItalic-ISOLatin5 ArialNarrow-Italic-ISOLatin5 ArialNarrow-ISOLatin5 BookAntiqua-Bold-ISOLatin5 BookAntiqua-BoldItalic-ISOLatin5 BookAntiqua-Italic-ISOLatin5 BookAntiqua-ISOLatin5 BookmanOldStyle-Bold-ISOLatin5 BookmanOldStyle-BoldItalic-ISOLatin5 BookmanOldStyle-Italic-ISOLatin5 BookmanOldStyle-ISOLatin5 CenturyGothic-Bold-ISOLatin5 CenturyGothic-BoldItalic-ISOLatin5 CenturyGothic-Italic-ISOLatin5 CenturyGothic-ISOLatin5 CenturySchoolbook-Bold-ISOLatin5 CenturySchoolbook-BoldItalic-ISOLatin5 CenturySchoolbook-Italic-ISOLatin5 CenturySchoolbook-ISOLatin5 Courier-Bold-ISOLatin5 Courier-BoldItalic-ISOLatin5 Courier-Italic-ISOLatin5 Courier-ISOLatin5 MonotypeCorsiva-ISOLatin5 TimesNewRoman-Bold-ISOLatin5 TimesNewRoman-BoldItalic-ISOLatin5 TimesNewRoman-Italic-ISOLatin5 TimesNewRoman-ISOLatin5
No PostScript fonts are supplied for locales using Latin-9 encoding.
However, a default printer configuration file (PCF) is available for Latin-9.
When specified for the
wwpsof
print filter, this file enables
automatic conversion of Latin-9 bitmap fonts to PostScript.
See wwpsof(8).
The X locale database file used by applications running in the
universal.UTF-8
,
en_US.UTF-8
, or Asian locales
(Chinese, Japanese, Korean) contains font definitions that include all the
fonts used with the operating system.
This enables applications under
en_US.UTF-8
to display all the font characters installed with Worldwide
Language Support (WLS).
Applications under the Asian locales display all the
font characters installed with WLS, except for ISO8859-2, -4, -5, -7, -8, -9,
and TACTIS.
See wwpsof(8) and Unicode(5).
Traditional Chinese (*.dechanyu)
The operating system provides the following traditional Chinese outline fonts for printing on PostScript printers and for display through the Level II Display PostScript extension. For information on using these fonts with PostScript printers, Display PostScript, or display through a rasterizer, see the Technical Reference for Using Chinese Features online manual.
Sung-Light-CNS11643 Hei-Light-CNS11643
Simplified Chinese (*.dechanzi)
The operating system provides the following simplified Chinese outline fonts for printing on PostScript printers and for display through the Level II Display PostScript extension. For information on using these fonts with PostScript printers, Display PostScript, or display through a rasterizer, see the Technical Reference for Using Chinese Features online manual.
XiSong-GB2312-80 Hei-GB2312-80
Munjo
PostScript fonts for the Japanese language are normally printer resident;
they are not installed from software subsets.
However, you can set up a printer
configuration file (PCF) with the
wwpsof
print filter to
convert Japanese bitmap fonts to PostScript for printing files that use Japanese
encoding.
See wwpsof(8).
AngsanaUPC-Bold AngsanaUPC-BoldItalic AngsanaUPC-Italic AngsanaUPC-Light CordiaUPC-Bold CordiaUPC-BoldItalic CordiaUPC-Italic CordiaUPC-Light EucrosiaUPC-Bold EucrosiaUPC-BoldItalic EucrosiaUPC-Italic EucrosiaUPC-Light FreesiaUPC-Bold FreesiaUPC-BoldItalic FreesiaUPC-Italic FreesiaUPC-Light IrisUPC-Bold IrisUPC-BoldItalic IrisUPC-Italic IrisUPC-Light JasmineUPC-Bold JasmineUPC-BoldItalic JasmineUPC-Italic JasmineUPC-Light KodchiangUPC-Bold KodchiangUPC-BoldItalic KodchiangUPC-Italic KodchiangUPC-Light LilyUPC-Bold LilyUPC-BoldItalic LilyUPC-Italic LilyUPC-Light WaterlilyUPC-Bold WaterlilyUPC-BoldItalic WaterlilyUPC-Italic WaterlilyUPC-Light YuccaUPC-Bold YuccaUPC-BoldItalic YuccaUPC-Italic YuccaUPC-Light
1.8 Using Mail in a Multilanguage Environment
The operating system provides enhanced versions of the following commands and utilities to handle languages based on multibyte-character codesets:
sendmail
mailx
MH (mail handler)
The following sections discuss enhancements to these components and
codeset conversion done by the
comsat
server.
See sendmail(8), mailx(1), mh(1), and comsat(8).
1.8.1 The sendmail Utility
The
sendmail
utility,
which is a back end to several user commands, is configured by default to
support 8-bit data.
The configuration that supports 8-bit data is required
for multibyte character support.
See sendmail(8).
1.8.2 The mailx Command and MH Commands
The
mailx
command and all applicable commands
in the MH system support the conversion of mail messages between the mail
interchange codeset (used to transfer messages to some hosts) and a user's
application codeset.
For example, if the mail interchange codeset is ISO-2022-JP
and the application codeset is eucJP, the
mailx
or
MH
command converts incoming messages to the Japanese EUC codeset
before displaying them.
To prevent data loss, when incoming messages are stored in mail folders, the messages are encoded in the codeset in which they are received. Codeset conversion takes place when you extract or display the messages.
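The store-then-convert behavior described above can be sketched in Python. This is an illustration under stated assumptions, not the mailx or MH implementation: it uses Python's iso2022_jp and euc_jp codecs as stand-ins for the system converters, and the function name to_application_codeset is hypothetical.

```python
def to_application_codeset(stored, interchange="iso2022_jp", application="euc_jp"):
    """Convert message bytes from the interchange codeset to the application codeset."""
    return stored.decode(interchange).encode(application)

text = "\u65e5\u672c\u8a9e"            # three kanji characters

# The message as received and stored in the mail folder: 7-bit ISO-2022-JP.
stored = text.encode("iso2022_jp")

# Conversion happens only when the message is displayed or extracted.
shown = to_application_codeset(stored)

print(all(b < 0x80 for b in stored))   # True: the stored copy is 7-bit data
print(shown.decode("euc_jp") == text)  # True: the displayed copy is Japanese EUC
```

Because the stored bytes are never rewritten, the same folder can later be read by a user whose application codeset is different.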
To communicate mail interchange code information to other systems, outgoing messages include two additional header lines like the following:
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=ISO-2022-JP
The
charset
field in the preceding example
specifies the mail interchange codeset, in this case, ISO-2022-JP.
This codeset
is an ISO 7-bit state-dependent codeset for Japanese characters.
Codesets
other than those that are part of the ISO standard are identified by the prefix
X-
in the codeset name.
For example, when DEC Hanyu is the codeset
used for mail interchange, the following header lines are included in outgoing
mail messages:
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=X-dechanyu
The
mailx
command and
MH
commands
determine the application codeset and set the mail interchange codeset for
incoming and outgoing messages based on certain values.
The following lists
describe, in priority order of highest to lowest, the values these commands
use.
The application codeset is determined from one of the following:
The value of the
lang
component in the
$HOME/.mailrc
file (for
the
mailx
command) or the
$HOME/.mh_profile
file (for
MH
commands)
The mail interchange codeset applied to incoming messages is determined from one of the following:
The
charset
field in the mail
header, if additional header lines are present in the message
The codeset specified as the system-wide
mail interchange default in the
/usr/lib/mail-codesets
file
If you create this file, make sure it contains the name of a locale as the only entry.
If neither of the preceding values is available, codeset conversion does not occur.
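The priority order for incoming messages can be sketched as follows. This Python fragment is an illustration only: the function name is hypothetical, the contents of the system-wide file are passed in as a string rather than read from /usr/lib/mail-codesets, and it assumes that the default codeset is the codeset part of the locale name stored in that file.

```python
from email import message_from_string

def incoming_interchange_codeset(message_text, system_default_locale=None):
    """Return the interchange codeset for an incoming message, or None.

    Priority: (1) the charset field in the mail header, then
    (2) the system-wide default. None means no conversion occurs.
    """
    msg = message_from_string(message_text)
    charset = msg.get_param("charset")   # None if the header or field is absent
    if charset:
        return charset
    if system_default_locale:
        # Assumption: the codeset is the part of the locale name after the dot.
        _, dot, codeset = system_default_locale.strip().partition(".")
        if dot and codeset:
            return codeset.split("@")[0]
    return None

with_header = (
    "Mime-Version: 1.0\n"
    "Content-Type: TEXT/PLAIN; charset=ISO-2022-JP\n"
    "\n"
    "message body\n"
)
without_header = "Subject: test\n\nmessage body\n"

print(incoming_interchange_codeset(with_header, "zh_TW.dechanyu"))     # ISO-2022-JP
print(incoming_interchange_codeset(without_header, "zh_TW.dechanyu"))  # dechanyu
print(incoming_interchange_codeset(without_header))                    # None
```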
The mail interchange codeset applied to outgoing messages is determined from one of the following:
The setting of the
excode
component as defined in the
$HOME/.mailrc
file (for
mailx
users) or the
$HOME/.mh_profile
file (for users of
MH
commands)
If a codeset is not determined for outgoing mail interchange, the mail is sent with no codeset identifier.
The
comsat
server, which notifies you of incoming mail messages, always attempts
to convert incoming mail messages from the mail interchange codeset to your
application codeset.
The following lists describe, in priority order of highest
to lowest, the values that the
comsat
server uses to determine
the codesets of the mail interchange and your application.
The mail interchange codeset value is determined from one of the following:
The codeset specified as the system-wide
mail interchange default in the
/usr/lib/mail-codesets
file
If neither of the preceding values is available, codeset conversion does not occur.
The application codeset value is determined from one of the following:
The application codeset
defined for the
atty
driver of the user's system
The codeset name
in the $HOME/.codeset_device_name
file, where
device_name
is the name of the terminal
device for the current session
1.9 Displaying Reference Pages in Languages Other Than English
As with the
operating system, internationalized applications frequently supply online
reference pages (manpages) to document the application and its components.
The operating system includes enhanced versions of the
nroff
,
tbl
, and
man
commands to support this requirement.
The
nroff
and
tbl
commands are
tools used primarily by programmers to create reference pages.
These commands
are described in the
Writing Software for the International Market
manual.
The
man
command formats and displays the reference page and can handle
multibyte characters in reference page files.
By default, the
man
command automatically searches for reference pages in the /usr/share/locale_name/man
directory before searching the
/usr/share/man
and
/usr/local/man
directories.
Therefore, if the
LANG
environment variable is set to an installed locale and if reference
page translations are available for that locale, the
man
command automatically displays reference pages in the appropriate language.
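The default search order can be sketched in a few lines of Python. This is an illustration, not the man command's implementation: the function name man_search_dirs is hypothetical, and skipping the locale-specific directory when no locale is given is an assumption.

```python
def man_search_dirs(lang=None):
    """Return reference page directories in the order they are searched."""
    dirs = []
    if lang:
        # The locale-specific directory is searched first.
        dirs.append("/usr/share/%s/man" % lang)
    dirs.extend(["/usr/share/man", "/usr/local/man"])
    return dirs

for directory in man_search_dirs("ja_JP.eucJP"):
    print(directory)
# /usr/share/ja_JP.eucJP/man
# /usr/share/man
# /usr/local/man
```

Because the English directories remain in the list, a page that has no translation for the locale is still found in its English version.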
In addition, the
man
command automatically applies codeset conversion (assuming the availability
of appropriate converters) when reference page translations for a particular
language are encoded in a codeset that does not match the codeset of the user's
locale.
See man(1) for information about the man command search path and for more details about codeset conversion.
1.10 Converting Data Files from One Codeset to Another
Each locale is based on a specific codeset. Therefore, when an application uses a file whose data is coded in one codeset and runs in a locale based on another codeset, character interpretation may be meaningless. You may need to set the process environment to a particular locale and use a data file created with a codeset different from the one on which the locale is based. The data file in question might be appropriate for a given language and in a codeset different from your locale for one of the following reasons:
The data file might have been created on another vendor's system by using a locale based on a vendor-specific codeset. For example, the integration of PCs into the enterprise computing environment increases the likelihood that UNIX users need to process files for which the data encoding is in MS-DOS code page format.
The locale could be one of several UNIX locales that support the same Asian language, such as Japanese. Asian languages are typically supported by a variety of locales, each based on a different codeset.
The data file could be in Unicode transformation format (UTF-8, UTF-16, or UTF-32). If characters in this file are to be printed or displayed on the screen, they might need to be converted to encodings for which fonts are available.
You can convert a data file from one codeset to another by using the
iconv
command.
Consider the following example:
% iconv -f SJIS -t eucJP accounts_local \
>> accounts_central
This
iconv
command performs the following tasks:
Reads data in the
accounts_local
file,
which is encoded in the
SJIS
codeset
Converts the data to the
eucJP
codeset
Appends the results to the
accounts_central
file
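The same three steps can be sketched in Python for readers who want to script the conversion. This is an illustration, not a replacement for iconv: the function name convert_and_append is hypothetical, and Python's shift_jis and euc_jp codecs are assumed to be close enough to the SJIS and eucJP converters for ordinary text. The sketch works in a temporary directory so that it does not touch real data.

```python
import os
import tempfile

def convert_and_append(src_path, dst_path, from_code="shift_jis", to_code="euc_jp"):
    """Read src_path in from_code, convert it, and append it to dst_path in to_code."""
    with open(src_path, "rb") as src:
        text = src.read().decode(from_code)      # step 1: read the SJIS data
    converted = text.encode(to_code)             # step 2: convert to eucJP
    with open(dst_path, "ab") as dst:            # step 3: append, do not overwrite
        dst.write(converted)

workdir = tempfile.mkdtemp()
local = os.path.join(workdir, "accounts_local")
central = os.path.join(workdir, "accounts_central")

with open(local, "wb") as f:
    f.write("\u65e5\u672c\u8a9e\n".encode("shift_jis"))
with open(central, "wb") as f:
    f.write("\u6771\u4eac\n".encode("euc_jp"))

convert_and_append(local, central)

with open(central, "rb") as f:
    result = f.read().decode("euc_jp")
print(result == "\u6771\u4eac\n\u65e5\u672c\u8a9e\n")   # True
```

Opening the target in append mode corresponds to the >> redirection in the command: existing eucJP data in accounts_central is preserved.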
An application programmer, on the other hand, might use the
iconv_open()
,
iconv()
, and
iconv_close()
functions for the same purpose.
Many commands and utilities, such
as the
man
command and internationalized print filters,
use the
iconv()
functions and associated converters to
perform codeset conversion on the user's behalf.
See the Writing Software for the International Market manual and iconv(3) for information about using iconv and about algorithmic and table converters in internationalized programs.
1.11 Miscellaneous Base System Commands
The following list includes information about features and restrictions that apply when using traditional UNIX commands in local language environments:
The
file
command recognizes files encoded in Unicode
or ISO 10646 formats (16-bit UCS-2 or 32-bit UTF-32).
For other kinds of text
files, the command recognizes when the character encoding is valid for the
codeset of the current locale.
The
file
command also has
a
jfile
alias.
When you use this alias, the command recognizes
the most commonly used encodings for Japanese (DEC Kanji, Japanese EUC, Shift
JIS, and 7-bit JIS) regardless of the current locale setting.
For more information, see file(1).
When you use the
rlogin
command to log on to a Tru64 UNIX
system from an ULTRIX system, be sure to specify the
-8
flag to pass 8-bit data without stripping.
Otherwise, you will have problems
entering non-ASCII characters from your terminal.
If you view
a large data file while logged on to the remote system, use a pager command,
such as
pg
, and not the Hold Screen key.
The
-8
option sets the terminal mode of the original host to RAW, disabling
flow control.
So, if data is sent to the terminal at a rate faster than the
terminal can handle it, some data is lost when you use the Hold Screen key.
This
rlogin
restriction applies when logging in from
an ULTRIX system or when logging in from any UNIX system whose software does
not fully support 8-bit data format.
The operating system includes the multilingual Emacs software from the
Free Software Foundation.
Before using this editor, you must add the
/usr/i18n/mule/bin
directory to your process-specific search path.
You can then invoke this editor by using the
mule
command.
The
vi
and
more
commands discard
text that follows an invalid multibyte character.
If you encounter this problem,
it is likely that your locale setting is not correct for the text being viewed
or edited.
In this case, reset your locale to one that matches the text and
invoke the command again.
When used with Thai characters, the
vi
editor may
wrap lines before the right boundary of the screen.
This happens because Thai
text includes nonspacing characters, which contribute to the character count
but not to the display width.
The editor wraps lines based on character count.
For example,
vi
may wrap a line after entry of 80 characters,
even though these characters do not occupy 80 columns on the screen.
Using local language user names and file names
It is a limitation of UNIX file systems that you cannot use a multibyte character
whose second or subsequent byte is an ASCII slash (/) in
names of files, users, or other objects.
This limitation means that some user-defined
characters in the DEC Hanzi and DEC Kanji codesets and certain characters
(CNS Plane 2 characters) in the DEC Hanyu codeset cannot be used in these
names.
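The slash limitation in the last item can be made concrete with a short Python sketch. It is an illustration under a simplified model, not a description of any actual codeset: it assumes that a byte with the high bit set starts a two-byte character and that every other byte is a single ASCII character. The function name slash_trail_offsets and the byte values are hypothetical.

```python
def slash_trail_offsets(name):
    """Return offsets of 0x2F bytes that are the second byte of a two-byte character.

    Simplified model (an assumption): a byte >= 0x80 starts a two-byte
    character; any other byte is a single-byte ASCII character.
    """
    offsets = []
    i = 0
    while i < len(name):
        if name[i] >= 0x80 and i + 1 < len(name):
            if name[i + 1] == 0x2F:
                offsets.append(i + 1)
            i += 2          # skip both bytes of the two-byte character
        else:
            i += 1
    return offsets

# A hypothetical file name containing one two-byte character, 0xC5 0x2F.
name = b"rep\xc5\x2fort"
print(slash_trail_offsets(name))   # [4]

# Path handling works on bytes, so the name is split inside the character.
print(name.split(b"/"))            # [b'rep\xc5', b'ort']
```

The file system sees two path components, neither of which is the name the user intended, which is why such characters cannot be used.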