1 Working in a Multilanguage Environment

By default, the Tru64 UNIX operating system is installed with support for United States English. However, the system administrator can choose to install one or more Worldwide Language Support (WLS) subsets, which provide the tools and features that allow you to work in languages other than English. Language subsets can also be added after the base operating system installation. WLS installation and the language subsets available for installation are described in the operating system Installation Guide.

This chapter explains how to perform the various setup tasks and use the software features of language environments other than English. The chapter assumes that you are familiar with the operating system in its default English-language environment.

1.1 Overview of Using Internationalized Software

To enable input and display in any language other than English, you must always set the locale in which your process runs. Locales for particular languages are installed as WLS subsets, and you can use the locale -a command to display the available locales. See Section 1.3 for information on locales and how to set or change locale settings.

If you have superuser privileges, you can use the Configure International Software utility from the SysMan Menu to set the default login locale and perform many other internationalization tasks. See Section 1.2 or the Configure International Software online help for information on using this utility.

Depending on the language, you may need to perform tasks in addition to setting a locale. This chapter describes how to perform the following tasks:

Select keyboard type (Section 1.4)

Define search paths for specialized data and executable files that are language specific (Section 1.5)

Apply printer-control characters, filters, and fonts that are appropriate for local language printers (Section 1.7)

Mail text in languages other than English (Section 1.8)

Display reference pages (Section 1.9)

Convert data files from one codeset to another (Section 1.10)

Display, edit, and print text in languages other than English (Section 1.11)

This chapter discusses these topics as they apply to particular languages or groups of languages. For complete information about using the internationalization features of applications that run in the Common Desktop Environment (CDE), see Chapter 3 and the Tru64 UNIX CDE Companion manual.

1.2 Configuring International Software

This chapter describes how to set locales, keyboard mappings, and other aspects of internationalization support using system utilities and commands. However, if you are a system manager or administrator with superuser privileges, you can use the Configure International Software utility to configure Worldwide Language Support on your system.

The Configure International Software utility is a menu-oriented function available from the SysMan Menu under the Software option. As superuser, a system manager or administrator can use Configure International Software to perform the following tasks:

Configure access to Worldwide Language Support tools and libraries for individual accounts or system-wide.

Configure Asian terminal driver support and merge that configuration into the system configuration file (/usr/sys/conf/). Using this task, the system administrator can do the following:
- Activate Asian codeset options, including traditional Chinese Big-5, Telecode for Taiwan, simplified Chinese to traditional Chinese conversion, UTF-8, and Thai language support.
- Add UNIX Terminal Extension (UTX) support options to the Asian terminal driver. The UTX support options include on-demand font loading (ODL), Katakana conversion, and Software Phrase Input Method (SIM).
- Define the Pseudo Terminal Driver protocol.
- Establish dynamic or static linking of the Asian terminal driver to the kernel.
- Specify the number of UTX pseudodevices that will be created on the system.
- Rebuild the kernel.

Configure Wnn, the character-cell input method for Japanese.

Remove installed country support subsets (locales). If you do not have superuser privilege, you can view, but not delete, installed subsets.

Remove installed fonts. If you do not have superuser privilege, you can view, but not delete, installed fonts.

Establish a default login language, switch between dense code and Unicode locales, and choose an input method for a locale that supports multiple input methods.

View installed keyboard map files. You do not have to be superuser to view installed keymaps.

1.3 Setting Locale and Language

Locales are the method whereby the operating system implements localization. A locale establishes information within a computer system that is specific to each supported language, cultural data, and coded character set (codeset) combination. A locale provides information on the following:

Repertoire of available characters

Language-specific sorting rules

Language-specific rules and symbols for monetary and numeric data, date, and time

Path for translated message files, application resource files, help files, or some combination of these

To view the locales installed on your system, use the command locale -a or use the Manage Locales option of the Configure International Software utility.

See l10n_intro(5) for information on the languages and codesets that the operating system supports with locales. See locale(4) for information on the contents of a locale. This section describes the two types of locales (dense code and Unicode) that Tru64 UNIX offers and how you establish a locale on the operating system.

When Worldwide Language Support is installed on your system, two types of locales are installed for localization support: Unicode locales and dense code locales. Unicode locales conform to Unicode and ISO/IEC 10646 standards and use UTF-32 as the wide-character encoding. Unicode locales whose names end in UTF-8 use file and internal processing code defined in the standards. Other, non-UTF-8 Unicode locales use traditional UNIX and proprietary codesets for the file code and use UTF-32 for internal process code. A subset of these locales have a @ucs4 modifier; they are provided for backward compatibility and are the same as the locales without @ucs4. You cannot select @ucs4 locales from the CDE login menu; you must specify the locale name in the LANG environment variable.

Dense code locales use dense code for wide-character encoding to minimize table size.

The distinction between dense code and Unicode locales is of interest to programmers and is described in the Writing Software for the International Market manual. For users of internationalized software on Tru64 UNIX, dense code locales are functionally equivalent to Unicode locales and a Unicode locale exists for each dense code locale. However, not all Unicode locales have a dense code version.

The Unicode locales are installed in /usr/i18n/lib/nls/ucsloc/. Dense code locales are installed in /usr/i18n/lib/nls/loc. The active default is determined by the symbolic link /usr/i18n/lib/nls/dloc. If you are superuser, you can switch between Unicode and dense code locales either by changing the target of the symbolic link, as described in l10n_intro(5), or by using the Configure International Software utility from the SysMan Menu. See the online help for Configure International Software for more information.
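As a rough illustration of how the symbolic link selects the default, the following Korn/POSIX shell sketch reports which type of locale the link currently points to. The function name classify_dloc is hypothetical, and the sketch assumes that the link target ends in either ucsloc or loc, matching the two installation directories named above; the actual form of the target on a given system may differ.

```sh
#!/bin/sh
# classify_dloc TARGET
# Name the locale type selected by a dloc link target (hypothetical helper).
# Assumes the target ends in "ucsloc" (Unicode) or "loc" (dense code).
classify_dloc() {
    case $1 in
        ucsloc|ucsloc/|*/ucsloc|*/ucsloc/) echo "Unicode" ;;
        loc|loc/|*/loc|*/loc/)             echo "dense code" ;;
        *)                                 echo "unknown" ;;
    esac
}

link=/usr/i18n/lib/nls/dloc
if [ -h "$link" ]; then
    # For a symbolic link, ls -l prints "name -> target"; keep the target.
    target=$(ls -l "$link" | sed 's/.* -> //')
    echo "Active default locale type: $(classify_dloc "$target")"
else
    echo "$link is not a symbolic link on this system"
fi
```

The check is read-only, so it does not require superuser privileges.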

To set a locale for system use, define the LANG environment variable as one of the installed locales. For example, under the C shell:

% setenv LANG en_US.ISO8859-1

This command sets the user environment to the values defined for United States English using the ISO8859-1 codeset. If a locale is not installed, internationalized applications assume the POSIX (C) locale, which supports only English.
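For Bourne and Korn shell users, the same setting can be combined with a check against the output of locale -a. The following is a minimal sketch, not a system-supplied script: the function name choose_locale is hypothetical, and the fallback to C simply mirrors the behavior described above for a locale that is not installed.

```sh
#!/bin/sh
# choose_locale WANTED AVAILABLE
# Print WANTED if it is one of the newline-separated names in AVAILABLE;
# otherwise print C, the POSIX locale that applications fall back to.
choose_locale() {
    if printf '%s\n' "$2" | grep -F -x -q -- "$1"; then
        printf '%s\n' "$1"
    else
        printf 'C\n'
    fi
}

# Ask the system which locales are installed, then set LANG accordingly.
installed=$(locale -a 2>/dev/null) || installed=""
LANG=$(choose_locale en_US.ISO8859-1 "$installed")
export LANG
echo "LANG is now $LANG"
```

Passing the list of installed locales as an argument, rather than calling locale -a inside the function, keeps the check easy to try out with a hand-written list.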

See the discussion of internationalization in the System Administration manual and in the Command and Shell User's Guide for more detailed information on using locales and defining the associated variables for system and user setup. See i18n_intro(5) for a discussion of locale variables such as LANG.

For graphical applications, you need to select a language to take advantage of the text translations and local language features available with the Common Desktop Environment (CDE) and other Motif applications. For Asian languages, the correct language selection is particularly important because it enables the following features:

Support for the appropriate input method in these applications

Entry of file names and other parameters that use ideographic characters

Cursor positioning on correct character and word boundaries

Line wrapping at correct word boundaries

See the CDE Companion manual for general information about setting language in CDE.

CDE assumes that all applications that run during a session operate in the language that was set at the start of the session. On Tru64 UNIX systems, you can work around this restriction with the following actions:

1. In a dtterm window, set the LANG or LC_ALL environment variable to the locale in which you want to run the new application. For example:
```
% setenv LANG ko_KR.deckorean
```

2. If the setting is for a Japanese, Chinese, or Korean locale, use the system command line to start the appropriate input method server before invoking the application. For example:
```
% /usr/bin/X11/dxhangulim &
```
See Section 2.1 for information about Asian input method servers.

3. In the same window as step 1, use the system command line to invoke the application you want to run in the new locale. For example:
```
% /usr/dt/bin/dtterm &
```

4. If you need to change your keyboard setting to work in the new locale, do so before starting to work in the new application's window. See Section 1.4 for information about setting keyboard type.

1.4 Selecting Keyboard Type

For English language input, a standard keyboard provides a sufficient number of keys (combined with shift states) to enter all uppercase and lowercase letters, numerals, and punctuation marks. For many other languages, the default keyboard does not provide enough keys and shift states to enter all characters.

Terminal users must use a localized keyboard or, if their keyboard includes a Compose key, use Compose-key sequences to enter non-English language characters from single-byte codesets. Some terminals also provide software emulation of a number of keyboard layouts for languages that are based on single-byte codesets. The user manual for each terminal explains how you can use its keyboard to enter non-English characters. Entry of multibyte characters in Asian languages requires special terminal hardware.

If the appropriate support files are installed on the system, workstation users can set the keyboard type to be appropriate for languages for which standard keyboard types exist. You must set the keyboard type for Western and Eastern European languages, Japanese, Thai, and Hebrew. However, you are not required to set the keyboard type for Chinese and Korean languages.

In CDE, use Keyboard Options (one of the desktop applications) to change your keyboard type. See the CDE Companion manual for more information about changing keyboard type.

From the system command line, use the dxkeyboard command to invoke Keyboard Options to choose a keyboard map and change the keyboard type.

Unlike the language setting, the keyboard setting is a global attribute that applies to all windows. Therefore, if you are working in windows that were created with different language settings, you may need to change the keyboard setting as you move from one window to another.

Keep in mind that no matter what setting you make using CDE applications, that setting does not change the setting that applies when you log on to the system. The keyboard setting when you log on to the system is always the system-default keyboard. See keyboard(5) for information about changing the system-default keyboard.

1.4.1 Determining Keyboard Layout

You can use the xkbprint command to obtain the keyboard layout for your current keyboard setting. For example, the following command reads the layout from display :0 and creates a PostScript file that you can print:

% /usr/bin/X11/xkbprint -label symbols -o mykeyboard.ps :0

See xkbprint(1X) for more information about the xkbprint command.

If you change your keyboard from the one whose characters are printed on the hardware keys, you need to know how characters are mapped to keys and whether any characters must be entered by using a mode-switch key or key sequence. For some languages, such as Czech, up to four different characters can be mapped to the same key. In such cases, you use the key defined as the mode switch to toggle among different sets of characters mapped to the same key.

You can use the dxkeycaps command to display and edit keyboard mappings of the keyboard attached to your workstation. The display shows the keyboard, with keycaps drawn according to the current server keymap. Using the mouse, you can bring up a menu of options, including the option to change the key symbol generated by a particular key. See dxkeycaps(1X) for more information on command options.

Mode switching is a character entry mechanism that is different from Compose sequences. A particular keyboard setting may support Compose sequences (which require one key to be defined as a multikey), mode switching (which requires at least one key to be defined as a mode-switch key), both, or neither of these input mechanisms.

1.4.2 Entering the Euro Currency Symbol

In 2002, the euro currency became the basic monetary unit in the European countries belonging to the Economic and Monetary Union (EMU).

To enter, display, and print the euro symbol, Worldwide Language Support (WLS) must be installed on your system, and you must perform the following steps:

Configure the system with supporting locales, keyboard mappings, and fonts.

Use the correct key sequences, codeset converters, and print filters.

This section describes these steps and provides examples of setting locales and selecting keyboard types.

For more information on using the euro currency symbol, see the Tru64 UNIX Best Practices Web page.

Support for the input, display, and printing of any symbol on the operating system requires a codeset that includes that symbol and a font set that can display the symbol. A keyboard mapping that associates keystrokes with the symbol is also useful, although Compose key and cut-and-paste alternatives also exist. The requirements of codeset, font set, and entry method apply whether the symbol is an English language letter, a Chinese language character, or the euro monetary symbol.

The Unicode (UTF-8) and ISO/IEC 8859-15 (Latin-9) codesets include the euro symbol. With WLS installed, the operating system provides these codesets by means of country-specific locales. The operating system also provides the keyboard mappings specific to each country and the Xfont library, which enables display of the euro symbol.

To enter and display the euro currency symbol, perform the following steps:

Run under a locale that supports the euro currency symbol. Table 1-1 lists the locales that support the euro symbol. To start one of these locales on your system, perform the following steps:
1. From the Options Menu of the CDE login screen, choose Language.
2. From the Language Options Menu, choose a locale.

Choose a keyboard map that is appropriate for the selected euro-enabled locale and for the keyboard type you are using. To choose a keyboard map, perform the following steps:
1. Enter /usr/dt/bin/dxkeyboard at the command line to display the dxkeyboard Menu.
  Alternatively, you can select Keyboard Options from the Desktop_Apps Menu of the CDE Applications Manager to display the dxkeyboard Menu.
2. From the dxkeyboard Menu, choose the keyboard map that matches the locale and keyboard type you are using. The keyboard type can usually be found on the underside of the keyboard. The reference page for the language you are using (for example, Italian(5)) describes associated keyboard types and maps for specific locales.

Enter the euro currency symbol using the key combination described for your current locale. If your keyboard supports a Compose sequence, press the Compose key followed by C and an equal sign (=) to generate the euro symbol. (The appropriate Compose sequence keys for the euro symbol are described for various locales in euro(5).)

The Configure International Software utility available from the SysMan Menu provides system managers and administrators with an alternative way to manage locales and keymaps on the system. You must be superuser to use Configure International Software.

Table 1-1 is organized by country and lists the locales that support the euro currency symbol, along with the VT-style and PC-style key combinations that generate the euro symbol. The key combinations in this table are supported by xkb-format keymaps, which are the default in CDE.

Table 1-1: Locale and Key Combination Summary

Country	Locale	Euro Symbol Input: VT-Style Keyboard Combination	Euro Symbol Input: PC-Style Keyboard Combination
Catalan (Spain)	`ca_ES.UTF-8` `ca_ES.ISO8859-15`	Left Compose/E	Right Alt/E
Chinese (PRC, Hong Kong, Taiwan)	`zh_CN.UTF-8` (simplified) `zh_HK.UTF-8` (traditional, Hong Kong) `zh_TW.UTF-8` (traditional, Taiwan)	There is no Chinese keyboard combination. Use the Qu-Wei Input Method to enter the Unicode value for euro (U+20AC), as described in the Tru64 UNIX Technical Reference for Using Chinese Features online manual.
Danish (Denmark)	`da_DK.UTF-8` `da_DK.ISO8859-15`	Left Compose/E	Right Alt/E
Dutch (Netherlands)	`nl_NL.UTF-8` `nl_NL.ISO8859-15`	Left Compose/E	Right Alt/E
Dutch/Flemish (Belgium)	`nl_BE.UTF-8` `nl_BE.ISO8859-15`	Left Compose/E	Right Alt/E
English (United Kingdom and Irish Republic)	`en_GB.UTF-8` `en_GB.ISO8859-15`	Left Compose/4	Right Alt/4
English (United States)	`en_US.UTF-8` `en_US.ISO8859-15`	Left Compose/E	Right Alt/E
Finnish (Finland)	`fi_FI.UTF-8` `fi_FI.ISO8859-15`	Left Compose/E	Right Alt/E
French (France)	`fr_FR.UTF-8` `fr_FR.ISO8859-15`	Left Compose/E	Right Alt/E
French (Belgium)	`fr_BE.UTF-8` `fr_BE.ISO8859-15`	Left Compose/E	Right Alt/E
French (Canada)	`fr_CA.UTF-8` `fr_CA.ISO8859-15`	Left Compose/E	Right Alt/E
French (Switzerland)	`fr_CH.UTF-8` `fr_CH.ISO8859-15`	Left Compose/E	Right Alt/E
German (Germany)	`de_DE.UTF-8` `de_DE.ISO8859-15`	Left Compose/E	Right Alt/E
German (Switzerland)	`de_CH.UTF-8` `de_CH.ISO8859-15`	Left Compose/E	Right Alt/E
Icelandic (Iceland)	`is_IS.UTF-8` `is_IS.ISO8859-15`	Left Compose/E	Left Compose/E
Italian (Italy)	`it_IT.UTF-8` `it_IT.ISO8859-15`	Left Compose/E	Right Alt/E
Japanese (Japan)	`ja_JP.UTF-8`	There is no Japanese Compose table or keymap support for the euro symbol. To enter the euro symbol, use the `vi` or `dtpad` editor to cut the symbol from a supporting application and paste it to the target application under the `ja_JP.UTF-8` locale.
Korean (Korea)	`ko_KR.UTF-8`	There is no Korean Compose table or keymap support for the euro symbol. To enter the euro symbol, use the `vi` or `dtpad` editor to cut the symbol from a supporting application and paste it to the target application under the `ko_KR.UTF-8` locale.
Norwegian (Norway)	`no_NO.UTF-8` `no_NO.ISO8859-15`	Left Compose/E	Right Alt/E
Portuguese (Portugal)	`pt_PT.UTF-8` `pt_PT.ISO8859-15`	None	Right Alt/E
Spanish (Spain)	`es_ES.UTF-8` `es_ES.ISO8859-15`	Left Compose/E	Right Alt/E
Swedish (Sweden)	`sv_SE.UTF-8` `sv_SE.ISO8859-15`	Left Compose/E	Right Alt/E

The Alternate function key, described as the Alt key in this table, is also described as the Gr key on some keyboards. (In both cases, the key is on the right side of the keyboard.) For more information on keyboard mappings and keyboards, see keyboard(5) for your version of the operating system (http://www.tru64unix.compaq.com/docs/).

When you install WLS and languages that support the euro symbol, you receive text and PostScript print filters that are sensitive to system locale settings and that provide fonts containing the euro currency symbol. For example, the generic PostScript print filter (wwpsof) supports UTF-8 and ISO 8859-15 formats.

No additional user action is required for euro symbol printing support.

The operating system provides two locales (en_EU.UTF-8@euro and en_US.UTF-8@euro) that specifically assign the euro symbol in the LC_MONETARY section of the locale. These locales supplement the UTF-8 and ISO8859-15 locales that define the currency symbol as the euro, as described in euro(5). Because setting LC_MONETARY overrides the environment variable LANG, you can set LANG to a locale that does not support the euro symbol and set LC_MONETARY to en_EU.UTF-8@euro to obtain euro symbol support. For more information, see euro(5). Also, see i18n_intro(5) for more information on locale-related environment variables.
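The override described above follows the standard precedence among locale variables: LC_ALL, if set, takes priority over everything; otherwise the category variable (here LC_MONETARY) is used; otherwise LANG applies. The Korn/POSIX shell sketch below makes that order explicit. The function name effective_monetary_locale is hypothetical, and the final fallback to C is an assumption based on the default behavior described in Section 1.3.

```sh
#!/bin/sh
# effective_monetary_locale
# Print the locale name that governs monetary formatting, using the
# precedence LC_ALL, then LC_MONETARY, then LANG. Prints C if none is set
# (assumed fallback, matching the POSIX locale default).
effective_monetary_locale() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_MONETARY:-}" ]; then
        printf '%s\n' "$LC_MONETARY"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        printf 'C\n'
    fi
}

# LANG names a locale whose codeset lacks the euro symbol;
# LC_MONETARY supplies euro support for monetary data only.
unset LC_ALL
LANG=en_US.ISO8859-1
LC_MONETARY=en_EU.UTF-8@euro
export LANG LC_MONETARY

echo "Monetary formatting uses: $(effective_monetary_locale)"
```

Note that setting LC_ALL would defeat this arrangement, because LC_ALL overrides LC_MONETARY as well as LANG.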

1.5 Defining the Search Path for Specialized Components

European languages are supported by data and executable files installed at system-default locations. Asian language support for some commands and programming libraries requires files that are subordinate to the /usr/i18n directory. These files supplement or replace files in system-default locations.

When you install one or more of the Asian language subsets, the installation procedure makes the following adjustments to variable settings on a system-wide basis:

I18NPATH
The I18NPATH variable defines the location of files that provide Asian language support and that are not in system default locations. This variable is set to the following:
/usr/i18n
Your system administrator can choose to install files for Asian language support at a location different from /usr/i18n; however, the /usr/i18n directory must contain a link to the other location.

PATH
The PATH variable points to the location of commands and is set to the following:
$I18NPATH/usr/bin:$PATH

The /etc/i18n_profile file includes the PATH and I18NPATH variable assignments on a system-wide basis for Bourne and Korn shell users. For C shell users, the installation process includes the /etc/i18n_login file and the /etc/csh.login file to correctly set search paths for Hebrew and Asian languages. Unless specifically noted in descriptions of particular commands or utilities, individual users do not need to change process-specific search paths to find localized binaries and utilities.
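For reference, the two assignments amount to something like the following Bourne/Korn shell fragment. This is a sketch of the effect only: the actual contents of /etc/i18n_profile are not reproduced here, and the guard against adding the directory twice is an illustrative addition rather than documented behavior.

```sh
#!/bin/sh
# Root of the Asian language support files (the default named above).
I18NPATH=/usr/i18n

# Prepend the localized command directory unless it is already present,
# so that reading this fragment twice does not duplicate the entry.
case ":$PATH:" in
    *":$I18NPATH/usr/bin:"*) ;;
    *) PATH=$I18NPATH/usr/bin:$PATH ;;
esac
export I18NPATH PATH
```

Because the localized directory comes first in PATH, commands found there take precedence over the versions in system-default locations.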

1.6 Supporting User-Defined Characters

The national character sets for Japan, Taiwan, and China do not include some of the characters that can appear in Asian place and personal names. Such characters are defined by users and reside in site-specific databases. These databases are called user-defined character (UDC) databases. When you define ideographic characters, you must also define font glyphs, collating files, and other support files for the characters.

The Writing Software for the International Market manual provides details on how you set up and use UDC databases and how to edit cp_dirs, the UDC database configuration file.

1.7 Using Printer Interface Features That Support Local Languages

When you install the operating system and include language variant subsets, your printing subsystem is enhanced with the following features:

Two generic internationalized print filters, pcfof and wwpsof, that work with HP and third-party printers

A set of print filters that support escape sequences used by local language printers

Entries in the /etc/printcap file to support printer code conversion and on-demand loading of font files

An lprsetup command that lets you add entries for local language printers to the /etc/printcap file

lp, lpr, lpc, lpq, lprm, and lpstat commands that support additional options for printing and printer control

PostScript outline or TrueType fonts that can be used by the wwpsof filter and other software

The following sections discuss these features.

1.7.1 Generic Internationalized Print Filters

The pcfof and wwpsof print filters enable use of HP printers, particularly those for which no other printer-specific solution is described in this chapter. You also need to use these filters if your printer is from another vendor. Both of these filters rely on a printer customization file (.pcf file) to supply certain device-specific information. Operating system software includes a basic set of .pcf files. System administrators can add more .pcf files to describe the capabilities of additional printers used at your site.

1.7.1.1 pcfof Print Filter

The pcfof filter handles both PostScript printers and text printers, such as the HP PCL printer. For PostScript files, the filter requires that the appropriate local language PostScript fonts be available on the printer. This restriction limits the filter's usefulness on many printers, particularly for printing PostScript files that require Japanese fonts. This filter can be set up to do codeset conversion when the printer locale differs from the one required for a text file print job. The filter also has .pcf files that are appropriate to use for a number of third-party text printers. See pcfof(8) and the System Administration manual for details on using this print filter.

1.7.1.2 wwpsof Print Filter

The wwpsof filter is used only with PostScript printers. The filter converts the single-byte and multibyte characters used in an international environment to printable PostScript output. Thus, print jobs that include local language characters can be printed on printers where local language fonts are not resident. To use this filter, the printer must support PostScript Level 2 (or higher) or PostScript Level 1 with the composite font extension.

The PostScript fonts can be outline fonts installed on the system, TrueType fonts, or low-resolution bitmap fonts. TrueType fonts and low-resolution bitmap fonts are made available to the filter through an X font server, which requires that the X font server be running. In searching for fonts, the filter first attempts to use PostScript outline fonts. If those are not available, the filter uses high-resolution, rasterized, TrueType fonts. If those are not available, the filter uses low-resolution bitmap fonts.

The wwpsof filter is sensitive to the locale setting. When processing a character, the filter determines whether the character is printable in the current locale and uses the codeset part of the locale definition to find an appropriate font (outline, TrueType, or low-resolution). Except for file formats that include a byte order mark (that is, UTF-16 or UTF-32 format), you must set the locale appropriately before printing files that contain characters in languages other than English.

For example, you can set up a printer configuration file for use with the wwpsof print filter to convert bitmap fonts for other locales to PostScript when printing files that use UTF-8 encoding. Unicode includes characters for almost all languages, and any given font is limited to a small subset of supported characters. Therefore, you can customize a unicode conversion preference entry in the printer configuration file to specify a codeset look-up order favoring the fonts that are available for the language of the text most frequently printed.

The wwpsof filter prints multilanguage text files by first converting each character in the text file to a matching character in a UNIX codeset for which fonts are available and then converting the file to PostScript. The filter can also print PostScript files that have been generated by some CDE applications.

See wwpsof(8) and the System Administration manual for details on using this print filter.

1.7.2 Print Filters for Specific Local Language Printers

A print filter processes text data for a particular model of printer. The filter handles the device dependencies of the printer and performs device accounting functions. When each print job is complete, the print filter writes an accounting record to the file specified by the af field of the printer's entry in the /etc/printcap file.

The print filters for local language text printers can handle text files that contain ASCII and local language characters, or output files created by the nroff command. When processing nroff output, the filter removes multibyte characters that extend beyond the page boundary and translates nroff control sequences for underlining, superscripting, and subscripting to control sequences appropriate for the printer. However, the filter does not support multiple nroff control sequences on the same character.

The PostScript print filters can print PostScript files in addition to text and nroff output files.

A local language print filter can be the specified filter in both the of and if fields in the /etc/printcap file. For general information on /etc/printcap entries, see the System Administration manual and printcap(4). Supplementary information is provided in i18n_printing(5). A reference page for a specific language (for example, Japanese(5)) lists the names of print filters that support printing characters in that language.

The following print filters process text data for Asian languages:

Language	Filter	Printer
Japanese	`la84of`	LA84-J
Japanese	`la86of`	LA86-J
Japanese	`la90of`	LA90-J
Japanese	`la280of`	LA280-J
Japanese	`la380of`	LA380-J
Japanese	`ln03jaof`	LN03-J
Japanese	`ln05jaof`	LN05-J
Japanese	`ln82rof` (processes both PostScript and text data)	LN82R
Simplified Chinese	`la88cof`	LA88-C
Simplified Chinese	`la380cbof`	LA380-CB
Korean	`la380kof`	LA380-K
Korean	`dl510kaof`	DL510-KA
Traditional Chinese	`cp382dof`	CP382-D
Thai	`thailpof`	EP1050+

1.7.3 Support for Local Language Printers in /etc/printcap

The /etc/printcap file describes characteristics of each printer on the system. Printer characteristics are specified by symbol/value pairs, where each symbol is a 2-character mnemonic. Each time you submit a print job, the lpd printer daemon and printer spooling system use information in the /etc/printcap file to determine how that job is handled.

Table 1-2 describes /etc/printcap symbols that are specific to local language printer support. See printcap(4) for descriptions of other symbols used in the /etc/printcap file. See Section 1.7.4 for an example of using the lprsetup command to add several of these options to the /etc/printcap file for a local language printer.

Table 1-2: Symbols in /etc/printcap File for Local Language Printers

Symbol	Type	Default	Description
`ya`	`str`	None	Double-quoted list of keyword value assignments. This assignment list specifies most of the printer options related to country-specific support. The option keywords, which are explained following this table, include `flocale`, `font`, `line`, `odldb`, `odlstyle`, `onehalf`, `plocale`, `spcom`, `tacdata`, and `tm`.
`yp`	`str`	`NULL`	Printer ID that conforms to the WoToTo Standard (for Thai printers)
`ys`	`num`	`NULL`	Size of the SoftODL character cache. The `ys` entry is applied to text print filters. It must be present and its value must be greater than zero to enable on-demand loading of font files. These font files are the ODL support files created by the `cgen` utility for user-defined characters. The location of the SoftODL support files is identified by the path for system-wide ODL files in the database location configuration file `/usr/var/i18n/conf/cp_dirs`. ODL files for private UDC databases are not downloaded to printers. For optimal performance, the cache value specified for the `ys` field should match the printer cache size. To find out the cache size for a particular printer, see the printer's manual.
`yt`	`str`	`fifo`	The SoftODL character replacement method. The `yt` entry applies to text print filters. The value for this entry can be either `fifo` (first-in/first-out) or `lru` (least recently used). You can type either uppercase or lowercase letters for these values. To find out which value is appropriate for a particular printer, see the printer's manual.

The ya symbol is defined for printing languages whose characters are not included in the Latin-1 character set. The value assigned to the ya symbol is a quoted string that can include one or more of the following keywords:

flocale=locale_name
Specifies the locale for interpretation of file text. The print filter uses this locale to validate characters in the text. For an Asian language that is supported by more than one codeset, a difference between the flocale and plocale values determines whether codeset conversion is done before the file is printed. If flocale is not specified, the filter interprets the file in the current locale.

font=font_name
Specifies the name of the outline font for printing PostScript files. This font must be appropriate for the specified plocale value.

line=number_of_lines
Specifies the number of lines for each page. When used in combination with the -w flag of the lpr command, the line value can control the font size and orientation of printed output.

odldb=odl_database_path
Specifies the pathname of the SoftODL database. By default, the printer uses the system-wide database as specified in the cp_dirs file.

odlstyle=style-NxN
Specifies the SoftODL font style and size to use, for example, normal-24x24. If odlstyle is not specified, the default style and size set for the system-wide database is used.

onehalf
For the Thai language, specifies that characters be printed on one-and-one-half lines, rather than three lines, to produce more compressed and natural-looking output. The onehalf keyword is valid only for the thailpof print filter.

plocale=locale_name
Specifies the printer locale. Print filters for some printers, such as the LA380-CB printer, are country specific because the printer has built-in fonts that are encoded in a particular codeset. For these printers, you must specify plocale, and the codeset part of locale_name must match the codeset of the built-in fonts. Other printers are generic and suitable for printing files in a variety of languages. For these printers, you can use a generic print filter (such as wwpsof) and do not need to specify a plocale value unless you want to use a font that is not the default for the language being printed.

spcom
Enables space-compensation mode for languages, such as Thai, that contain nonspacing characters. These characters can combine with other characters for display and therefore do not occupy space. Many of the existing tools that align text do not handle nonspacing characters correctly. If you want to print the Thai output that these tools generate, you should specify the spcom keyword to ensure proper text alignment in the printed file. This keyword is valid only when used with a Thai print filter or with the th_TH.TACTIS plocale value.

tacdata=tac_data_path
Specifies the location of the character code tables used with the thailpof print filter. By default, tac_data_path is /usr/lbin/tac_data.

tm
Enables text morphing for printing Thai characters. Text morphing replaces some characters with others to produce better printed output. See Thai(5) for information on text morphing.
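
The relationship between flocale and plocale can be illustrated with a short shell sketch. It is not the print filter's implementation; it only assumes, as a simplification, that conversion is needed when the codeset parts of the two locale names differ. The function names codeset_of and needs_conversion are hypothetical.

```shell
#!/bin/sh
# Print the codeset part of a locale name: the text after the first dot,
# with any @modifier removed. Prints an empty line if there is no codeset part.
codeset_of() {
    case "$1" in
        *.*) printf '%s\n' "${1#*.}" | sed 's/@.*$//' ;;
        *)   printf '\n' ;;
    esac
}

# Succeed (exit status 0) when the file locale and the printer locale
# name different codesets. This comparison rule is an assumption.
needs_conversion() {
    [ "$(codeset_of "$1")" != "$(codeset_of "$2")" ]
}

# Example: file text in Japanese EUC, printer fonts encoded in Shift JIS.
if needs_conversion ja_JP.eucJP ja_JP.SJIS; then
    echo "codeset conversion would be needed before printing"
fi
```

When both locale names carry the same codeset, needs_conversion fails and no message is printed, which corresponds to the case in which the file can be sent to the printer without conversion.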

1.7.4 Enhancements to Printer Configuration Software

The Printing selection on the SysMan Menu is the desktop application that helps you add, delete, or change the characteristics of the printers on your system. The lprsetup utility is an alternative way to do these operations if your system is not running CDE. In both cases, the software performs necessary tasks, such as creating the printer spooling directory, linking the appropriate filter to the printer, and writing the entry for the printer in the /etc/printcap file. You must be superuser to run lprsetup. See lprsetup.dat(4) for information about mapping the product names of supported printers to their system identifiers. See the System Administration manual for detailed information and examples for printer setup.

Example 1-1 demonstrates how you use the lprsetup command to set up a local language printer, in this case ln03s-ja.

Example 1-1: Setting Up a Local Language Printer with lprsetup

# /usr/sbin/lprsetup   [1]
Tru64 UNIX Printer Setup Program
 
Command < add modify delete exit view quit help >: add
 
Adding printer entry, type '?' for help.
 
Enter printer name to add [lp11] : [2]
 
Printer Types:
 
  1. Compaq Advanced Server ClientPS
  2. Compaq Advanced Server ClientText
  3. Compaq LN16
  4. Compaq LN32
  5. Digital Colormate PS
  6. Digital DEClaser 1100
  7. Digital DEClaser 1150
  8. Digital DEClaser 2100
  9. Digital DEClaser 2150
 10. Digital DEClaser 2200
 11. Digital DEClaser 2250
 12. Digital DEClaser 2300
 13. Digital DEClaser 2400
 14. Digital DEClaser 3200
 15. Digital DEClaser 3250
 16. Digital DEClaser 3500
 17. Digital DEClaser 5100
 18. Digital LA100
 19. Digital LA120
 20. Digital LA210
 21. Digital LA280
 22. Digital LA30N
 23. Digital LA30N A4
 24. Digital LA30W
 25. Digital LA30W A4
 26. Digital LA324
 27. Digital LA380
 28. Digital LA380CB
 29. Digital LA380K
 30. Digital LA400
 31. Digital LA424
 32. Digital LA50
 33. Digital LA600
 34. Digital LA70
 35. Digital LA75
 36. Digital LA84
 37. Digital LA86
 38. Digital LA88
 39. Digital LA88C
 40. Digital LA90
 41. Digital LG02
 42. Digital LG04 Plus
 43. Digital LG05 Plus
 
Press 'ENTER' to continue scrolling, type '(q)uit' to end scrolling: q 
 
Help Types:
 
  ?         - General help
  printer?  - Specific printer type information
 
Enter index number, help type, '(q)uit', or 'ENTER' [Generic Unknown type]:ln03ja  [3]

.
.
.
Enter printer synonym: draft  [4]
 
Enter printer synonym: 
 
Set device pathname 'lp' [] ?  /foo
 
Do you want to capture print job accounting data ([y]|n)? n 
 
Set spooler directory 'sd' [/var/spool/printers/lpd11] ?  
 
Set printer error log file 'lf' [/var/adm/printers/lp11.lperr] ?  
 
Enter the name of the printcap symbol you wish to modify.  Other        
valid entries are:                                                      
        'q'     to quit (no more changes)                               
        'p'     to print the symbols you have specified so far          
        'l'     to list all of the possible symbols and defaults        
The names of the printcap symbols are:                                  
 
 af  br  cf  ct  df  dn  du  fc  ff  fo  fs  gf  if  lf  lo  lp 
 mc  mj  mx  nc  nf  of  on  pl  pp  ps  pw  px  py  rf  rm  rp 
 rs  rw  sb  sc  sd  sf  sh  st  tf  tr  vf  xc  xf  xn  xs  ya 
 yd  yj  yp  ys  yt 
 
Enter symbol name: yt    [5]
 
Enter a new value for symbol 'yt'?  [none] [Return]
 
Enter symbol name: ?
 
Enter the name of the printcap symbol you wish to modify.  Other        
valid entries are:                                                      
        'q'     to quit (no more changes)                               
        'p'     to print the symbols you have specified so far          
        'l'     to list all of the possible symbols and defaults        
The names of the printcap symbols are:                                  
 
 af  br  cf  ct  df  dn  du  fc  ff  fo  fs  gf  if  lf  lo  lp 
 mc  mj  mx  nc  nf  of  on  pl  pp  ps  pw  px  py  rf  rm  rp 
 rs  rw  sb  sc  sd  sf  sh  st  tf  tr  vf  xc  xf  xn  xs  ya 
 yd  yj  yp  ys  yt 
 
Enter symbol name: q    [6]
        Printer #11 
        -----------
Symbol  type  value
------  ----  -----
  if    STR    /usr/lbin/ppdof +OPageSize=Letter +Ctektronix740.rpd
  lf    STR    /var/adm/printers/lp11.lperr
  lp    STR    /foo
  mx    INT    0
  of    STR    /usr/lbin/ppdof +OPageSize=Letter +Ctektronix740.rpd
  pl    INT    66
  pw    INT    0
  rw    BOOL   on
  sd    STR    /var/spool/printers/lpd11
  xf    STR    /usr/lbin/xf
 
Are these the final values for printer 11 ? [y] [Return] 
 
 
Adding comments to printcap file for new printer, type '?' for help.
Do you want to add comments to the printcap file [n] ? : [Return] 
 
Setup activity is complete for this printer.
Verify that the printer works properly by using
the lpr(1) command to send files to the printer.
 
 
Command  < add modify delete exit view quit help >: e [7]

[1] Invokes the lprsetup program.

[2] Displays the available printer types (see the reference pages for specific languages for information on the local language printers supported by the lprsetup command).

[3] Enters the printer type. To obtain a description of individual printers, enter printer?.

[4] The utility displays a series of prompts that allow you to specify a synonym for the printer name, device path, whether accounting data will be maintained, and selection of a spooler directory and error log. Enter a question mark to obtain help on any of the prompts.

[5] Prompts you to enter a printcap symbol. See Table 1-2 for the symbols and parameters of importance to internationalized systems. For example, ys sets the cache size that the SoftODL service uses. By default, this value is the appropriate cache size for the printer and is stored as the value of the ys symbol in the /etc/printcap file.

[6] Quits the lprsetup dialogue. The utility displays the values assigned and prompts for verification. Once verified, you are prompted to add comments to the /etc/printcap file.

[7] Quits the program to indicate no more changes are needed to the /etc/printcap file.

1.7.5 Print Commands and the Printer Daemon

The lp, lpc, lpd, lpq, lpr, lprm, and lpstat commands handle the features added to the print subsystem for Asian and other languages not in the Latin-1 group. For example, the lpr command includes the -A option and additional values for the -O option to give you access to such features. See lpr(1) for details about local language options and values.

1.7.6 Choosing PostScript Fonts for Different Locales

The fonts for the Chinese and Korean languages do not fit in the memory of most PostScript printers. Fonts for the Thai language and some European languages do fit in memory, but are large enough that they do not fit in printer memory along with fonts for other languages.

For PostScript printers in which language-specific fonts are not printer resident, the wwpsof print filter (see Section 1.7.1.2) provides a solution. In this case, you specify in a printer's configuration file the names of the fonts you want to use for different languages. The wwpsof print filter can also create PostScript output from bitmap fonts when PostScript fonts are not available for a particular codeset. See wwpsof(8) for more information about using this print filter.

The following list associates languages and codesets with the appropriate set of PostScript fonts:

Western European, Latin-1 (*.ISO8859-1)
PostScript fonts for Latin-1 languages are printer resident; they are not installed from software subsets.

Hungarian, Czech, Slovak, Slovene (*.ISO8859-2)

Arial-Bold-ISOLatin2
Arial-BoldItalic-ISOLatin2
Arial-Italic-ISOLatin2
Arial-ISOLatin2
ArialNarrow-Bold-ISOLatin2
ArialNarrow-BoldItalic-ISOLatin2
ArialNarrow-Italic-ISOLatin2
ArialNarrow-ISOLatin2
BookAntiqua-Bold-ISOLatin2
BookAntiqua-BoldItalic-ISOLatin2
BookAntiqua-Italic-ISOLatin2
BookAntiqua-ISOLatin2
BookmanOldStyle-Bold-ISOLatin2
BookmanOldStyle-BoldItalic-ISOLatin2
BookmanOldStyle-Italic-ISOLatin2
BookmanOldStyle-ISOLatin2
CenturyGothic-Bold-ISOLatin2
CenturyGothic-BoldItalic-ISOLatin2
CenturyGothic-Italic-ISOLatin2
CenturyGothic-ISOLatin2
CenturySchoolbook-Bold-ISOLatin2
CenturySchoolbook-BoldItalic-ISOLatin2
CenturySchoolbook-Italic-ISOLatin2
CenturySchoolbook-ISOLatin2
Courier-Bold-ISOLatin2
Courier-BoldItalic-ISOLatin2
Courier-Italic-ISOLatin2
Courier-ISOLatin2
MonotypeCorsiva-ISOLatin2
TimesNewRoman-Bold-ISOLatin2
TimesNewRoman-BoldItalic-ISOLatin2
TimesNewRoman-Italic-ISOLatin2
TimesNewRoman-ISOLatin2

Russian (*.ISO8859-5)

Arial-Bold-ISOLatinCyrillic
Arial-BoldInclined-ISOLatinCyrillic
Arial-Inclined-ISOLatinCyrillic
Arial-ISOLatinCyrillic
Courier-Bold-ISOLatinCyrillic
Courier-BoldInclined-ISOLatinCyrillic
Courier-Inclined-ISOLatinCyrillic
Courier-ISOLatinCyrillic
Nimrod-Bold-ISOLatinCyrillic
Nimrod-BoldInclined-ISOLatinCyrillic
Nimrod-Inclined-ISOLatinCyrillic
Nimrod-ISOLatinCyrillic
Plantin-Bold-ISOLatinCyrillic
Plantin-BoldInclined-ISOLatinCyrillic
Plantin-Inclined-ISOLatinCyrillic
Plantin-ISOLatinCyrillic
TimesNewRoman-Bold-ISOLatinCyrillic
TimesNewRoman-BoldInclined-ISOLatinCyrillic
TimesNewRoman-Inclined-ISOLatinCyrillic
TimesNewRoman-ISOLatinCyrillic

Greek (*.ISO8859-7)

Arial-Bold-ISOLatinGreek
Arial-BoldInclined-ISOLatinGreek
Arial-Inclined-ISOLatinGreek
Arial-ISOLatinGreek
Courier-Bold-ISOLatinGreek
Courier-BoldInclined-ISOLatinGreek
Courier-Inclined-ISOLatinGreek
Courier-ISOLatinGreek
TimesNewRoman-Bold-ISOLatinGreek
TimesNewRoman-BoldInclined-ISOLatinGreek
TimesNewRoman-Inclined-ISOLatinGreek
TimesNewRoman-ISOLatinGreek

Hebrew (*.ISO8859-8)

David-Bold-ISOLatinHebrew
David-BoldOblique-ISOLatinHebrew
David-ISOLatinHebrew
David-Oblique-ISOLatinHebrew
FrankRuhl-Bold-ISOLatinHebrew
FrankRuhl-BoldOblique-ISOLatinHebrew
FrankRuhl-ISOLatinHebrew
FrankRuhl-Oblique-ISOLatinHebrew
Miriam-Bold-ISOLatinHebrew
Miriam-BoldOblique-ISOLatinHebrew
Miriam-ISOLatinHebrew
Miriam-Oblique-ISOLatinHebrew
MiriamFixed-Bold-ISOLatinHebrew
MiriamFixed-BoldOblique-ISOLatinHebrew
MiriamFixed-ISOLatinHebrew
MiriamFixed-Oblique-ISOLatinHebrew
NarkissTam-Bold-ISOLatinHebrew
NarkissTam-BoldOblique-ISOLatinHebrew
NarkissTam-ISOLatinHebrew
NarkissTam-Oblique-ISOLatinHebrew

Turkish (*.ISO8859-9)

Arial-Bold-ISOLatin5
Arial-BoldItalic-ISOLatin5
Arial-Italic-ISOLatin5
Arial-ISOLatin5
ArialNarrow-Bold-ISOLatin5
ArialNarrow-BoldItalic-ISOLatin5
ArialNarrow-Italic-ISOLatin5
ArialNarrow-ISOLatin5
BookAntiqua-Bold-ISOLatin5
BookAntiqua-BoldItalic-ISOLatin5
BookAntiqua-Italic-ISOLatin5
BookAntiqua-ISOLatin5
BookmanOldStyle-Bold-ISOLatin5
BookmanOldStyle-BoldItalic-ISOLatin5
BookmanOldStyle-Italic-ISOLatin5
BookmanOldStyle-ISOLatin5
CenturyGothic-Bold-ISOLatin5
CenturyGothic-BoldItalic-ISOLatin5
CenturyGothic-Italic-ISOLatin5
CenturyGothic-ISOLatin5
CenturySchoolbook-Bold-ISOLatin5
CenturySchoolbook-BoldItalic-ISOLatin5
CenturySchoolbook-Italic-ISOLatin5
CenturySchoolbook-ISOLatin5
Courier-Bold-ISOLatin5
Courier-BoldItalic-ISOLatin5
Courier-Italic-ISOLatin5
Courier-ISOLatin5
MonotypeCorsiva-ISOLatin5
TimesNewRoman-Bold-ISOLatin5
TimesNewRoman-BoldItalic-ISOLatin5
TimesNewRoman-Italic-ISOLatin5
TimesNewRoman-ISOLatin5

Latin-9 (*.ISO8859-15)
No PostScript fonts are supplied for locales using Latin-9 encoding. However, a default printer configuration file (PCF) is available for Latin-9. When specified for the wwpsof print filter, this file enables automatic conversion of Latin-9 bitmap fonts to PostScript. See wwpsof(8) for more information.

Unicode (*.UTF-8)
The X locale database file used by applications running in the universal.UTF-8, en_US.UTF-8, or Asian locales (Chinese, Japanese, Korean) contains font definitions that include all the fonts used with the operating system. This enables applications under en_US.UTF-8 to display all the font characters installed with Worldwide Language Support (WLS). Applications under the Asian locales display all the font characters installed with WLS, except for ISO8859-2, -4, -5, -7, -8, -9, and TACTIS.
See wwpsof(8) and Unicode(5) for more information.

Traditional Chinese (*.dechanyu)
The operating system provides the following traditional Chinese outline fonts for printing on PostScript printers and for display through the Level II Display PostScript extension. For information on use of these fonts with PostScript printers, Display PostScript, or display through a rasterizer, see the Technical Reference for Using Chinese Features online manual.
```
Sung-Light-CNS11643
Hei-Light-CNS11643
```

Simplified Chinese (*.dechanzi)
The operating system provides the following simplified Chinese outline fonts for printing on PostScript printers and for display through the Level II Display PostScript extension. For information on use of these fonts with PostScript printers, Display PostScript, or display through a rasterizer, see the Technical Reference for Using Chinese Features online manual.
```
XiSong-GB2312-80
Hei-GB2312-80
```

Korean (*.deckorean)
```
Munjo
```

Japanese (*.deckanji)
PostScript fonts for the Japanese language are normally printer resident; they are not installed from software subsets. However, you can set up a printer configuration file (PCF) with the wwpsof print filter to convert Japanese bitmap fonts to PostScript for printing files that use Japanese encoding. See wwpsof(8) for more information.

Thai (*.TACTIS)

AngsanaUPC-Bold
AngsanaUPC-BoldItalic
AngsanaUPC-Italic
AngsanaUPC-Light
CordiaUPC-Bold
CordiaUPC-BoldItalic
CordiaUPC-Italic
CordiaUPC-Light
EucrosiaUPC-Bold
EucrosiaUPC-BoldItalic
EucrosiaUPC-Italic
EucrosiaUPC-Light
FreesiaUPC-Bold
FreesiaUPC-BoldItalic
FreesiaUPC-Italic
FreesiaUPC-Light
IrisUPC-Bold
IrisUPC-BoldItalic
IrisUPC-Italic
IrisUPC-Light
JasmineUPC-Bold
JasmineUPC-BoldItalic
JasmineUPC-Italic
JasmineUPC-Light
KodchiangUPC-Bold
KodchiangUPC-BoldItalic
KodchiangUPC-Italic
KodchiangUPC-Light
LilyUPC-Bold
LilyUPC-BoldItalic
LilyUPC-Italic
LilyUPC-Light
WaterlilyUPC-Bold
WaterlilyUPC-BoldItalic
WaterlilyUPC-Italic
WaterlilyUPC-Light
YuccaUPC-Bold
YuccaUPC-BoldItalic
YuccaUPC-Italic
YuccaUPC-Light

1.8 Using Mail in a Multilanguage Environment

The operating system provides enhanced versions of the following commands and utilities to handle languages based on multibyte-character codesets:

sendmail

mailx

MH (mail handler)

The following sections discuss enhancements to these components and codeset conversion done by the comsat server. See sendmail(8), mailx(1), mh(1), and comsat(8) for more complete software descriptions.

1.8.1 The sendmail Utility

The sendmail utility, which is a back end to several user commands, is configured by default to support 8-bit data. The configuration that supports 8-bit data is required for multibyte character support. See sendmail(8) for restrictions that apply to the 8-bit configuration.

1.8.2 The mailx Command and MH Commands

The mailx command and all applicable commands in the MH system support the conversion of mail messages between the mail interchange codeset (used to transfer messages to some hosts) and a user's application codeset. For example, if the mail interchange codeset is ISO-2022-JP and the application codeset is eucJP, the mailx or MH command converts incoming messages to the Japanese EUC codeset before displaying them.

To prevent data loss, when incoming messages are stored in mail folders, the messages are encoded in the codeset in which they are received. Codeset conversion takes place when you extract or display the messages.

To communicate mail interchange code information to other systems, outgoing messages include two additional header lines like the following:

Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=ISO-2022-JP

The charset field in the preceding example specifies the mail interchange codeset, in this case, ISO-2022-JP. This codeset is an ISO 7-bit state-dependent codeset for Japanese characters. Codesets other than those that are part of the ISO standard are identified by the prefix X- in the codeset name. For example, when DEC Hanyu is the codeset used for mail interchange, the following header lines are included in outgoing mail messages:

Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=X-dechanyu
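
As an illustration only, the following shell sketch shows how a script might read the charset value from header lines of this form and check for the X- prefix. It is not how the mailx or MH commands are implemented, and the function names charset_of and is_private_codeset are hypothetical.

```shell
#!/bin/sh
# Print the charset value from the first Content-Type header line on stdin.
charset_of() {
    sed -n 's/^[Cc]ontent-[Tt]ype:.*[Cc]harset=\([^;[:space:]]*\).*$/\1/p' |
        head -n 1
}

# Succeed when a codeset name carries the X- prefix used for codesets
# that are not part of the ISO standard.
is_private_codeset() {
    case "$1" in
        X-*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Sample header lines taken from the example above.
charset=$(printf 'Mime-Version: 1.0\nContent-Type: TEXT/PLAIN; charset=X-dechanyu\n' | charset_of)
if is_private_codeset "$charset"; then
    echo "$charset is not an ISO standard codeset"
fi
```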

The mailx command and MH commands determine the application codeset and set the mail interchange codeset for incoming and outgoing messages based on certain values. The following lists describe, in priority order of highest to lowest, the values these commands use.

The application codeset is determined from one of the following:
1. The setting of the LANG environment variable
2. The value of the lang component in the $HOME/.mailrc file (for the mailx command) or the $HOME/.mh_profile file (for MH commands)

The mail interchange codeset applied to incoming messages is determined from one of the following:
1. The charset field in the mail header, if additional header lines are present in the message
2. The codeset specified as the system-wide mail interchange default in the /usr/lib/mail-codesets file
  If you create this file, make sure it contains the name of a locale as the only entry.
If neither of the preceding values is available, codeset conversion does not occur.

The mail interchange codeset applied to outgoing messages is determined from one of the following:
1. The setting of the EXCODE environment variable
2. The setting of the excode component as defined in the $HOME/.mailrc file (for mailx users) or the $HOME/.mh_profile file (for users of MH commands)
3. The content of the /usr/lib/mail-codesets file
If a codeset is not determined for outgoing mail interchange, the mail is sent with no codeset identifier.
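
The priority order for outgoing messages can be sketched in shell as a "first value that is set wins" lookup. This is a minimal illustration under stated assumptions, not the logic of the mail commands themselves: the helper name first_set is hypothetical, and the sketch takes the excode component and the system-wide default as plain variables rather than parsing $HOME/.mailrc, $HOME/.mh_profile, or /usr/lib/mail-codesets, whose exact syntax is not described here.

```shell
#!/bin/sh
# Print the first non-empty argument; fail if every argument is empty.
first_set() {
    for candidate in "$@"; do
        if [ -n "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Hypothetical inputs, highest priority first:
#   1. the EXCODE environment variable
#   2. the excode component from the user's mail startup file
#   3. the system-wide default from /usr/lib/mail-codesets
rc_excode=""
system_default="ISO-2022-JP"

if outgoing=$(first_set "${EXCODE:-}" "$rc_excode" "$system_default"); then
    echo "mail interchange codeset: $outgoing"
else
    echo "mail is sent with no codeset identifier"
fi
```

The same pattern applies to the other lists in this section: each source is consulted only when every higher-priority source is unset.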

1.8.3 The comsat Server

The comsat server, which notifies you of incoming mail messages, always attempts to convert incoming mail messages from the mail interchange codeset to your application codeset. The following lists describe, in priority order of highest to lowest, the values that the comsat server uses to determine the codesets of the mail interchange and your application.

The mail interchange codeset value is determined from one of the following:
1. The charset field, if included in the mail message header
2. The codeset specified as the system-wide mail interchange default in the /usr/lib/mail-codesets file
If neither of the preceding values is available, codeset conversion does not occur.

The application codeset value is determined from one of the following:
1. The application codeset defined for the atty driver of the user's system
2. The codeset name in the $HOME/.codeset_device_name file, where device_name is the name of the terminal device for the current session

1.9 Displaying Reference Pages in Languages Other Than English

As with the operating system, internationalized applications frequently supply online reference pages (manpages) to document the application and its components. The operating system includes enhanced versions of the nroff, tbl, and man commands to support this requirement.

The nroff and tbl commands are tools used primarily by programmers to create reference pages. These commands are described in the Writing Software for the International Market manual.

The man command formats and displays the reference page and can handle multibyte characters in reference page files. By default, the man command automatically searches for reference pages in the /usr/share/locale_name/man directory before searching the /usr/share/man and /usr/local/man directories. Therefore, if the LANG environment variable is set to an installed locale and if reference page translations are available for that locale, the man command automatically displays reference pages in the appropriate language.

In addition, the man command automatically applies codeset conversion (assuming the availability of appropriate converters) when reference page translations for a particular language are encoded in a codeset that does not match the codeset of the user's locale. See man(1) for information about redefining the man command search path and for more details about codeset conversion.
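
The default search order described above can be pictured with a small shell sketch. It only prints the directories in the order the text gives them; it does not reproduce the man command's actual lookup, and the function name man_search_dirs is hypothetical.

```shell
#!/bin/sh
# Print, one per line and in search order, the default reference page
# directories for the given locale name. With an empty locale name,
# only the two language-independent directories are listed.
man_search_dirs() {
    locale_name=$1
    if [ -n "$locale_name" ]; then
        printf '%s\n' "/usr/share/$locale_name/man"
    fi
    printf '%s\n' /usr/share/man /usr/local/man
}

# Show the order for the current LANG setting, if any.
man_search_dirs "${LANG:-}"
```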

1.10 Converting Data Files from One Codeset to Another

Each locale is based on a specific codeset. Therefore, when an application runs in a locale based on one codeset and uses a file whose data is encoded in another codeset, characters may be misinterpreted. You may need to set the process environment to a particular locale and use a data file created with a codeset different from the one on which the locale is based. The data file in question might be appropriate for a given language and in a codeset different from your locale for one of the following reasons:

The data file might have been created on another vendor's system by using a locale based on a vendor-specific codeset. For example, the integration of PCs into the enterprise computing environment increases the likelihood that UNIX users need to process files for which the data encoding is in MS-DOS code page format.

The locale could be one of several UNIX locales that support the same Asian language, such as Japanese. Asian languages are typically supported by a variety of locales, each based on a different codeset.

The data file could be in Unicode transformation format (UTF-8, UTF-16, or UTF-32). If characters in this file are to be printed or displayed on the screen, they might need to be converted to encodings for which fonts are available.

You can convert a data file from one codeset to another by using the iconv command. Consider the following example:

% iconv -f SJIS -t eucJP accounts_local \
>> accounts_central

This iconv command performs the following tasks:

Reads data in the accounts_local file, which is encoded in the SJIS codeset

Converts the data to the eucJP codeset

Appends the results to the accounts_central file

An application programmer, on the other hand, might use the iconv_open(), iconv(), and iconv_close() functions for the same purpose. Many commands and utilities, such as the man command and internationalized print filters, use the iconv() functions and associated converters to perform codeset conversion on the user's behalf.

See the Writing Software for the International Market manual and iconv(3) for a full description of how to use iconv and algorithmic and table converters in internationalized programs.
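
To see what a conversion does at the byte level, the following shell sketch converts a single Latin-1 character to UTF-8 and prints the resulting bytes in hexadecimal. The function name latin1_to_utf8_hex is hypothetical, and the sketch assumes that the ISO8859-1 and UTF-8 converters are installed and known to iconv under those names.

```shell
#!/bin/sh
# Read ISO8859-1 text on stdin, convert it to UTF-8, and print the
# converted bytes as a run of hexadecimal digits.
latin1_to_utf8_hex() {
    iconv -f ISO8859-1 -t UTF-8 | od -An -tx1 | tr -d ' \n'
}

# Octal 351 is byte 0xE9, which is "e" with an acute accent in Latin-1.
# In UTF-8 the same character occupies two bytes.
printf '\351' | latin1_to_utf8_hex
echo
# prints: c3a9
```

The one-byte input becomes a two-byte sequence, which is why a file must be converted rather than simply relabeled when it moves between locales based on different codesets.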

1.11 Miscellaneous Base System Commands

The following list includes information about features and restrictions that apply when using traditional UNIX commands in local language environments:

file and jfile alias
The file command recognizes files encoded in Unicode or ISO 10646 formats (16-bit UCS-2 or 32-bit UTF-32). For other kinds of text files, the command recognizes when the character encoding is valid for the codeset of the current locale. The file command also has a jfile alias. When you use this alias, the command recognizes the most commonly used encodings for Japanese (DEC Kanji, Japanese EUC, Shift JIS, and 7-bit JIS) regardless of the current locale setting. For more information, see file(1).

rlogin
When you use the rlogin command to log on to a Tru64 UNIX system from an ULTRIX system, be sure to specify the -8 flag to pass 8-bit data without stripping. Otherwise, you will have problems entering non-ASCII characters from your terminal.
If you view a large data file while logged on to the remote system, use a pager command, such as pg, and not the Hold Screen key. The -8 option sets the terminal mode of the original host to RAW, disabling flow control. So, if data is sent to the terminal at a rate faster than the terminal can handle it, some data is lost when you use the Hold Screen key.
This rlogin restriction applies when logging in from an ULTRIX system or when logging in from any UNIX system whose software does not fully support 8-bit data format.

Emacs editor
The operating system includes the multilingual Emacs software from the Free Software Foundation. Before using this editor, you must add the /usr/i18n/mule/bin directory to your process-specific search path. You can then invoke this editor by using the mule command.

vi and more
The vi and more commands discard text that follows an invalid multibyte character. If you encounter this problem, it is likely that your locale setting is not correct for the text being viewed or edited. In this case, reset your locale to one that matches the text and invoke the command again.
When used with Thai characters, the vi editor may wrap lines before the right boundary of the screen. This happens because Thai text includes nonspacing characters, which contribute to the character count but not to the display width. The editor wraps lines based on character count. For example, vi may wrap a line after entry of 80 characters, even though these characters do not occupy 80 columns on the screen.

Using local language user names and file names
It is a limitation of UNIX file systems that you cannot use a multibyte character whose second or subsequent byte is an ASCII slash (/) in names of files, users, or other objects. This limitation means that some user-defined characters in the DEC Hanzi and DEC Kanji codesets and certain characters (CNS Plane 2 characters) in the DEC Hanyu codeset cannot be used in these names.
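
The file name limitation can be checked at the byte level. The following shell sketch tests whether a name contains the byte value of the ASCII slash (0x2F). The function name contains_slash_byte is hypothetical, and the two-byte sequence used in the example is an arbitrary illustration, not a specific character from any of the codesets named above.

```shell
#!/bin/sh
# Work on raw bytes rather than on characters of the current locale.
LC_ALL=C
export LC_ALL

# Succeed when the argument contains the byte 0x2F (ASCII slash).
contains_slash_byte() {
    case "$1" in
        */*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Hypothetical two-byte character: first byte 0xA1 (octal 241),
# second byte 0x2F (octal 057).
name=$(printf '\241\057')
if contains_slash_byte "$name"; then
    echo "name contains a slash byte and cannot be used as a file name"
fi
```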