16    Archives

An archive is a collection of files stored and treated as a single entity. Archives are most commonly used to implement libraries of relocatable objects. These libraries simplify linking in a program development environment because a single archive file can be manipulated in place of dozens or hundreds of object files.

This chapter covers the archive file format and usage. The archiver is the tool used to create and manage archives. See ar(1) for more information on its facilities.

16.1    New and Changed Archive Features

Tru64 UNIX V5.0 introduces archive support for extended user and group ids (see ar_uid and ar_gid in Section 16.2.2).

16.2    Structures, Fields, and Values for Archives

All declarations in this section are from the header file ar.h.

See Section 16.3.1 for more information on the organization of archive file contents.

16.2.1    Archive Magic String

The archive magic string identifies a file as an archive.

#define ARMAG "!<arch>\n"
#define SARMAG 8

16.2.2    Archive Header

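struct ar_hdr {
        char    ar_name[16];    /* file member name */
        char    ar_date[12];    /* file member date (decimal) */
        char    ar_uid[6];      /* user id (decimal, or //value) */
        char    ar_gid[6];      /* group id (decimal, or //value) */
        char    ar_mode[8];     /* file member mode (octal) */
        char    ar_size[10];    /* file member size (decimal) */
        char    ar_fmag[2];     /* archive magic string */
} AR_HDR;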

SIZE - 60 bytes, ALIGNMENT - 1 byte

Archive Header Fields

ar_name

File member name, blank-terminated if the length of the name is less than 16 bytes.

File member names that are 16 characters or longer are stored in the special file member called the file member name table. In that case, this field contains /offset where offset indicates the byte offset of the file name within the table. The offset is a decimal number.

The prefix ARSYMPREF, defined as the 16-byte blank-terminated character string ________64ELEL_, is stored in this field for the special file member called the symbol definitions (symdef) file and is used to identify that file. The ar tool marks an out-of-date symdef file by changing the last L in the name to an X (________64ELEX_).

The blank-terminated name // is stored in this field to identify the file member name table.

ar_date

File member date (decimal).

ar_uid

File member user id (decimal).

For a file with a user id greater than USHRT_MAX (65535U), this field will contain //value where value is a 4-byte unsigned integer.


Version Note

Large user ids are supported in Tru64 UNIX V5.0 and greater.


ar_gid

File member group id (decimal).

For a file with a group id greater than USHRT_MAX (65535U), this field will contain //value where value is a 4-byte unsigned integer.


Version Note

Large group ids are supported in Tru64 UNIX V5.0 and greater.
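
The two encodings of ar_uid and ar_gid described above can be told apart by the leading // characters. The following is a minimal sketch only. The function name ar_id_decode is hypothetical, and the text does not state the byte order of the 4-byte value, so the sketch assumes it matches the byte order of the host that reads the archive.

```c
#include <stdint.h>
#include <string.h>

/*
 * Decode a 6-byte ar_uid or ar_gid field.
 * Assumption: in the //value form, the 4-byte unsigned integer is stored
 * in the byte order of the host reading the archive.
 * Returns 0 and stores the id on success, -1 if the field has no digits.
 */
static int ar_id_decode(const char field[6], uint32_t *id)
{
    uint32_t value = 0;
    int i, ndigits = 0;

    if (field[0] == '/' && field[1] == '/') {
        memcpy(&value, field + 2, sizeof value);   /* raw 4-byte integer */
        *id = value;
        return 0;
    }
    /* Ordinary form: decimal digits followed by blank padding. */
    for (i = 0; i < 6 && field[i] >= '0' && field[i] <= '9'; i++) {
        value = value * 10 + (uint32_t)(field[i] - '0');
        ndigits++;
    }
    if (ndigits == 0)
        return -1;
    *id = value;
    return 0;
}
```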


ar_mode

File member mode (octal).

ar_size

File member size (decimal). Sizes reflect padding for the symdef file and the file name table, but not for file member contents. File members always start on even byte boundaries. Therefore, if the ar_size field indicates an odd length, it should be rounded up to the next even number.
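
As an illustration of the rounding rule, the following sketch computes where the next member header starts. The names AR_HDR_SIZE and next_member_offset are hypothetical. The sketch assumes the ar_size field has already been converted to a number.

```c
/* Size of struct ar_hdr in bytes, as stated for the archive header. */
#define AR_HDR_SIZE 60L

/*
 * Given the file offset of a member header and the numeric value of its
 * ar_size field, return the file offset of the next member header.
 */
static long next_member_offset(long hdr_offset, long ar_size)
{
    long padded = ar_size + (ar_size & 1);   /* round odd sizes up to even */

    return hdr_offset + AR_HDR_SIZE + padded;
}
```

For example, a member whose header is at offset 8 and whose ar_size is 55 occupies 60 + 56 bytes, so the next header would start at offset 124.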

ar_fmag

Archive magic string. The possible values are shown in Table 16-1.

Table 16-1:  Archive Magic Strings

Symbol     Value     Meaning
ARFMAG     "`\n"     File member. May be a special file member or any type of file other than a compressed object file.
ARFZMAG    "Z\n"     Compressed object file member.

General Note:

Archive header fields are stored as character strings and must be converted to numeric types.
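
One possible conversion routine is sketched below. The function name ar_field_to_long is hypothetical. Because the fields are fixed-width and not NUL-terminated, the sketch takes the field width explicitly instead of relying on standard string functions.

```c
#include <stddef.h>

/*
 * Convert a fixed-width, blank-padded header field to a number.
 * base is 10 for ar_date, ar_uid, ar_gid, and ar_size, and 8 for ar_mode.
 * Returns -1 if the field contains no digits (this includes ar_uid and
 * ar_gid fields in the //value form, which need separate handling).
 */
static long ar_field_to_long(const char *field, size_t width, int base)
{
    long value = 0;
    size_t i = 0;
    int ndigits = 0;

    while (i < width && field[i] == ' ')
        i++;                              /* skip leading blanks, if any */
    for (; i < width; i++) {
        int d = field[i] - '0';

        if (d < 0 || d >= base)
            break;                        /* stop at padding or a non-digit */
        value = value * base + d;
        ndigits++;
    }
    return ndigits > 0 ? value : -1;
}
```

A caller would pass sizeof on the field, for example ar_field_to_long(hdr.ar_size, sizeof hdr.ar_size, 10), where hdr is a struct ar_hdr read from the archive.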

16.2.3    Hash Table (ranlib) Structure

This structure is found only inside the special file member called the "symdef file". See Section 16.3.2 for related information.

struct  ranlib {
        union {
            int      ran_strx; 
        } ran_un;
        int          ran_off; 
};

SIZE - 8 bytes, ALIGNMENT - 4 bytes

Ranlib Structure Fields

ran_strx

Symdef string table index for this symbol's name.

ran_off

Byte offset from the beginning of the archive file to the archive header of the member that defines this symbol.

General Note:

The ran_un union of this structure has only one field, as shown, for historical reasons.

16.3    Archive Implementation

16.3.1    Archive File Format

The first SARMAG (8) bytes in an archive file identify it as an archive. To verify that a file is an archive, these bytes should be compared with the archive magic string, defined as ARMAG in the header file ar.h.
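
A minimal sketch of this check follows. The function name is_archive is hypothetical, and the two definitions from ar.h are repeated only so that the fragment compiles on its own.

```c
#include <stddef.h>
#include <string.h>

#define ARMAG  "!<arch>\n"   /* archive magic string, as in ar.h */
#define SARMAG 8             /* length of ARMAG */

/* Return 1 if the buffer begins with the archive magic string, else 0. */
static int is_archive(const unsigned char *buf, size_t len)
{
    return len >= SARMAG && memcmp(buf, ARMAG, SARMAG) == 0;
}
```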

An archive file consists of the magic string followed by multiple file members, each of which is preceded by an archive file member header. File members can be object files, compressed object files, text files, or files of any other type, and an archive can contain a mix of file types. A file member can also be one of two special file members: the symbol definition file (or symdef file) or the file member name table. Figure 16-1 illustrates this file layout.

Figure 16-1:  Archive File Organization

The symdef file, if present, is the first file member of an archive. See Section 16.3.2 for details on the symdef file.

The file member name table consists of file member names that are too long to fit into the 16-byte name field of the archive header. If no file member names are 16 characters or longer, this table is not created. If the table is needed, it is either the first file member or the second (following the symdef file).

The member header for the file name table might look like this:

struct ar_hdr {
        ar_name = "//              ";
        ar_date = "871488454   ";
        ar_uid  = "0     ";
        ar_gid  = "0     ";
        ar_mode = "0       ";
        ar_size = "54        ";
        ar_fmag = "`\n";
}

Each name in the file member name table is followed by a slash (/) and a linefeed (\n). For example, the contents of the file name table for an archive with three long object file names might look like this:

st_cmrlc_basic.o/
st_cmrlc_print.o/
st_object_type.o/

The file member header for a file member whose name is stored in the file name table (in this case, the object st_cmrlc_print.o) might look like this:

struct ar_hdr {
        ar_name = "/18             ";
        ar_date = "871414955   ";
        ar_uid  = "9442  ";
        ar_gid  = "0     ";
        ar_mode = "100600  ";
        ar_size = "47296     ";
        ar_fmag = "`\n";
}
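
Resolving a /offset reference against the file member name table could be sketched as follows. The function name long_member_name and its parameters are hypothetical. The sketch assumes the contents of the name table have already been read into memory.

```c
#include <stddef.h>
#include <string.h>

/*
 * Look up a long member name. ar_name is the 16-byte field from the member
 * header (for example "/18             "); table and table_len describe the
 * contents of the file member name table. On success the name is copied to
 * out as a NUL-terminated string and 0 is returned; otherwise -1.
 */
static int long_member_name(const char ar_name[16],
                            const char *table, size_t table_len,
                            char *out, size_t out_len)
{
    size_t off = 0, i, n;

    if (ar_name[0] != '/' || ar_name[1] < '0' || ar_name[1] > '9')
        return -1;                        /* not a /offset reference */
    for (i = 1; i < 16 && ar_name[i] >= '0' && ar_name[i] <= '9'; i++)
        off = off * 10 + (size_t)(ar_name[i] - '0');
    if (off >= table_len)
        return -1;

    /* The name runs from the offset up to the slash that follows it. */
    for (n = 0; off + n < table_len && table[off + n] != '/'; n++)
        ;
    if (off + n == table_len || n + 1 > out_len)
        return -1;                        /* no slash found, or out too small */
    memcpy(out, table + off, n);
    out[n] = '\0';
    return 0;
}
```

With the three-name table shown earlier, each entry occupies 18 bytes (16 characters plus the slash and linefeed), so offset 18 selects st_cmrlc_print.o.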

16.3.2    Symdef File Implementation

The symdef file contains external symbol information for all object file members within an archive. When present, the symdef file is the first file member of the archive. The member header for an up-to-date symdef file might look as follows:

struct ar_hdr {
        ar_name = "________64ELEL_ ";
        ar_date = "871488454   ";
        ar_uid  = "0     ";
        ar_gid  = "0     ";
        ar_mode = "0       ";
        ar_size = "8238      ";
        ar_fmag = "`\n";
}

The symdef file is present if at least one archive file member is an object file. The linker uses it when searching for symbol definitions, as long as the file is up to date. Whenever an archive is modified, the symdef file must be updated or its member name must be changed to reflect the fact that it is outdated (see Section 16.2.2).
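
A tool that reads archives might classify the first member as follows. This is a sketch only: the enum and the function name classify_symdef are hypothetical, and the two names are spelled as given in Section 16.2.2, padded with one blank to 16 bytes.

```c
#include <string.h>

enum symdef_state { NOT_SYMDEF, SYMDEF_CURRENT, SYMDEF_OUTDATED };

/* Classify a 16-byte ar_name field using the symdef member names. */
static enum symdef_state classify_symdef(const char ar_name[16])
{
    if (memcmp(ar_name, "________64ELEL_ ", 16) == 0)
        return SYMDEF_CURRENT;
    if (memcmp(ar_name, "________64ELEX_ ", 16) == 0)
        return SYMDEF_OUTDATED;           /* last L changed to X by ar */
    return NOT_SYMDEF;
}
```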

The symdef file consists of a hash table and a string table. The contents of the symdef file are as follows:

  1. hash table size: 4 bytes indicating the number of ranlib structures in the hash table

  2. hash table: array of ranlib structures

  3. string table size: 4 bytes indicating the size, in bytes, of the symdef string table

  4. string table: string space containing symbol names

At a minimum, the symdef file should contain the sizes of the hash and string tables, even if the tables are empty.
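
The four-part layout above can be located in memory with a few bounds checks, as in the following sketch. The names symdef_view and symdef_parse are hypothetical. The sketch assumes both 4-byte sizes are unsigned and stored in the byte order of the host reading the archive, which the text does not state.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct symdef_view {
    uint32_t nranlib;                 /* number of ranlib structures */
    const unsigned char *ranlibs;     /* start of the hash table */
    uint32_t strsize;                 /* size of the string table in bytes */
    const char *strings;              /* start of the string table */
};

/* Locate the parts of a symdef file held in buf. Returns 0 or -1. */
static int symdef_parse(const unsigned char *buf, size_t len,
                        struct symdef_view *v)
{
    size_t pos, tbl_bytes;

    if (len < 4)
        return -1;
    memcpy(&v->nranlib, buf, 4);              /* 1. hash table size */
    pos = 4;
    if (v->nranlib > (len - pos) / 8)         /* each ranlib is 8 bytes */
        return -1;
    tbl_bytes = (size_t)v->nranlib * 8;
    if (len - pos - tbl_bytes < 4)
        return -1;
    v->ranlibs = buf + pos;                   /* 2. hash table */
    pos += tbl_bytes;
    memcpy(&v->strsize, buf + pos, 4);        /* 3. string table size */
    pos += 4;
    if (v->strsize > len - pos)
        return -1;
    v->strings = (const char *)(buf + pos);   /* 4. string table */
    return 0;
}
```

Under these assumptions, the minimal symdef file described above is 8 bytes long: two sizes of zero and no table contents.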

The hash table contains a ranlib structure for each externally visible symbol defined in any of the archive file members. The total size of the hash table is two times the number of symbols rounded to the next highest power of two. Each symbol has a private hash chain that is used for symbol lookup, as shown in Figure 16-2.

Figure 16-2:  Symdef File Hash Table

The hash function produces two values for any name it is given: a hash value and a rehash value. The hash value is used for the first lookup. If the symbol found is not the right one, the rehash value is used for chaining. The chain is followed until the correct symbol is found or until the search returns to the symbol where it began.

The linker uses the hash structure field ran_off to locate a symbol's definition in the archive. This field contains the byte offset from the beginning of the archive file to the file member header of the member containing the symbol's definition.
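
The lookup described above might be sketched as the following probe loop. Several details are assumptions, not taken from this chapter: the hash function itself is not specified here, so the hash and rehash values are passed in by the caller; the rehash value is assumed to be a step added to the current index modulo the table size; and a slot with ran_off equal to 0 is assumed to be unused, since offset 0 holds the archive magic string and not a member header. The function name symdef_lookup is hypothetical.

```c
#include <string.h>

/* Repeated from Section 16.2.3 so the sketch compiles on its own. */
struct ranlib {
    union {
        int ran_strx;
    } ran_un;
    int ran_off;
};

/*
 * Search the hash table for name. Returns the ran_off value of the
 * matching entry, or -1 if the chain returns to its start without a match.
 */
static long symdef_lookup(const struct ranlib *tab, unsigned size,
                          const char *strtab, const char *name,
                          unsigned hash, unsigned rehash)
{
    unsigned start, i;

    if (size == 0)
        return -1;
    start = i = hash % size;                  /* first lookup */
    do {
        /* Assumption: ran_off == 0 marks an unused slot; skip it. */
        if (tab[i].ran_off != 0 &&
            strcmp(strtab + tab[i].ran_un.ran_strx, name) == 0)
            return tab[i].ran_off;            /* offset of member header */
        i = (i + rehash) % size;              /* assumed chaining step */
    } while (i != start);
    return -1;
}
```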

Note that symbols appear only once in the symdef file hash table, regardless of how many file members define them.

16.4    Archive Usage

16.4.1    Role As Libraries

One important use of archives is to serve as static libraries that programs can link against. Such archives contain a collection of relocatable object files that can be selectively included in an executable image as required. Archive libraries are the only libraries used in creating static executables. They can also be used in conjunction with shared libraries in dynamic executables.

The linker searches archive libraries during symbol resolution. See the Programmer's Guide or ld(1) for more information.

16.4.2    Portability

The archive file format is designed to meet current UNIX standards in order to ensure portability with other UNIX systems.

The format of compressed object files within archives is specific to Tru64 UNIX. See Section 1.4.3 for details.