6    Symbol Table

One of the chief tasks of the compilation process is the production of a symbol table, which is a collection of data structures whose purpose is to store type, scope, and address information about program data. Compilers and assemblers create the symbol table. It is read and may be modified by linkers, profiling tools, and assorted object manipulation tools. It also contains information required for debugging.

For large applications, a single compilation can involve many program components, including source files, header files, and libraries. Data from all of these files must be described in the symbol table.

The Tru64 UNIX eCOFF symbol table, when present, comprises a large portion of the physical object file and is often considered a stand-alone entity. It is divided into numerous sections, including a header section that is used for navigation. The contents of the symbol table are shown in Figure 6-1.

Figure 6-1:  Symbol Table Sections

The symbol table has a hierarchical design. The sections storing local symbols, local strings, relative file descriptors, procedure descriptors, line numbers, auxiliary symbols, and optimization symbols are divided into subtables and organized by file. Local symbols, local strings, and optimization symbols are further broken down by procedure. Figure 6-2 depicts this hierarchy.

Figure 6-2:  Symbol Table Hierarchy

A particular symbol table may not contain all sections, for one of the following reasons:

The function of each symbol table section is summarized below:

Several tools are available to view the contents of the symbol table. See stdump(1), odump(1), and nm(1).

This chapter covers symbol table organization and usage, concentrating on the overall structure. Subsequent chapters will cover more detailed aspects of information contained in the symbol table.

The current version of the symbol table is V3.14. The dynamic symbol table built by the linker is discussed separately in Section 14.3.3.

6.1    New or Changed Symbol Table Features

Tru64 UNIX V5.1B includes the following new or changed features:

Version 3.13 of the symbol table includes the following new or changed features:

6.2    Structures, Fields and Values for Symbol Tables

Unless otherwise specified, all structures described in this section are declared in the header file sym.h, and all constants are defined in the header file symconst.h.

6.2.1    Symbolic Header (HDRR)

typedef struct {
        coff_ushort     magic;          
        coff_ushort     vstamp;         
        coff_int        ilineMax;       
        coff_int        idnMax;         
        coff_int        ipdMax;         
        coff_int        isymMax;        
        coff_int        ioptMax;        
        coff_int        iauxMax;        
        coff_int        issMax;         
        coff_int        issExtMax;      
        coff_int        ifdMax;         
        coff_int        crfd;           
        coff_int        iextMax;        
        coff_long       cbLine;         
        coff_off        cbLineOffset;   
        coff_off        cbDnOffset;     
        coff_off        cbPdOffset;     
        coff_off        cbSymOffset;    
        coff_off        cbOptOffset;    
        coff_off        cbAuxOffset;    
        coff_off        cbSsOffset;     
        coff_off        cbSsExtOffset;  
        coff_off        cbFdOffset;     
        coff_off        cbRfdOffset;    
        coff_off        cbExtOffset;    
} HDRR, *pHDRR;

SIZE - 144 bytes, ALIGNMENT - 8 bytes
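
The stated size follows from the field widths. The following is a minimal sketch that checks the arithmetic at compile time. It assumes stand-in fixed-width typedefs (16-bit coff_ushort, 32-bit coff_int, 64-bit coff_long and coff_off); the real typedefs come from the system headers and are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in typedefs with ASSUMED widths; the real ones are system-defined. */
typedef uint16_t coff_ushort;
typedef int32_t  coff_int;
typedef int64_t  coff_long;
typedef uint64_t coff_off;

/* Same field order as the HDRR declaration, grouped for brevity. */
typedef struct {
    coff_ushort magic;
    coff_ushort vstamp;
    coff_int    ilineMax, idnMax, ipdMax, isymMax, ioptMax, iauxMax;
    coff_int    issMax, issExtMax, ifdMax, crfd, iextMax;
    coff_long   cbLine;
    coff_off    cbLineOffset, cbDnOffset, cbPdOffset, cbSymOffset;
    coff_off    cbOptOffset, cbAuxOffset, cbSsOffset, cbSsExtOffset;
    coff_off    cbFdOffset, cbRfdOffset, cbExtOffset;
} HDRR;

/* 2*2 + 11*4 = 48 bytes of counts, then 1*8 + 11*8 = 96 bytes: 144 total. */
static_assert(sizeof(HDRR) == 144, "HDRR should occupy 144 bytes");
static_assert(offsetof(HDRR, cbLine) == 48, "cbLine should start at byte 48");
```

Because the 4-byte fields end on a multiple of 8, no padding is needed before cbLine under these assumed widths.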

Symbolic Header Fields

magic

To verify validity of the symbol table, this field must contain the constant magicSym, defined as 0x1992.

vstamp

Symbol table version stamp. This value consists of a major version number and a minor version number, as defined in the stamp.h header file:

Symbol          Value   Description
MAJ_SYM_STAMP   3       Current major object format version
MIN_SYM_STAMP   14      Current minor object format version

See Section 1.4.5 for a description of object and symbol table versioning.

ilineMax

Number of line number entries (if expanded).

idnMax

Obsolete.

ipdMax

Number of procedure descriptors.

isymMax

Number of local symbols.

ioptMax

Byte size of optimization symbol table.

iauxMax

Number of auxiliary symbols.

issMax

Byte size of local string table.

issExtMax

Byte size of external string table.

ifdMax

Number of file descriptors.

crfd

Number of relative file descriptors.

iextMax

Number of external symbols.

cbLine

Byte size of (packed) line number entries.

cbLineOffset

Byte offset to start of (packed) line numbers.

cbDnOffset

Obsolete.

cbPdOffset

Byte offset to start of procedure descriptors.

cbSymOffset

Byte offset to start of local symbols.

cbOptOffset

Byte offset to start of optimization entries.

cbAuxOffset

Byte offset to start of auxiliary symbols.

cbSsOffset

Byte offset to start of local strings.

cbSsExtOffset

Byte offset to start of external strings.

cbFdOffset

Byte offset to start of file descriptors.

cbRfdOffset

Byte offset to start of relative file descriptors.

cbExtOffset

Byte offset to start of external symbols.

General Notes:

The size and offset fields describing symbol table sections must be set to zero if the section described is not present.

The cb*Offset fields are byte offsets from the beginning of the object file.

The i*Max fields contain the number of entries for a symbol table section. Legal index values for a symbol table section will range from 0 to the value of the associated i*Max field minus one.

For an explanation of packed and expanded line number entries, see the discussion in Section 7.3.1.
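
A consumer would typically check the header before trusting any index. The following is a minimal sketch of such checks, not the system's own validation logic. The helper names are hypothetical, and the packing of vstamp (major version in the high byte, minor version in the low byte) is an assumption that should be confirmed against stamp.h. The policy of accepting older minor versions is likewise only illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define magicSym       0x1992   /* value given for the magic field */
#define MAJ_SYM_STAMP  3
#define MIN_SYM_STAMP  14

/* ASSUMPTION: vstamp holds the major version in its high byte and the
 * minor version in its low byte; confirm against stamp.h. */
static unsigned vstamp_major(uint16_t vstamp) { return vstamp >> 8; }
static unsigned vstamp_minor(uint16_t vstamp) { return vstamp & 0xffu; }

/* Returns 1 if the magic number matches and the version is one this
 * sketch accepts (same major version, minor version not newer). */
static int header_recognized(uint16_t magic, uint16_t vstamp)
{
    return magic == magicSym &&
           vstamp_major(vstamp) == MAJ_SYM_STAMP &&
           vstamp_minor(vstamp) <= MIN_SYM_STAMP;
}

/* A section that is not present has its size and offset fields set to zero. */
static int section_present(uint64_t cb_offset, int64_t i_max)
{
    return cb_offset != 0 && i_max > 0;
}

/* Legal index values run from 0 to the i*Max value minus one. */
static int index_in_range(int64_t i_max, int64_t index)
{
    return index >= 0 && index < i_max;
}
```

Note that index_in_range applies only to the i*Max fields that count entries; ioptMax, issMax, and issExtMax hold byte sizes.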

6.2.2    Relative File Descriptor Entry (RFDT)

The relative file descriptor table provides a post-link mapping of file descriptor indices. The purpose of this table is to minimize work for the linker, which does not update symbol table references to local symbols. This information is used to obtain the file offset used to bias local symbol indices. Because this table is also known as the File Indirect Table, two declarations are included in the sym.h header file, as shown here.

typedef int RFDT, *pRFDT;
typedef int FIT, *pFIT;

SIZE - 4 bytes, ALIGNMENT - 4 bytes

See Section 6.3.2 for related information.

6.2.3    Optimization Symbol Entry (PPODHDR)

The optimization symbol table contains information for optimized debugging, basic block profiling, and other miscellaneous procedure-specific data. Each procedure's associated optimization symbol table data begins with an array of PPODHDR structures. See Section 6.3.3 for a description of the optimization symbol table.


Version Note

The following structure definition is for Tru64 UNIX V5.0 and greater. It is used for symbol table format V3.13 and greater.


typedef struct {
        coff_uint       ppode_tag;
        coff_uint       ppode_len;
        coff_ulong      ppode_val;
} PPODHDR, *pPPODHDR;

SIZE - 16 bytes, ALIGNMENT - 8 bytes

Optimization Symbol Entry Fields

ppode_tag

Identifies the kind of data described by this entry.

ppode_len

Indicates the size in bytes of the data that is found in the raw data area for this entry. When this field is zero, the only data is stored in the ppode_val field.

ppode_val

This field is either a pointer to the entry's data or is itself the data. If ppode_len is nonzero, this field is a relative file offset from the beginning of the current PPOD (Per-Procedure Optimization Descriptor) to the applicable data area. If ppode_len is zero, this field contains the data for the entry.

A PPOD contains multiple PPODHDRs. A PPODHDR and its associated data are collectively referred to as a PPODE (Per-Procedure Optimization Descriptor Entry). Figure 6-4 in Section 6.3.3 shows several PPODs with multiple PPODHDRs and their data.
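
The field rules above can be sketched as a small lookup routine. This is an illustrative sketch only: the function names ppod_find and ppode_data are hypothetical, the fixed-width types stand in for coff_uint and coff_ulong, and the caller is assumed to supply an upper bound on the number of headers so that a malformed PPOD without a PPODE_END entry cannot cause an overrun.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t ppode_tag;   /* stand-in for coff_uint */
    uint32_t ppode_len;   /* stand-in for coff_uint */
    uint64_t ppode_val;   /* stand-in for coff_ulong */
} PPODHDR;

#define PPODE_END 2

/* Scans the leading headers of one PPOD for the first entry with the given
 * tag. Stops at PPODE_END or after max_entries headers, whichever is first. */
static const PPODHDR *ppod_find(const PPODHDR *ppod, size_t max_entries,
                                uint32_t tag)
{
    for (size_t i = 0; i < max_entries; i++) {
        if (ppod[i].ppode_tag == tag)
            return &ppod[i];
        if (ppod[i].ppode_tag == PPODE_END)
            break;
    }
    return NULL;
}

/* Returns a pointer to the entry's raw data, or NULL when ppode_len is zero
 * and the data is the ppode_val field itself. */
static const void *ppode_data(const PPODHDR *ppod, const PPODHDR *entry)
{
    if (entry->ppode_len == 0)
        return NULL;
    /* ppode_val is an offset from the beginning of the current PPOD. */
    return (const unsigned char *)ppod + entry->ppode_val;
}
```

A caller first locates the PPOD for a procedure, then uses ppod_find to select an entry and ppode_data to decide whether to read ppode_val directly or follow it into the raw data area.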

Table 6-1:  Optimization Tag Values

Name                                   Value  Description
PPODE_STAMP                            1      Version number of the PPOD stored in ppode_val. The current PPOD_VERSION value is 1.
PPODE_END                              2      End of entries for this PPOD.
PPODE_EXT_SRC                          3      Extended source line information. See Section 7.3.1.2.
PPODE_SEM_EVENT                        4      Semantic event information. See Section 12.3.1.
PPODE_SPLIT                            5      Split lifetime information. See Section 12.3.2.
PPODE_DISCONTIG_SCOPE                  6      Discontiguous scope information. See Section 12.3.3.
PPODE_INLINED_CALL                     7      Inlined procedure call information. (Reserved for future use.)
PPODE_PROFILE_INFO                     8      Profile feedback information. See Chapter 9.
PPODE_WHERE_INLINED                    9      (V5.1A - ) Procedure inlining site information. (Reserved for future use.)
PPODE_ANNOT_RESERVED_FIRST             64     (V5.1B - ) First object annotation tag. (See Chapter 10.)
PPODE_ANNOT_SUMMARY                    64     (V5.1B - ) Object annotation summary. (See Section 10.3.1.1.)
PPODE_ANNOT_RESTRICTED_FIRST           65     (V5.1B - ) First restrictive annotation tag. (See Section 10.3.)
PPODE_ANNOT_RESTRICTED_OFFSET          65     (V5.1B - ) Restricted offset annotation. (See Section 10.3.1.2.)
PPODE_ANNOT_RESTRICTED_INSTRUCTION     66     (V5.1B - ) Restricted instruction annotation. (See Section 10.3.1.3.)
PPODE_ANNOT_RESTRICTED_SEQUENCE        67     (V5.1B - ) Restricted instruction sequence annotation. (See Section 10.3.1.4.)
PPODE_ANNOT_RESTRICTED_CALL            68     (V5.1B - ) Restricted call annotation. (See Section 10.3.1.5.)
PPODE_ANNOT_RESTRICTED_ENTRY           69     (V5.1B - ) Restricted entry annotation. (See Section 10.3.1.6.)
PPODE_ANNOT_RESTRICTED_RETURN          70     (V5.1B - ) Restricted return annotation. (See Section 10.3.1.7.)
PPODE_ANNOT_RESTRICTED_LAST            95     (V5.1B - ) Last restrictive annotation tag. (See Section 10.3.)
PPODE_ANNOT_OPTIMIZATION_FIRST         96     (V5.1B - ) First optimization enabling annotation. (See Section 10.3.)
PPODE_ANNOT_GPREL32_JUMP_TABLE         96     (V5.1B - ) Jump table annotation. (See Section 10.3.1.8.)
PPODE_ANNOT_CALL_SPECIFIED_LINKAGE     97     (V5.1B - ) Call specified linkage annotation. (See Section 10.3.1.9.)
PPODE_ANNOT_ENTRY_SPECIFIED_LINKAGE    98     (V5.1B - ) Entry specified linkage annotation. (See Section 10.3.1.10.)
PPODE_ANNOT_ENTRY_UTILIZED_LINKAGE     99     (V5.1B - ) Entry utilized linkage annotation. (See Section 10.3.1.11.)
PPODE_ANNOT_ENTRY_IMPLEMENTED_LINKAGE  100    (V5.1B - ) Entry implemented linkage annotation. (See Section 10.3.1.12.)
PPODE_ANNOT_RETURN_SPECIFIED_LINKAGE   101    (V5.1B - ) Return specified linkage annotation. (See Section 10.3.1.13.)
PPODE_ANNOT_OPTIMIZATION_LAST          127    (V5.1B - ) Last optimization enabling annotation. (See Section 10.3.)
PPODE_ANNOT_RESERVED_LAST              127    (V5.1B - ) Last object annotation tag. (See Chapter 10.)
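
The FIRST and LAST values in Table 6-1 delimit tag ranges rather than naming individual tags. The following sketch shows one way a consumer might classify a tag by range. The tag_class enumeration and the classify_tag function are hypothetical names introduced for illustration; the constants repeat the values in the table.

```c
#include <assert.h>
#include <stdint.h>

#define PPODE_ANNOT_RESERVED_FIRST      64
#define PPODE_ANNOT_SUMMARY             64
#define PPODE_ANNOT_RESTRICTED_FIRST    65
#define PPODE_ANNOT_RESTRICTED_LAST     95
#define PPODE_ANNOT_OPTIMIZATION_FIRST  96
#define PPODE_ANNOT_OPTIMIZATION_LAST   127
#define PPODE_ANNOT_RESERVED_LAST       127

/* Hypothetical classification used only by this sketch. */
typedef enum {
    TAG_OTHER,               /* not an object annotation tag */
    TAG_ANNOT_SUMMARY,       /* object annotation summary */
    TAG_ANNOT_RESTRICTED,    /* restrictive annotation range */
    TAG_ANNOT_OPTIMIZATION   /* optimization enabling annotation range */
} tag_class;

static tag_class classify_tag(uint32_t tag)
{
    if (tag < PPODE_ANNOT_RESERVED_FIRST || tag > PPODE_ANNOT_RESERVED_LAST)
        return TAG_OTHER;
    if (tag == PPODE_ANNOT_SUMMARY)
        return TAG_ANNOT_SUMMARY;
    if (tag <= PPODE_ANNOT_RESTRICTED_LAST)
        return TAG_ANNOT_RESTRICTED;
    return TAG_ANNOT_OPTIMIZATION;
}
```

Classifying by range lets a tool handle an unassigned value inside a range (for example, 80) according to the range it falls in, without knowing the specific tag.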

6.3    Symbol Table Usage

6.3.1    Levels of Symbolic Information

Different levels of symbolic information can be stored with an object file. Compilers often provide options that allow the user to choose the desired level of symbolic information for their program. This choice may be influenced by size considerations and debugging needs. A trade-off exists between the benefit of saving space in the object file and the amount of information available to tools that consume symbolic information.

It is also possible to change the amount of symbolic information present in a program that has already been compiled and linked. Information can be added or deleted. Two of the most common and useful operations are locally stripping and fully stripping the symbol tables in executable files. Tools that modify linked executables, such as instrumentation tools and code optimizers, may rewrite parts of the symbol table to reflect changes that they made.

6.3.1.1    Compilation Levels

The representation of symbolic information supported by compilers can be broken down into four levels:

  1. Minimal: Only information required for linking

  2. Limited: Source file and line number information for profiling and limited debugging (stack-tracing)

  3. Full: Complete debugging information for non-optimized code

  4. Optimized: Debugging information for optimized code

These levels correspond to the system compiler switches -g0 (minimal), -g1 (limited), -g2 (full), and -g3 (optimized). Table 6-2 shows the symbol table sections that are produced by system compilers at each compilation level.

Table 6-2:  Symbol Table Sections Produced at Various Compilation Levels

                           Compilation Level
Symbol Table Section       Minimal  Limited  Full  Optimized
Symbolic header            Yes      Yes      Yes   Yes
File Descriptors           Yes      Yes      Yes   Yes
External Symbols           Yes      Yes      Yes   Yes
External Strings           Yes      Yes      Yes   Yes
Procedure Descriptors      Yes      Yes      Yes   Yes
Line Numbers               No       Yes      Yes   Yes
Relative File Descriptors  No       No       Yes   Yes
Optimization Symbols       No       Partial  Yes   Yes
Local Symbols              No       Partial  Yes   Yes
Local Strings              No       Partial  Yes   Yes
Auxiliary Symbols          No       Partial  Yes   Yes

The minimal level of symbolic information that may be produced during compilation includes only the symbol information required for the linker to function properly. This includes external symbol information that is needed to perform symbol resolution and relocation.

If the limited level of symbolic information is requested, line number entries are generated, as well as external symbol information and procedure descriptors. In addition, local symbols for procedures (and the corresponding auxiliary symbols, optimization symbols, and local strings) are present. Limited symbolic information is sufficient to meet the needs of profiling tools. The information present at this level is a subset of that required for full debugger support.

If full symbolic information is included, all symbol table sections are produced in full. This level enables full debugging support with complete type descriptions for local and external symbols. Optimization is disabled.

Optimized symbolic information is designed to balance the aims of performance and debugging capabilities. This level supplies the same information as the full debugging option, but it also allows all compiler optimizations. As a result, some of the correlation is lost between the source code and the executable program.

On Tru64 UNIX systems, users can choose to compile their programs with any one of the four levels of symbolic information. The options -g0, -g1, and -g2 specify increasing levels of symbolic information. The system compiler's default is to produce the minimal level (-g0). Debugging of optimized code (-g3) is supported by the ladebug debugger.

6.3.1.2    Locally Stripped Images

Objects can be produced with only global symbolic information stored in the symbol table. Selection of the -x option causes the linker to create a locally-stripped object. Reasons for stripping local symbolic information include reducing file size and limiting the amount of symbolic information available to end users of an application.

A locally-stripped object is very similar to an object produced with minimal symbolic information (see Section 6.3.1.1). The difference is the consolidation of file descriptors, which the linker does only for locally-stripped objects.

In a locally-stripped image, the file descriptors are included solely for the purpose of identifying source file languages. One file descriptor is present for each source language involved in the compilation. These file descriptors have their adr field set to addressNil, indicating that they cannot be used to identify text addresses.


Version Note

The preceding use of addressNil is supported in symbol table format V3.13 and greater. In symbol table formats less than V3.13, the file descriptor adr value should be ignored.


The procedure descriptor table is present in full but is rearranged to group procedures by source language. All procedure descriptors for procedures written in a particular source language are thus contiguous, and they reflect the file descriptor's information.

External symbols are also present in a locally-stripped image. The file indices (ifd field) of the external symbols are updated to identify the generic file descriptor for the appropriate source language. The index fields are set to zero to indicate that no type information is available. External symbols with the storage class scNil are removed. These are debugging symbols that are not normally produced for minimal symbol tables.

Limited debugging is possible with locally-stripped objects. Because the procedure descriptors are retained, stack traces are possible. External symbol information can also be viewed, and language-dependent handling of symbols (for example, C++ name demangling) is preserved.

A linked executable file can be locally stripped at any time after its creation using the command ostrip -x. The output is the same as described above. This operation may also alter the raw data of the .comment section. See Chapter 15 for details.

6.3.1.3    (Fully) Stripped Images

Executable files may be fully stripped at any time after creation using either the strip command or the command ostrip -s. Stripping an executable will result in complete removal of the symbol table, including the symbolic header. The file header fields f_symptr and f_nsyms are set to zero to indicate that the file has been stripped.

This operation may also alter the raw data of the .comment section. See Chapter 15 for details.
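
A consumer can detect a fully stripped file from the two file header fields named above. The following is a minimal sketch; the filehdr_view structure is a hypothetical subset of the file header that holds only those two fields, and the field widths are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of the file header; widths are ASSUMED. */
typedef struct {
    uint64_t f_symptr;   /* file offset of the symbolic header */
    uint32_t f_nsyms;    /* symbol table size field */
} filehdr_view;

/* Both fields are set to zero when the file has been fully stripped. */
static int is_fully_stripped(const filehdr_view *fh)
{
    return fh->f_symptr == 0 && fh->f_nsyms == 0;
}
```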

6.3.2    Source File Merging

Much of the complication of source information stems from the "include" system. When a compilation involves several source files, there may be duplication of the header files included in each source file, or of the source files themselves. To avoid repetition of header file information in the linked object, the linker merges the input objects' included files wherever possible. Compilers mark file descriptors as mergeable or unmergeable. The linker then examines the input file descriptors and performs the merge whenever possible.

The linker considers two file descriptors to be mergeable if all of the following criteria are met:

  1. The file descriptor fMerge bit is set in both (marked as mergeable by compiler).

  2. Files have the same name.

  3. Files are written in the same language.

  4. Files contain the same number of local and auxiliary symbols.

  5. Checksums match.

    The checksums match if either:

    1. Neither file's first auxiliary record is a btChecksum.

    2. Both files' first auxiliary record is a btChecksum and they are identical.
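
The criteria above can be expressed as a single predicate. The following sketch is not the linker's implementation. The file_info structure is a hypothetical summary of the file descriptor and auxiliary data relevant to merging, and the checksum is represented as a simple integer plus a flag recording whether the first auxiliary record is a btChecksum.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-file summary used only by this sketch. */
typedef struct {
    int         fMerge;        /* nonzero if marked mergeable by the compiler */
    const char *name;          /* file name */
    int         lang;          /* source language code */
    long        csym;          /* number of local symbols */
    long        caux;          /* number of auxiliary symbols */
    int         has_checksum;  /* first auxiliary record is a btChecksum */
    uint32_t    checksum;      /* checksum value, valid if has_checksum */
} file_info;

/* Criterion 5: neither file has a checksum, or both have identical ones. */
static int checksums_match(const file_info *a, const file_info *b)
{
    if (!a->has_checksum && !b->has_checksum)
        return 1;
    if (a->has_checksum && b->has_checksum)
        return a->checksum == b->checksum;
    return 0;
}

/* Returns 1 only if all five criteria are met. */
static int files_mergeable(const file_info *a, const file_info *b)
{
    return a->fMerge && b->fMerge &&
           strcmp(a->name, b->name) == 0 &&
           a->lang == b->lang &&
           a->csym == b->csym && a->caux == b->caux &&
           checksums_match(a, b);
}
```

Under this reading, a file whose first auxiliary record is a btChecksum never merges with one whose first record is not.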

C++ header files may be divided into separate entries for the mergeable and unmergeable parts of the headers. When this occurs, the unmergeable portion is entered as a file with a mangled name as described in Section 13.3.3.

The role of the relative file descriptor (RFD) tables is to track file-relative information after merging. A relative file descriptor table entry maps the index of each file at compile time to its index after linking. After linking, local or auxiliary symbols must be accessed through the RFD table to obtain the updated file descriptor index. This mechanism is necessary because the indices in the local symbol table are not updated when files are merged.

Figure 6-3 is an example of the use of the relative file descriptor table.

Figure 6-3:  Relative File Descriptor Table Example

For a symbol reference composed of a file index and a symbol index (offset within the file), the relative file descriptor table is used as follows:

  1. Look up the given file index in the RFD table to get the updated file index.

  2. Look up the updated file index in the (merged) file descriptor table to get the base of the symbols for that file.

  3. Add the symbol index to the file's base to access the symbol entry.
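
The three steps can be sketched as follows. This is an illustrative sketch under stated assumptions: file_desc is a hypothetical, simplified stand-in for the file descriptor that keeps only a symbol base and a symbol count, resolve_local_symbol is a hypothetical name, and the RFD table passed in is assumed to be the one that applies to the file making the reference.

```c
#include <assert.h>

typedef int RFDT;   /* as declared in sym.h */

/* Hypothetical, simplified file descriptor used only by this sketch. */
typedef struct {
    long isym_base;  /* index of this file's first local symbol */
    long csym;       /* number of local symbols in this file */
} file_desc;

/* Returns the index of the symbol in the local symbol table,
 * or -1 if any index is out of range. */
static long resolve_local_symbol(const RFDT *rfd, int crfd,
                                 const file_desc *fd, int ifd_max,
                                 int file_index, long sym_index)
{
    if (file_index < 0 || file_index >= crfd)
        return -1;
    int merged = rfd[file_index];              /* step 1: updated file index */
    if (merged < 0 || merged >= ifd_max)
        return -1;
    const file_desc *f = &fd[merged];          /* step 2: file's symbol base */
    if (sym_index < 0 || sym_index >= f->csym)
        return -1;
    return f->isym_base + sym_index;           /* step 3: bias the index */
}
```

For example, if the RFD table maps compile-time file 0 to merged file 2, and merged file 2 has a symbol base of 8, then symbol index 1 in that file resolves to local symbol 9.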

See Section 11.3.2.3 for the representation of relative indices in the auxiliary symbol table.

6.3.3    Optimization Symbols


Version Note

Optimization symbols are supported for symbol table format V3.13 and greater.


The optimization symbols section gives individual producers and consumers the ability to communicate information about any aspect of the object file, in any form they choose. New information can be generated at any time with minimal coordination between all producers and consumers.

The optimization section is organized on a per-procedure basis. Each procedure descriptor has a pointer to the optimization symbols in the field PDR.iopt. If no optimization symbols are associated with the procedure, the field contains ioptNil. Otherwise, it contains the index of the first optimization symbol entry for this procedure. Consumers should access the optimization symbols through the procedure descriptors. The optimization section is not present in a locally-stripped object.

This section consists of a sequence of zero or more Per-Procedure Optimization Descriptions (PPODs), as shown in Figure 6-4. Each PPOD's internal structure consists of two parts:

  1. A leading sequence of structured entries using a Tag-Length-Value model to describe subsequent raw data. The structure of the PPOD entry can be found in Section 6.2.3.

  2. The raw data area.

Figure 6-4:  Optimization Symbols Section

This section has the following alignment requirements:

Object file producers must produce either an empty optimization symbols section or a valid one. An empty one has the symbolic header fields cbOptOffset and ioptMax set to zero. If an optimization section is present, but a particular file does not contribute to it, the file descriptor field copt is set to zero. In this case, all procedure descriptors belonging to the file must have their iopt fields set to ioptNil.
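
These producer rules lend themselves to a simple consistency check. The following sketch uses hypothetical function names and takes the value of ioptNil as a parameter, because its definition in symconst.h is not reproduced in this chapter. Procedure descriptor iopt values are passed as a plain array for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* An empty optimization symbols section has both header fields set to zero. */
static int opt_section_empty(uint64_t cbOptOffset, int32_t ioptMax)
{
    return cbOptOffset == 0 && ioptMax == 0;
}

/* For a file that contributes nothing to the section (copt == 0), every
 * procedure descriptor belonging to it must have iopt equal to ioptNil.
 * iopt_nil is supplied by the caller; its real value is in symconst.h.
 * Files with copt != 0 are not examined by this sketch. */
static int file_opt_consistent(long copt, const long *pd_iopt, size_t npd,
                               long iopt_nil)
{
    if (copt != 0)
        return 1;
    for (size_t i = 0; i < npd; i++) {
        if (pd_iopt[i] != iopt_nil)
            return 0;
    }
    return 1;
}
```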

Tools that both read and write object files must consume a valid optimization symbols section (if present in the input file) and produce an equivalent and valid section in their output files. If a tool does not know how to process the section contents, the section must be omitted from the output file. If a tool knows how to process only portions of the optimization symbols, those portions may be modified and the rest should be removed. The linker concatenates input optimization symbols sections into one output section without reading or modifying any of the entries.

The format and flexible nature of this section are similar by design to the .comment section. The structures are the same size and contain the same fields (with different names), and the rules of navigation are the same. The primary difference is that the optimization section contains procedure-specific information, whereas the .comment section contains object-specific information.