One of the chief tasks of the compilation process is the production of a symbol table, which is a collection of data structures whose purpose is to store type, scope, and address information about program data. Compilers and assemblers create the symbol table. It is read and may be modified by linkers, profiling tools, and assorted object manipulation tools. It also contains information required for debugging.
For large applications, a single compilation can involve many program components, including source files, header files, and libraries. Data from all of these files must be described in the symbol table.
The Tru64 UNIX eCOFF symbol table, when present, comprises a large portion of the physical object file and is often considered a stand-alone entity. It is divided into numerous sections, including a header section that is used for navigation. The contents of the symbol table are shown in Figure 6-1.
Figure 6-1: Symbol Table Sections
The symbol table has a hierarchical design. The sections storing local symbols, local strings, relative file descriptors, procedure descriptors, line numbers, auxiliary symbols, and optimization symbols are divided into subtables and organized by file. Local symbols, local strings, and optimization symbols are further broken down by procedure. Figure 6-2 depicts this hierarchy.
Figure 6-2: Symbol Table Hierarchy
A particular symbol table may not contain all sections, for one of the following reasons:
Relative file descriptors are present in linked objects only.
The line number, auxiliary symbol and optimization symbol tables are produced only when debugging information is requested.
Symbol table information may be partially or entirely removed by post-link object tools.
Optimization symbols are not present in symbol table formats less than V3.13.
The function of each symbol table section is summarized below:
The symbolic header stores the sizes and locations of all other symbol table sections.
The line number table enables debuggers to map machine instructions to source code lines.
The procedure descriptor table contains call-frame information as well as pointers to a procedure's local symbols, line numbers and optimization entries.
The local symbol table describes procedures, static and local data, and user-defined types.
The external symbol table stores information about global symbols.
The relative file descriptor table contains a post-link file descriptor table index mapping for each file in the compilation.
The local and external string tables store local and external symbol names, respectively.
The file descriptor table stores the sizes and locations of each subtable produced for contributing source and include files. It also contains miscellaneous information about each file, such as the source language and the level of symbolic information.
The auxiliary symbol table contains data type information for local and external symbols.
The optimization symbols section stores procedure relative information, including extended source location information and optimized debugging information.
Several tools are available to view the contents of the symbol table. See stdump(1), odump(1), and nm(1).
This chapter covers symbol table organization and usage, concentrating on the overall structure. Subsequent chapters will cover more detailed aspects of information contained in the symbol table.
The current version of the symbol table is V3.14. The dynamic symbol table built by the linker is discussed separately in Section 14.3.3.
6.1 New or Changed Symbol Table Features
Tru64 UNIX V5.1B includes the following new or changed features:
New PPODE tags for object annotation (see Table 6-1)
Version 3.13 of the symbol table includes the following new or changed features:
New optimization symbols section (see Section 6.3.3)
Address of locally stripped FDRs set to addressNil (see Section 6.3.1.2)
6.2 Structures, Fields and Values for Symbol Tables
Unless otherwise specified, all structures described in this section are declared in the header file sym.h, and all constants are defined in the header file symconst.h.
6.2.1 Symbolic Header (HDRR)
typedef struct {
    coff_ushort magic;
    coff_ushort vstamp;
    coff_int    ilineMax;
    coff_int    idnMax;
    coff_int    ipdMax;
    coff_int    isymMax;
    coff_int    ioptMax;
    coff_int    iauxMax;
    coff_int    issMax;
    coff_int    issExtMax;
    coff_int    ifdMax;
    coff_int    crfd;
    coff_int    iextMax;
    coff_long   cbLine;
    coff_off    cbLineOffset;
    coff_off    cbDnOffset;
    coff_off    cbPdOffset;
    coff_off    cbSymOffset;
    coff_off    cbOptOffset;
    coff_off    cbAuxOffset;
    coff_off    cbSsOffset;
    coff_off    cbSsExtOffset;
    coff_off    cbFdOffset;
    coff_off    cbRfdOffset;
    coff_off    cbExtOffset;
} HDRR, *pHDRR;
SIZE - 144 bytes, ALIGNMENT - 8 bytes
Symbolic Header Fields
magic
To verify validity of the symbol table, this field must contain the constant magicSym, defined as 0x1992.
vstamp
Symbol table version stamp. This value consists of a major version number and a minor version number, as defined in the stamp.h header file:
Symbol | Value | Description |
MAJ_SYM_STAMP | 3 | Current major object format version |
MIN_SYM_STAMP | 14 | Current minor object format version |
See Section 1.4.5 for a description of object and symbol table versioning.
ilineMax
Number of line number entries (if expanded).
idnMax
Obsolete.
ipdMax
Number of procedure descriptors.
isymMax
Number of local symbols.
ioptMax
Byte size of optimization symbol table.
iauxMax
Number of auxiliary symbols.
issMax
Byte size of local string table.
issExtMax
Byte size of external string table.
ifdMax
Number of file descriptors.
crfd
Number of relative file descriptors.
iextMax
Number of external symbols.
cbLine
Byte size of (packed) line number entries.
cbLineOffset
Byte offset to start of (packed) line numbers.
cbDnOffset
Obsolete.
cbPdOffset
Byte offset to start of procedure descriptors.
cbSymOffset
Byte offset to start of local symbols.
cbOptOffset
Byte offset to start of optimization entries.
cbAuxOffset
Byte offset to start of auxiliary symbols.
cbSsOffset
Byte offset to start of local strings.
cbSsExtOffset
Byte offset to start of external strings.
cbFdOffset
Byte offset to start of file descriptors.
cbRfdOffset
Byte offset to start of relative file descriptors.
cbExtOffset
Byte offset to start of external symbols.
General Notes:
The size and offset fields describing symbol table sections must be set to zero if the section described is not present.
The cb*Offset fields are byte offsets from the beginning of the object file.
The i*Max fields contain the number of entries for a symbol table section, except for ioptMax, issMax, and issExtMax, which contain byte sizes. Legal index values for a symbol table section range from 0 to the value of the associated i*Max field minus one.
For an explanation of packed and expanded line number entries, see the discussion in Section 7.3.1.
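The notes above can be illustrated with a short sketch of the checks a consumer might make before using the symbolic header. The fixed-width stand-ins for the coff_* types are assumptions chosen to match the documented 144-byte size, the helper function names are hypothetical, and the split of vstamp into a high-byte major number and a low-byte minor number is an assumption made for illustration only (Section 1.4.5 is the authority on versioning).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the coff_* types declared in sym.h. The widths are
 * assumptions chosen so that the structure is 144 bytes. */
typedef uint16_t coff_ushort;
typedef int32_t  coff_int;
typedef int64_t  coff_long;
typedef uint64_t coff_off;

typedef struct {
    coff_ushort magic;
    coff_ushort vstamp;
    coff_int    ilineMax, idnMax, ipdMax, isymMax, ioptMax, iauxMax;
    coff_int    issMax, issExtMax, ifdMax, crfd, iextMax;
    coff_long   cbLine;
    coff_off    cbLineOffset, cbDnOffset, cbPdOffset, cbSymOffset;
    coff_off    cbOptOffset, cbAuxOffset, cbSsOffset, cbSsExtOffset;
    coff_off    cbFdOffset, cbRfdOffset, cbExtOffset;
} HDRR;

#define magicSym 0x1992  /* value restated from the magic field description */

_Static_assert(sizeof(HDRR) == 144, "HDRR is documented as 144 bytes");

/* Nonzero if the header carries the expected magic number. */
int hdrr_is_valid(const HDRR *h)
{
    assert(h != NULL);
    return h->magic == magicSym;
}

/* A section that is not present has both its offset and size set to zero. */
int section_present(coff_off cb_offset, coff_long size)
{
    return cb_offset != 0 && size != 0;
}

/* Legal indices run from 0 to the associated i*Max value minus one. */
int index_in_range(coff_int index, coff_int i_max)
{
    return index >= 0 && index < i_max;
}

/* ASSUMPTION: major version in the high byte, minor in the low byte. */
int vstamp_major(coff_ushort vstamp) { return vstamp >> 8; }
int vstamp_minor(coff_ushort vstamp) { return vstamp & 0xff; }
```

A consumer would typically call hdrr_is_valid first, then test section_present for each section it needs (for example, with cbOptOffset and ioptMax) before applying index_in_range to any index read from the file.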
6.2.2 Relative File Descriptor Entry (RFDT)
The relative file descriptor table provides a post-link mapping of file descriptor indices. The purpose of this table is to minimize work for the linker, which does not update symbol table references to local symbols. This information is used to obtain the file offset used to bias local symbol indices. Because this table is also known as the File Indirect Table, two declarations are included in the sym.h header file, as shown here.
typedef int RFDT, *pRFDT;
typedef int FIT, *pFIT;
SIZE - 4 bytes, ALIGNMENT - 4 bytes
See Section 6.3.2 for related information.
6.2.3 Optimization Symbol Entry (PPODHDR)
The optimization symbol table contains information for optimized debugging, basic block profiling, and other miscellaneous procedure-specific data. Each procedure's associated optimization symbol table data begins with an array of PPODHDR structures. See Section 6.3.3 for a description of the optimization symbol table.
Version Note The following structure definition is for Tru64 UNIX V5.0 and greater. It is used for symbol table format V3.13 and greater.
typedef struct {
    coff_uint  ppode_tag;
    coff_uint  ppode_len;
    coff_ulong ppode_val;
} PPODHDR, *pPPODHDR;
SIZE - 16 bytes, ALIGNMENT - 8 bytes
Optimization Symbol Entry Fields
ppode_tag
Identifies the kind of data described by this entry.
ppode_len
Indicates the size in bytes of the data that is found in the raw data area for this entry. When this field is zero, the only data is stored in the ppode_val field.
ppode_val
This field is either a pointer to the entry's data or is itself the data. If ppode_len is nonzero, this field is a relative file offset from the beginning of the current PPOD (Per-Procedure Optimization Descriptor) to the applicable data area. If ppode_len is zero, this field contains the data for the entry.
A PPOD contains multiple PPODHDRs. A PPODHDR and its associated data are collectively referred to as a PPODE (Per-Procedure Optimization Descriptor Entry). Figure 6-4 in Section 6.3.3 shows several PPODs with multiple PPODHDRs and their data.
Table 6-1: Optimization Tag Values
Name | Value | Description |
PPODE_STAMP | 1 | Version number of the PPOD stored in ppode_val. The current PPOD_VERSION value is 1. |
PPODE_END | 2 | End of entries for this PPOD. |
PPODE_EXT_SRC | 3 | Extended source line information. See Section 7.3.1.2. |
PPODE_SEM_EVENT | 4 | Semantic event information. See Section 12.3.1. |
PPODE_SPLIT | 5 | Split lifetime information. See Section 12.3.2. |
PPODE_DISCONTIG_SCOPE | 6 | Discontiguous scope information. See Section 12.3.3. |
PPODE_INLINED_CALL | 7 | Inlined procedure call information. (Reserved for future use.) |
PPODE_PROFILE_INFO | 8 | Profile feedback information. See Chapter 9. |
PPODE_WHERE_INLINED | 9 | (V5.1A - ) Procedure inlining site information. (Reserved for future use.) |
PPODE_ANNOT_RESERVED_FIRST | 64 | (V5.1B - ) First object annotation tag. (See Chapter 10.) |
PPODE_ANNOT_SUMMARY | 64 | (V5.1B - ) Object annotation summary. (See Section 10.3.1.1.) |
PPODE_ANNOT_RESTRICTED_FIRST | 65 | (V5.1B - ) First restrictive annotation tag. (See Section 10.3.) |
PPODE_ANNOT_RESTRICTED_OFFSET | 65 | (V5.1B - ) Restricted offset annotation. (See Section 10.3.1.2.) |
PPODE_ANNOT_RESTRICTED_INSTRUCTION | 66 | (V5.1B - ) Restricted instruction annotation. (See Section 10.3.1.3.) |
PPODE_ANNOT_RESTRICTED_SEQUENCE | 67 | (V5.1B - ) Restricted instruction sequence annotation. (See Section 10.3.1.4.) |
PPODE_ANNOT_RESTRICTED_CALL | 68 | (V5.1B - ) Restricted call annotation. (See Section 10.3.1.5.) |
PPODE_ANNOT_RESTRICTED_ENTRY | 69 | (V5.1B - ) Restricted entry annotation. (See Section 10.3.1.6.) |
PPODE_ANNOT_RESTRICTED_RETURN | 70 | (V5.1B - ) Restricted return annotation. (See Section 10.3.1.7.) |
PPODE_ANNOT_RESTRICTED_LAST | 95 | (V5.1B - ) Last restrictive annotation tag. (See Section 10.3.) |
PPODE_ANNOT_OPTIMIZATION_FIRST | 96 | (V5.1B - ) First optimization enabling annotation. (See Section 10.3.) |
PPODE_ANNOT_GPREL32_JUMP_TABLE | 96 | (V5.1B - ) Jump table annotation. (See Section 10.3.1.8.) |
PPODE_ANNOT_CALL_SPECIFIED_LINKAGE | 97 | (V5.1B - ) Call specified linkage annotation. (See Section 10.3.1.9.) |
PPODE_ANNOT_ENTRY_SPECIFIED_LINKAGE | 98 | (V5.1B - ) Entry specified linkage annotation. (See Section 10.3.1.10.) |
PPODE_ANNOT_ENTRY_UTILIZED_LINKAGE | 99 | (V5.1B - ) Entry utilized linkage annotation. (See Section 10.3.1.11.) |
PPODE_ANNOT_ENTRY_IMPLEMENTED_LINKAGE | 100 | (V5.1B - ) Entry implemented linkage annotation. (See Section 10.3.1.12.) |
PPODE_ANNOT_RETURN_SPECIFIED_LINKAGE | 101 | (V5.1B - ) Return specified linkage annotation. (See Section 10.3.1.13.) |
PPODE_ANNOT_OPTIMIZATION_LAST | 127 | (V5.1B - ) Last optimization enabling annotation. (See Section 10.3.) |
PPODE_ANNOT_RESERVED_LAST | 127 | (V5.1B - ) Last object annotation tag. (See Chapter 10.) |
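The tag, length, and value fields described in Section 6.2.3 lend themselves to a simple scan. The following is a minimal sketch, not the system's implementation: it assumes the PPOD has been read into memory with its PPODHDR array at the start, uses fixed-width stand-ins for coff_uint and coff_ulong, and introduces the hypothetical helpers ppod_find and ppode_raw_data and the hypothetical max_entries guard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the PPODHDR declaration; field widths are assumptions
 * chosen to match the documented 16-byte size. */
typedef struct {
    uint32_t ppode_tag;
    uint32_t ppode_len;
    uint64_t ppode_val;
} PPODHDR;

_Static_assert(sizeof(PPODHDR) == 16, "PPODHDR is documented as 16 bytes");

/* A few tag values restated from Table 6-1. */
#define PPODE_STAMP   1
#define PPODE_END     2
#define PPODE_EXT_SRC 3

/* Scans the header array of one PPOD for the first entry with the given
 * tag. The scan stops at PPODE_END or after max_entries headers, whichever
 * comes first. Returns NULL if the tag is not found. */
const PPODHDR *ppod_find(const PPODHDR *ppod, size_t max_entries, uint32_t tag)
{
    assert(ppod != NULL);
    for (size_t i = 0; i < max_entries; i++) {
        if (ppod[i].ppode_tag == tag)
            return &ppod[i];
        if (ppod[i].ppode_tag == PPODE_END)
            break;
    }
    return NULL;
}

/* Returns the entry's raw data, located ppode_val bytes from the start of
 * the PPOD, or NULL when ppode_len is zero and ppode_val is itself the data. */
const void *ppode_raw_data(const PPODHDR *ppod, const PPODHDR *entry)
{
    assert(ppod != NULL && entry != NULL);
    if (entry->ppode_len == 0)
        return NULL;
    return (const unsigned char *)ppod + entry->ppode_val;
}
```

A caller would test the result of ppode_raw_data: a NULL return means the data should be read directly from ppode_val, as is the case for a PPODE_STAMP entry.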
6.3 Symbol Table Usage
6.3.1 Levels of Symbolic Information
Different levels of symbolic information can be stored with an object file. Compilers often provide options that allow the user to choose the desired level of symbolic information for their program. This choice may be influenced by size considerations and debugging needs. A trade-off exists between the benefit of saving space in the object file and the amount of information available to tools that consume symbolic information.
It is also possible to change the amount of symbolic information present in a program that has already been compiled and linked. Information can be added or deleted. Two of the most common and useful operations are locally stripping and fully stripping the symbol tables in executable files. Tools that modify linked executables, such as instrumentation tools and code optimizers, may rewrite parts of the symbol table to reflect changes that they made.
6.3.1.1 Compilation Levels
The representation of symbolic information supported by compilers can be broken down into four levels:
Minimal: Only information required for linking
Limited: Source file and line number information for profiling and limited debugging (stack-tracing)
Full: Complete debugging information for non-optimized code
Optimized: Debugging information for optimized code
These levels correspond to the system compiler switches -g0 (minimal), -g1 (limited), -g2 (full), and -g3 (optimized). Table 6-2 shows the symbol table sections that are produced by system compilers at each compilation level.
Table 6-2: Symbol Table Sections Produced at Various Compilation Levels
Symbol Table Section | Minimal | Limited | Full | Optimized |
Symbolic header | Yes | Yes | Yes | Yes |
File Descriptors | Yes | Yes | Yes | Yes |
External Symbols | Yes | Yes | Yes | Yes |
External Strings | Yes | Yes | Yes | Yes |
Procedure Descriptors | Yes | Yes | Yes | Yes |
Line Numbers | No | Yes | Yes | Yes |
Relative File Descriptors | No | No | Yes | Yes |
Optimization Symbols | No | Partial | Yes | Yes |
Local Symbols | No | Partial | Yes | Yes |
Local Strings | No | Partial | Yes | Yes |
Auxiliary Symbols | No | Partial | Yes | Yes |
The minimal level of symbolic information that may be produced during compilation includes only the symbol information required for the linker to function properly. This includes external symbol information that is needed to perform symbol resolution and relocation.
If the limited level of symbolic information is requested, line number entries are generated, as well as external symbol information and procedure descriptors. In addition, local symbols for procedures (and the corresponding auxiliary symbols, optimization symbols, and local strings) are present. Limited symbolic information is sufficient to meet the needs of profiling tools. The information present at this level is a subset of that required for full debugger support.
If full symbolic information is included, all symbol table sections are produced in full. This level enables full debugging support with complete type descriptions for local and external symbols. Optimization is disabled.
Optimized symbolic information is designed to balance the aims of performance and debugging capabilities. This level supplies the same information as the full debugging option, but it also allows all compiler optimizations. As a result, some of the correlation is lost between the source code and the executable program.
On Tru64 UNIX systems, users can choose to compile their programs with any one of the four levels of symbolic information. The options -g0, -g1, and -g2 specify increasing levels of symbolic information. The system compiler's default is to produce the minimal level (-g0). Debugging of optimized code (-g3) is supported by the ladebug debugger.
6.3.1.2 Locally Stripped Images
Objects can be produced with only global symbolic information stored in the symbol table. Selection of the -x option causes the linker to create a locally-stripped object. Reasons for stripping local symbolic information include reducing file size and limiting the amount of symbolic information available to end users of an application.
A locally-stripped object is very similar to an object produced with minimal symbolic information (see Section 6.3.1.1). The difference is the consolidation of file descriptors, which the linker does only for locally-stripped objects.
In a locally-stripped image, the file descriptors are included solely for the purpose of identifying source file languages. One file descriptor is present for each source language involved in the compilation. These file descriptors will have their adr field set to addressNil, indicating that the file descriptors cannot be used to identify text addresses.
Version Note The preceding use of addressNil is supported in symbol table format V3.13 and greater. In symbol table formats less than V3.13, the file descriptor adr value should be ignored.
The procedure descriptor table is present in full but is rearranged to group procedures by source language. All procedure descriptors for procedures written in a particular source language are thus contiguous, and they reflect the file descriptor's information.
External symbols are also present in a locally-stripped image. The file indices (ifd field) of the external symbols are updated to identify the generic file descriptor for the appropriate source language. The index fields are set to zero to indicate that no type information is available. External symbols with the storage class scNil are removed. These are debugging symbols that are not normally produced for minimal symbol tables.
Limited debugging is possible with locally-stripped objects. Because the procedure descriptors are retained, stack traces are possible. External symbol information can also be viewed, and language-dependent handling of symbols (for example, C++ name demangling) is preserved.
A linked executable file can be locally stripped at any time after its creation using the command ostrip -x. The output is the same as described above. This operation may also alter the raw data of the .comment section. See Chapter 15 for details.
6.3.1.3 (Fully) Stripped Images
Executable files may be fully stripped at any time after creation using either the strip command or the command ostrip -s. Stripping an executable will result in complete removal of the symbol table, including the symbolic header. The file header fields f_symptr and f_nsyms are set to zero to indicate that the file has been stripped.
This operation may also alter the raw data of the .comment section. See Chapter 15 for details.
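A tool can use the file header fields named above to decide whether a symbol table is available before trying to read one. The following is a minimal sketch; FileHeaderView is a hypothetical reduced view that holds only those two fields, and the field widths are assumptions rather than the declared types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduced view of the file header. Only the two fields named
 * in the text are included, and their widths are assumptions. */
typedef struct {
    uint64_t f_symptr;  /* file offset of the symbolic header */
    int32_t  f_nsyms;   /* size of the symbolic header        */
} FileHeaderView;

/* Nonzero if both fields are zero, which marks a fully stripped file. */
int is_fully_stripped(const FileHeaderView *fh)
{
    assert(fh != NULL);
    return fh->f_symptr == 0 && fh->f_nsyms == 0;
}
```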
6.3.2 Source File Merging
Much of the complication of source information stems from the "include" system. When a compilation involves several source files, there may be duplication of the header files included in each source file, or of the source files themselves. To avoid repetition of header file information in the linked object, the linker merges the input objects' included files wherever possible. Compilers mark file descriptors as mergeable or unmergeable. The linker then examines the input file descriptors and performs the merge whenever possible.
The linker considers two file descriptors to be mergeable if all of the following criteria are met:
The file descriptor fMerge bit is set in both (marked as mergeable by compiler).
Files have the same name.
Files are written in the same language.
Files contain the same number of local and auxiliary symbols.
The checksums match, which is the case if either of the following is true:
Neither file's first auxiliary record is a btChecksum.
Both files' first auxiliary records are btChecksum records and they are identical.
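The merge criteria above amount to a conjunction of simple tests, sketched below. This is an illustration under stated assumptions, not the linker's code: FileSummary is a hypothetical structure that gathers the relevant facts about one file descriptor, and apart from fMerge its field names are chosen for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical summary of the data the merge criteria refer to. */
typedef struct {
    bool        fMerge;        /* marked as mergeable by the compiler      */
    const char *name;          /* file name                                */
    int         lang;          /* source language code                     */
    int32_t     csym;          /* number of local symbols                  */
    int32_t     caux;          /* number of auxiliary symbols              */
    bool        has_checksum;  /* first auxiliary record is a btChecksum   */
    uint32_t    checksum;      /* meaningful only if has_checksum is true  */
} FileSummary;

/* Returns true if the two files satisfy all of the merge criteria. */
bool files_mergeable(const FileSummary *a, const FileSummary *b)
{
    assert(a != NULL && b != NULL);
    if (!a->fMerge || !b->fMerge)
        return false;
    if (strcmp(a->name, b->name) != 0)
        return false;
    if (a->lang != b->lang)
        return false;
    if (a->csym != b->csym || a->caux != b->caux)
        return false;
    /* Checksums match only if neither file has one, or both have the same one. */
    if (a->has_checksum != b->has_checksum)
        return false;
    return !a->has_checksum || a->checksum == b->checksum;
}
```

Note that a file with a checksum never merges with one that lacks a checksum, because neither of the two checksum conditions holds in that case.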
C++ header files may be divided into separate entries for the mergeable and unmergeable parts of the headers. When this occurs, the unmergeable portion is entered as a file with a mangled name as described in Section 13.3.3.
The role of the relative file descriptor (RFD) tables is to track file-relative information after merging. A relative file descriptor table entry maps the index of each file at compile time to its index after linking. After linking, local or auxiliary symbols must be accessed through the RFD table to obtain the updated file descriptor index. This mechanism is necessary because the indices in the local symbol table are not updated when files are merged.
Figure 6-3 is an example of the use of the relative file descriptor table.
Figure 6-3: Relative File Descriptor Table Example
For a symbol reference composed of a file index and symbol index (offset within file), the relative file descriptor table is used as follows:
Look up the given file index in the RFD table to get the updated file index.
Look up the updated file index in the (merged) file descriptor table to get the base of symbols for that file.
Add the symbol index to the file's symbol base to access the symbol entry.
See Section 11.3.2.3 for the representation of relative indices in the auxiliary symbol table.
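The three lookup steps for a symbol reference can be sketched as follows. The RFDT typedef follows the declaration in Section 6.2.2; FileDescView, its isymBase and csym fields, and the function resolve_local_symbol are hypothetical names used for illustration, and the range checks are an addition for safety rather than part of the described procedure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t RFDT;  /* relative file descriptor entry, 4 bytes */

/* Hypothetical reduced file descriptor: the index of the file's first
 * local symbol and the number of local symbols the file contributes. */
typedef struct {
    int32_t isymBase;
    int32_t csym;
} FileDescView;

/* Maps a (file index, symbol index) reference to an index into the local
 * symbol table. Returns -1 if any index is out of range. */
int32_t resolve_local_symbol(const RFDT *rfd, int32_t crfd,
                             const FileDescView *fdt, int32_t ifdMax,
                             int32_t file_index, int32_t sym_index)
{
    assert(rfd != NULL && fdt != NULL);
    if (file_index < 0 || file_index >= crfd)
        return -1;
    /* Step 1: RFD table gives the updated (post-link) file index. */
    int32_t merged = rfd[file_index];
    if (merged < 0 || merged >= ifdMax)
        return -1;
    /* Step 2: merged file descriptor table gives the file's symbol base. */
    const FileDescView *fd = &fdt[merged];
    if (sym_index < 0 || sym_index >= fd->csym)
        return -1;
    /* Step 3: symbol base plus symbol index locates the symbol entry. */
    return fd->isymBase + sym_index;
}
```

For example, with an RFD table of {0, 2, 1} and a third file descriptor whose symbol base is 8, a reference to file index 1 and symbol index 2 resolves to local symbol 10.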
6.3.3 Optimization Symbols
Version Note Optimization symbols are supported for symbol table format V3.13 and greater.
The optimization symbols section gives individual producers and consumers the ability to communicate information about any aspect of the object file, in any form they choose. New information can be generated at any time with minimal coordination between all producers and consumers.
The optimization section is organized on a per-procedure basis. Each procedure descriptor has a pointer to the optimization symbols in the field PDR.iopt. If no optimization symbols are associated with the procedure, the field contains ioptNil. Otherwise, it contains the index of the first optimization symbol entry for this procedure.
Consumers should access the optimization symbols through the procedure descriptors.
The optimization section is not present in a locally-stripped object.
This section consists of a sequence of zero or more Per-Procedure Optimization Descriptors (PPODs), as shown in Figure 6-4. Each PPOD's internal structure consists of two parts:
A leading sequence of structured entries using a Tag-Length-Value model to describe subsequent raw data. The structure of the PPOD entry can be found in Section 6.2.3.
The raw data area.
Figure 6-4: Optimization Symbols Section
This section has the following alignment requirements:
Octaword (16-byte) alignment of the beginning of the section.
Octaword (16-byte) alignment of the beginning of the raw data area.
Octaword (16-byte) alignment of each PPOD.
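A producer laying out this section needs to round offsets up to the next octaword boundary. The following is a small sketch of that arithmetic; the names OCTAWORD, is_octaword_aligned, and align_octaword are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define OCTAWORD 16u  /* octaword size in bytes */

/* Nonzero if the offset falls on a 16-byte boundary. */
int is_octaword_aligned(uint64_t offset)
{
    return (offset % OCTAWORD) == 0;
}

/* Rounds an offset up to the next 16-byte boundary. */
uint64_t align_octaword(uint64_t offset)
{
    assert(offset <= UINT64_MAX - (OCTAWORD - 1));  /* guard against wraparound */
    return (offset + (OCTAWORD - 1)) & ~(uint64_t)(OCTAWORD - 1);
}
```

Because a PPODHDR is 16 bytes, an array of headers that starts on an octaword boundary also ends on one, so only the end of the raw data area normally needs padding before the next PPOD.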
Object file producers must produce either an empty optimization symbols section or a valid one. An empty one has the symbolic header fields cbOptOffset and ioptMax set to zero. If an optimization section is present, but a particular file does not contribute to it, the file descriptor field copt is set to zero. In this case, all procedure descriptors belonging to the file must have their iopt fields set to ioptNil.
Tools that both read and write object files must consume a valid optimization symbols section (if present in the input file) and produce an equivalent and valid section in their output files. If a tool does not know how to process the section contents, the section must be omitted from the output file. If a tool knows how to process only portions of the optimization symbols, those portions may be modified and the rest should be removed. The linker concatenates input optimization symbols sections into one output section without reading or modifying any of the entries.
The format and flexible nature of this section are similar by design to the .comment section. The structures are the same size and contain the same fields (with different names), and the rules of navigation are the same. The primary difference is that the optimization section contains procedure-specific information, whereas the comment section contains object-specific information.