Program analysis tools are extremely important for computer architects and software engineers. Computer architects use them to test and measure new architectural designs, and software engineers use them to identify critical pieces of code in programs or to examine how well a branch prediction or instruction scheduling algorithm is performing. Program analysis tools are needed for problems ranging from basic block counting to data cache simulation. Although the tools that accomplish these tasks may appear quite different, each can be implemented simply and efficiently through code instrumentation.
Atom provides a flexible code instrumentation interface that is capable of building a wide variety of tools. Atom separates the common part in all problems from the problem-specific part by providing machinery for instrumentation and object-code manipulation, and allowing the tool designer to specify what points in a program are to be instrumented. Atom is independent of any compiler and language because it operates on object modules that make up a complete program.
This chapter discusses the following topics:
How to run installed Atom tools and new Atom tools that are still under development (Section 9.1)
How to develop specialized Atom tools (Section 9.2)
The following sections describe how to:
Use installed Atom tools (Section 9.1.1)
Test Atom tools under development (Section 9.1.2)
The Tru64 UNIX operating system provides a number of example
Atom tools, listed in
Table 9-1, to help you develop
your own custom-designed Atom tools.
These tools are distributed in source
form to illustrate Atom's programming interfaces; they are not intended
for production use.
Section 9.2
describes some of the tools
in more detail.
Table 9-1: Example Prepackaged Atom Tools
| Tool | Description |
| branch | Instruments all conditional branches to determine how many are predicted correctly. |
| cache | Determines the cache miss rate if an application runs in an 8-KB direct-mapped cache. |
| dtb | Determines the number of dtb (data translation buffer) misses if the application uses 8-KB pages and a fully associative translation buffer. |
| dyninst | Provides fundamental dynamic counts of instructions, loads, stores, blocks, and procedures. |
| inline | Identifies potential candidates for inlining. |
| iprof | Prints the number of times each procedure is called as well as the number of instructions executed by each procedure. |
| malloc | Records each call to the malloc function and prints a summary of the application's allocated memory. |
| prof | Prints the number of instructions executed by each procedure in pthread programs. |
| ptrace | Prints the name of each procedure as it is called. |
| replace | Calls a replaced entry point in the application from an analysis routine. |
| trace | Generates an address trace, logs the effective address of every load and store operation, and logs the address of the start of every basic block as it is executed. |
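To give a feel for what the analysis side of a tool such as cache might do, the following is a minimal standalone sketch of a direct-mapped cache model. The 8-KB size comes from Table 9-1; the 32-byte line size and the names Reference and MissRate are assumptions made for illustration and are not taken from the shipped example.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_SIZE 8192UL                 /* 8 KB, as in Table 9-1 */
#define LINE_SIZE  32UL                   /* assumed line size */
#define NUM_LINES  (CACHE_SIZE / LINE_SIZE)

static unsigned long tags[NUM_LINES];     /* tag stored in each cache line */
static int valid[NUM_LINES];              /* nonzero once a line is filled */
static unsigned long references, misses;

/* Hypothetical analysis routine: called with the effective address
   of each load or store in the instrumented program. */
void Reference(unsigned long addr)
{
    unsigned long line = addr / LINE_SIZE;
    unsigned long index = line % NUM_LINES;   /* direct-mapped: one slot */
    unsigned long tag = line / NUM_LINES;

    references++;
    if (!valid[index] || tags[index] != tag) {
        misses++;                             /* fill the line on a miss */
        valid[index] = 1;
        tags[index] = tag;
    }
}

/* Fraction of references that missed the modeled cache. */
double MissRate(void)
{
    assert(references > 0);
    return (double)misses / (double)references;
}
```

Two addresses that are exactly 8192 bytes apart map to the same line in this model, so alternating between them misses every time.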
The example tools can be found in the
/usr/lib/cmplrs/atom/examples
directory.
Each one has three files:
An instrumentation file: a C source file that uses Atom's API to modify application programs such that additional routines provided by the tool are invoked at particular times during program execution.
An analysis file: a C source file that contains the routines that are invoked by the modified program when it is executed. These analysis routines can collect the run-time data that the tool reports.
A description file (toolname.desc): a text file that tells Atom the names of the tool's
instrumentation and analysis files, along with any options that Atom should
use when running the tool.
Atom tools that are put into production use or that are delivered to
customers as products usually have
.o
object modules installed
instead of their proprietary sources.
The Tru64 UNIX
hiprof(1), pixie(1), and third(1) tools are installed in this form in the /usr/lib/cmplrs/atom/tools directory.
To run an installed Atom tool or example on an application program,
use the following form of the
atom(1) command:
atom application_program -tool toolname [-env environment] [options...]
This form of the
atom
command requires the
-tool
option and accepts the
-env
option.
The
-tool
option identifies the installed Atom
tool to be used.
By default, Atom searches for installed tools in the
/usr/lib/cmplrs/atom/tools
and
/usr/lib/cmplrs/atom/examples
directories.
You can add directories to the search path by supplying
a colon-separated list of additional directories to the
ATOMTOOLPATH
environment variable.
The
-env
option indicates that an alternative
version of the tool is desired.
For example, some Tru64 UNIX tools require
-env threads
to run the thread-safe version.
When the -env option is specified, the
atom(1) command searches for a toolname.environment.desc
file
instead of the default
toolname.desc
file.
It prints an error message if a description file for the specified environment
cannot be found.
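The mapping from the -tool and -env values to a description file name can be sketched as follows. This is an illustration of the naming rule only, not Atom's own lookup code; the function name DescFileName and the tool name mytool used in the usage note are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the description file name for a tool and an optional environment:
   "tool.desc" or "tool.environment.desc". Returns 0 on success or -1 if
   the name does not fit in the buffer. */
int DescFileName(char *buf, size_t size, const char *tool, const char *env)
{
    int n;

    assert(buf != NULL && tool != NULL);
    if (env != NULL && env[0] != '\0')
        n = snprintf(buf, size, "%s.%s.desc", tool, env);   /* -env given */
    else
        n = snprintf(buf, size, "%s.desc", tool);           /* default */
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With these assumptions, -tool mytool -env threads corresponds to mytool.threads.desc, and -tool mytool alone corresponds to mytool.desc.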
9.1.2 Testing Tools Under Development
A second form of the
atom(1) command is used to test tools that are under development:
atom application_program instrumentation_file [analysis_file] [options...]
This form of the command requires the instrumentation_file parameter and accepts the analysis_file parameter, but not the -tool or -env options.
The
instrumentation_file
parameter specifies
the name of a C source file or an object module that contains the Atom tool's
instrumentation procedures.
If the instrumentation procedures are in more
than one file, the
.o
of each file may be linked together
into one file using the
ld
command with a
-r
option.
By convention, most instrumentation files have the suffix
.inst.c
or
.inst.o.
If you pass an object module for this parameter, consider compiling the module with either the -g1 or -g option. If there are errors in your instrumentation procedures, Atom can issue more complete diagnostic messages when the instrumentation procedures are compiled this way.
The
analysis_file
parameter specifies the
name of a C source file or an object module that contains the Atom tool's
analysis procedures.
If the analysis routines are in more than one file, the
.o
of each file may be linked together into one file using the
ld
command with a
-r
option.
Note that you do not
need to specify an analysis file if the instrumentation file does not add
calls to analysis procedures to the application it instruments.
By convention, most
analysis files have the suffix
.anal.c
or
.anal.o.
Analysis routines may perform better if they are compiled as a single compilation unit.
You can have multiple instrumentation and analysis source files. The following example creates composite instrumentation and analysis objects from several source files:
% cc -c file1.c file2.c
% cc -c file7.c file8.c
% ld -r -o tool.inst.o file1.o file2.o
% ld -r -o tool.anal.o file7.o file8.o
% atom hello tool.inst.o tool.anal.o -o hello.atom
Note
You can also write analysis procedures in C++. You must assign a type of
extern "C" to each procedure to allow it to be called from the application. You must also compile and link the analysis files before entering the atom command. For example:
% cxx -c tool.a.C
% ld -r -o tool.anal.o tool.a.o -lcxx -lexc
% atom hello tool.inst.c tool.anal.o -o hello.atom
With the exception of the
-tool
and
-env
options, both forms of the
atom
command
accept any of the remaining options described in
atom(1), including the following:
Causes Atom to optimize calls to analysis routines by reducing the number of registers that need to be saved and restored. For some tools, specifying this option increases the performance of the instrumented application by a factor of two (at the expense of some increase in application size). The default behavior is for Atom not to apply these optimizations.
Instruments all statically loaded shared libraries in the shared executable.
-debug
Lets you debug instrumentation
routines with the
dbx
debugger.
Atom transfers control
to the symbolic debugger at the start of the instrumentation routine.
In the
following example, the
ptrace
sample tool is run under
the
dbx
debugger.
The instrumentation is stopped at line
12, and the procedure name is printed.
% atom hello ptrace.inst.c ptrace.anal.c -o hello.ptrace -debug
dbx version 3.11.8
Type 'help' for help.
Stopped in InstrumentAll
(dbx) stop at 12
[4] stop at "/udir/test/scribe/atom.user/tools/ptrace.inst.c":12
(dbx) c
[3] [InstrumentAll:12 ,0x12004dea8] if (name == NULL) name = "UNKNOWN";
(dbx) p name
0x2a391 = "__start"
Lets you debug instrumentation
routines with the
ladebug
debugger.
Atom transfers control
to
ladebug
at the start of the instrumentation routine.
Use
ladebug
if the instrumentation routines are threaded
or contain C++ code.
See the
Ladebug Debugger Manual
for more information.
-excobj
Excludes the named shared library from instrumentation. You can use the -excobj option more than once to specify several shared libraries.
Specifies that fork support is required. Use this option to avoid deadlocks in multithreaded applications.
-ga (-g)
Produces the instrumented program with debugging information. This option lets you debug analysis routines with a symbolic debugger. The default -A0 option (not -A1) is recommended with -ga (or -g). For example:
% atom hello ptrace.inst.c ptrace.anal.c -o hello.ptrace -ga
% dbx hello.ptrace
dbx version 3.11.8
Type 'help' for help.
(dbx) stop in ProcTrace
[2] stop in ProcTrace
(dbx) r
[2] stopped at [ProcTrace:5 ,0x120005574] fprintf (stderr,"%s\n",name);
(dbx) n
__start
[ProcTrace:6 ,0x120005598] }
Produces the instrumented
program with debugging information.
This enables debugging of analysis and
application routines.
The prefix "_APP_" is
attached to all variable and procedure names in the application.
The default
-A0
option (not
-A1) is recommended when
-gpa
is used.
Produces the instrumented program with debugging information. This option lets you debug application routines with a symbolic debugger.
Produces the instrumented
program with debugging information.
This enables debugging of analysis and
application routines.
The prefix "_ANA_" is
attached to all variable and procedure names in the analysis object.
The default
-A0
option (not
-A1) is recommended when
-gpa
is used.
-heapbase
Changes the base of the analysis heap. Use the -heapbase option if the default analysis heap location conflicts with an address range used by the application program. For the new base, you can choose either a new hex address location, a default 31-bit address location, or the first page after the end of the application's bss segment.
Allows reuse (or incremental instrumentation) of a previously instrumented shared library.
-incobj
Instruments the named shared library. You can use the -incobj option more than once to specify several shared libraries.
Indicates that the temporary files that Atom creates are to be placed in the current working directory and not deleted when instrumentation is complete.
Changes the library directory search order for shared object libraries so that Atom never looks for them in the default library directories. Use this option when the default library directories should not be searched and only the directories specified by -Ldir are to be searched.
-Ldir
Changes the library directory search order for shared object libraries so that Atom searches for them in dir before searching the default library directories. You can specify multiple -Ldir options to specify several directory names.
Produces a list of the starting addresses of the sections in the instrumented executable.
Specifies a name for the executable output file.
-pthread
Specifies that thread-safe support is required. Use this option when instrumenting threaded applications.
Specifies an existing directory to which Atom writes the instrumented shared libraries.
Specifies a filename suffix that is appended to the name of each object when Atom writes the instrumented version.
-toolargs
Passes arguments to
the Atom tool's instrumentation routine.
Atom passes the arguments in the
same way that they are passed to C programs, using the
argc
and
argv
arguments to the
main
program.
For example:
#include <stdio.h>

unsigned InstrumentAll(int argc, char **argv) {
    int i;

    /* Print each argument that Atom passes from the -toolargs option. */
    for (i = 0; i < argc; i++) {
        fprintf(stderr, "argv[%d]: %s\n", i, argv[i]);
    }
    return 0;
}
The following example shows how Atom passes the
-toolargs
arguments:
% atom hello args.inst.c -toolargs="8192 4"
argv[0]: hello
argv[1]: 8192
argv[2]: 4
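Because the arguments arrive as strings, an instrumentation routine usually has to convert them before use. The following helper is a sketch of one way to do that with strtol; the name ToolArgAsLong is hypothetical, and what the values 8192 and 4 mean is left to the tool.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Convert argv[index] to a positive long. Returns 0 on success, or -1 if
   the argument is missing or is not a positive decimal number. */
int ToolArgAsLong(int argc, char **argv, int index, long *out)
{
    char *end;
    long value;

    assert(argv != NULL && out != NULL);
    if (index < 1 || index >= argc)       /* argv[0] is the application name */
        return -1;
    errno = 0;
    value = strtol(argv[index], &end, 10);
    if (errno != 0 || end == argv[index] || *end != '\0' || value <= 0)
        return -1;                        /* overflow, empty, or trailing text */
    *out = value;
    return 0;
}
```

Called with the arguments from the example above, index 1 yields 8192 and index 2 yields 4; an index past the last argument is reported as an error instead of being read.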
Displays each step Atom takes to create the instrumented program.
Displays Atom's version number.
Displays all warning messages, including those that are normally suppressed.
Suppresses warning messages that can be safely ignored. This is the default.
Suppresses warning messages emitted when processing the analysis routines.
Suppresses warning messages about shared library processing errors.
Passes the specified options to the analysis file's link and compilation phases, respectively.
Passes the specified options to the instrumentation file's link and compilation phases, respectively.
The remainder of this chapter describes how to develop Atom tools.
9.2.1 Atom's View of an Application
Atom views an application as a hierarchy of components:
The program, including the executable and all shared libraries.
A collection of objects. An object can be either the main executable or any shared library. An object has its own set of attributes (such as its name) and consists of a collection of procedures.
A collection of procedures, each of which consists of a collection of entry points and a collection of basic blocks.
A collection of basic blocks, each of which consists of a collection of instructions.
A collection of instructions.
Atom tools insert instrumentation points in an application program at procedure, entry point, basic block, or instruction boundaries. For example, basic block counting tools instrument the beginning of each basic block, data cache simulators instrument each load and store instruction, and branch prediction analyzers instrument each conditional branch instruction.
At any instrumentation point, Atom allows a tool to insert a call to
an analysis routine.
The tool can specify that the call be made before or
after an object, procedure, entry point, basic block, or instruction.
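The hierarchy described above can be pictured with a small standalone model. The Model* types below are mock structures invented for this sketch (entry points are omitted for brevity); Atom's own object, procedure, block, and instruction types are reached only through its navigation routines.

```c
#include <assert.h>
#include <stddef.h>

/* Mock types for illustration only; they are not Atom's data structures. */
typedef struct ModelInst  { unsigned int binary; struct ModelInst *next; } ModelInst;
typedef struct ModelBlock { ModelInst *insts; struct ModelBlock *next; } ModelBlock;
typedef struct ModelProc  { const char *name; ModelBlock *blocks; struct ModelProc *next; } ModelProc;
typedef struct ModelObj   { const char *name; ModelProc *procs; struct ModelObj *next; } ModelObj;

typedef struct { long objects, procs, blocks, insts; } Totals;

/* Walk the hierarchy top-down, visiting every component at which a tool
   could place an instrumentation point, and count what is found. */
void CountComponents(const ModelObj *program, Totals *t)
{
    const ModelObj *o;
    const ModelProc *p;
    const ModelBlock *b;
    const ModelInst *i;

    assert(t != NULL);
    t->objects = t->procs = t->blocks = t->insts = 0;
    for (o = program; o != NULL; o = o->next) {
        t->objects++;
        for (p = o->procs; p != NULL; p = p->next) {
            t->procs++;
            for (b = p->blocks; b != NULL; b = b->next) {
                t->blocks++;
                for (i = b->insts; i != NULL; i = i->next)
                    t->insts++;
            }
        }
    }
}
```

A real instrumentation routine has the same nested-loop shape, with the body of each loop deciding whether to add a call at that point.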
9.2.2 Atom Instrumentation Routine
A tool's instrumentation routine contains the code that traverses the application's objects, procedures, entry points, basic blocks, and instructions to locate instrumentation points; adds calls to analysis procedures; and builds the instrumented version of an application.
As described in
atom_instrumentation_routines(5), an instrumentation routine can use one of the following interfaces:
Instrument (int iargc, char **iargv, Obj *obj)
Atom calls the
Instrument
routine for each
object in the application program.
As a result, an
Instrument
routine does not need to use the object navigation routines (such as
GetFirstObj).
Because Atom automatically writes each modified object
before passing the next to the
Instrument
routine, the
Instrument
routine should never call the
BuildObj,
WriteObj, or
ReleaseObj
routine.
When using the
Instrument
interface, you can define an
InstrumentInit
routine to perform tasks required before Atom calls
Instrument
for the first object (such as defining analysis routine prototypes,
adding program level instrumentation calls, and performing global initializations).
You can also define an
InstrumentFini
routine to perform
tasks required after Atom calls
Instrument
for the last
object (such as global cleanup).
InstrumentAll (int iargc, char **iargv)
Atom calls the
InstrumentAll
routine once for the entire
application program, which allows a tool's instrumentation code itself to
determine how to traverse the application's objects.
With this method, there
are no
InstrumentInit
or
InstrumentFini
routines.
An
InstrumentAll
routine must call the Atom object
navigation routines and use the
BuildObj,
WriteObj, or
ReleaseObj
routine to manage the application's
objects.
Regardless of the instrumentation routine interface, Atom passes the
arguments specified in the
-toolargs
option to the routine.
In the case of the
Instrument
interface, Atom also passes
a pointer to the current object.
9.2.3 Atom Instrumentation Interfaces
Atom provides a comprehensive interface for instrumenting applications. The interface supports the following types of activities:
Navigating among a program's objects, procedures, entry points, basic blocks, and instructions. See Section 9.2.3.1.
Building, releasing, and writing objects. See Section 9.2.3.2.
Obtaining information about the different components of an application. See Section 9.2.3.3.
Resolving names and call targets. See Section 9.2.3.4.
Adding calls to analysis routines at desired locations in the program. See Section 9.2.3.5.
Intercepting calls to entry points in a program. See Section 9.2.3.6.
9.2.3.1 Navigating Within a Program
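The Atom application navigation routines, described in
atom_application_navigation(5), allow an Atom tool's instrumentation routine to move among the components of an application as follows: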
The
GetFirstObj,
GetLastObj,
GetNextObj, and
GetPrevObj
routines navigate
among the objects of a program.
For nonshared programs, there is only one
object.
For call-shared programs, the first object corresponds to the main
program.
The remaining objects are each of its dynamically linked shared libraries.
The
GetFirstObjProc
and
GetLastObjProc
routines return a pointer to the first or last procedure, respectively,
in the specified object.
The
GetNextProc
and
GetPrevProc
routines navigate among the procedures of an object.
The
GetFirstEntry
and
GetLastEntry
routines return a pointer to the first or last entry point, respectively,
in the specified procedure.
The
GetNextEntry
and
GetPrevEntry
routines navigate among the entry points of a procedure.
The
GetFirstBlock,
GetLastBlock,
GetNextBlock, and
GetPrevBlock
routines navigate among the basic blocks of a procedure.
The
GetFirstInst,
GetLastInst,
GetNextInst, and
GetPrevInst
routines navigate among the instructions of a basic block.
The
GetInstBranchTarget
routine returns
a pointer to the instruction that is the target of a specified branch instruction.
The
GetProcObj
routine returns a pointer
to the object that contains the specified procedure.
The
GetEntryProc
and
GetBlockProc
routines return a pointer to the procedure that contains the specified
entry point or basic block, respectively.
The
GetEntryBlock
routine returns a pointer
to the first basic block in the specified entry point.
The
GetInstBlock
routine returns a pointer to the basic block that contains the
specified instruction.
The Atom object management routines, described in
atom_object_management(5), allow an InstrumentAll
routine to build, write,
and release objects.
The
BuildObj
routine builds the internal data structures
Atom requires to manipulate the object.
An
InstrumentAll
routine must call the
BuildObj
routine before traversing
the procedures in the object and adding analysis routine calls to the object.
The
WriteObj
routine writes the instrumented version of
the specified object, deallocating the internal data structures the
BuildObj
routine previously created.
The
ReleaseObj
routine deallocates the internal data structures for the given object, but
it does not write out the instrumented version of the object.
The
IsObjBuilt
routine returns a nonzero value if
the specified object has been built with the
BuildObj
routine
but not yet written with the
WriteObj
routine or unbuilt
with the
ReleaseObj
routine.
9.2.3.3 Obtaining Information About an Application's Components
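The Atom application query routines, described in
atom_application_query(5), allow an instrumentation routine to obtain static information about a program and its objects, procedures, entry points, basic blocks, and instructions.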
Table 9-2
lists the routines that provide information
about a program.
Table 9-2: Atom Program Query Routines
| Routine | Description |
| GetAnalName | Returns the name of the analysis file, as passed to the atom command. This routine is useful for tools that have a single instrumentation file and multiple analysis files. |
| GetErrantShlibName | Returns the name of the shared library that Atom was unable to process. |
| GetErrantShlibErr | Returns an error code to explain why Atom was unable to process the shared library. |
| GetProgInfo | Returns the number of objects in a program, the number of objects that the tool writer requested to be instrumented, or the number of shared libraries that Atom was unable to process. |
Table 9-3
lists the routines that provide information
about a program's objects.
Table 9-3: Atom Object Query Routines
| Routine | Description |
| GetObjInfo | Returns information about an object's text, data, and bss segments; the number of procedures, entry points, basic blocks, or instructions it contains; its object ID; information about how the object was linked; or a Boolean hint as to whether the given object should be excluded from instrumentation. |
| GetObjInstArray | Returns an array consisting of the 32-bit instructions included in the object. |
| GetObjInstCount | Returns the number of instructions in the array returned by the GetObjInstArray routine. |
| GetObjName | Returns the original file name of the specified object. |
| GetObjOutName | Returns the name of the instrumented object. |
The following instrumentation routine, which prints statistics about the program's objects, demonstrates the use of Atom object query routines:
#include <stdio.h>
#include <cmplrs/atom.inst.h>

unsigned InstrumentAll(int argc, char **argv)
{
    Obj *o;
    const unsigned int *textSection;
    long textStart;

    /* Visit the main executable and each shared library in turn. */
    for (o = GetFirstObj(); o != NULL; o = GetNextObj(o)) {
        BuildObj(o);    /* build Atom's internal data structures */
        textSection = GetObjInstArray(o);
        textStart = GetObjInfo(o, ObjTextStartAddress);
        printf("Object %ld\n", GetObjInfo(o, ObjID));
        printf(" Object Name: %s\n", GetObjName(o));
        printf(" Start Address: 0x%lx\n", textStart);
        printf(" Text Size: %ld\n", GetObjInfo(o, ObjTextSize));
        printf(" Second instruction: 0x%x\n", textSection[1]);
        ReleaseObj(o);  /* nothing was added, so release without writing */
    }
    return 0;
}
Because the instrumentation routine adds no procedures to the executable, there is no need for an analysis procedure. The following example demonstrates the process of compiling a program and running atom on it with this tool. During instrumentation, the tool prints each object's identifier, its name, the compile-time starting address of the text segment, the size of the text segment, and the binary for the second instruction. The disassembler provides a convenient method for finding the corresponding instructions.
% cc hello.c -o hello
% atom hello info.inst.c -o hello.info
Object 0
 Object Name: hello
 Start Address: 0x120000000
 Text Size: 8192
 Second instruction: 0x239f001d
Object 1
 Object Name: /usr/shlib/libc.so
 Start Address: 0x3ff80080000
 Text Size: 901120
 Second instruction: 0x239f09cb
% dis hello | head -3
 0x120000fe0: a77d8010 ldq t12, -32752(gp)
 0x120000fe4: 239f001d lda at, 29(zero)
 0x120000fe8: 279c0000 ldah at, 0(at)
% dis /usr/shlib/libc.so | head -3
 0x3ff800bd9b0: a77d8010 ldq t12,-32752(gp)
 0x3ff800bd9b4: 239f09cb lda at,2507(zero)
 0x3ff800bd9b8: 279c0000 ldah at, 0(at)
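The link between the printed instruction words and the disassembly can be checked by hand. The sketch below decodes the fields of an Alpha memory-format instruction (6-bit opcode, two 5-bit register fields, 16-bit signed displacement); the type and function names are hypothetical and the code is independent of Atom.

```c
#include <stdint.h>

/* Fields of an Alpha memory-format instruction. */
typedef struct {
    unsigned opcode;    /* bits 31..26 */
    unsigned ra;        /* bits 25..21 */
    unsigned rb;        /* bits 20..16 */
    int displacement;   /* bits 15..0, sign-extended */
} MemFormat;

/* Split a 32-bit instruction word into its memory-format fields. */
MemFormat DecodeMemFormat(uint32_t word)
{
    MemFormat m;
    uint32_t disp = word & 0xffffu;

    m.opcode = (word >> 26) & 0x3fu;
    m.ra = (word >> 21) & 0x1fu;
    m.rb = (word >> 16) & 0x1fu;
    /* Sign-extend the 16-bit displacement without relying on casts. */
    m.displacement = (disp & 0x8000u) ? (int)disp - 0x10000 : (int)disp;
    return m;
}
```

For 0x239f001d this gives opcode 0x08 (lda), register 28 (at), register 31 (zero), and displacement 29, which agrees with the line "lda at, 29(zero)" in the disassembly.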
Table 9-4
lists the routines that provide information
about an object's procedures.
Table 9-4: Atom Procedure Query Routines
| Routine | Description |
| GetProcInfo | Returns information pertaining to the procedure's stack frame, register-saving, register-usage, and prologue characteristics as defined in the Calling Standard for Alpha Systems manual and the Assembly Language Programmer's Guide. Such values are important to tools, like Third Degree, that monitor the stack for access to uninitialized variables. It can also return information about the procedure such as the number of entry points, basic blocks, or instructions it contains; its procedure ID and type of symbol resolution; its lowest or highest source line number; procedure offset to the first instruction; procedure offset to the instruction that modifies the stack pointer; whether it has any alternate entry points; whether it has any interprocedural branches or jumps; and whether its address has been taken. |
| ProcGP | Returns the global pointer (GP) of the procedure. |
| ProcFileName | Returns the name of the source file that contains the procedure. |
| ProcName | Returns the procedure's name. |
| ProcPC | Returns the compile-time program counter (PC) of the first instruction in the procedure. |
Table 9-5
lists the routines that provide
information about an object's main and alternate entry points.
Table 9-5: Atom Entry Point Query Routines
| Routine | Description |
| EntryName | Returns the entry point's name. |
| EntryPC | Returns the compile-time program counter (PC) of the first instruction of the entry point. |
| GetEntryInfo | Returns information about the entry point such as whether it is redundant. |
| GetEntryProc | Returns the enclosing procedure of the entry point. |
Table 9-6
lists the routines that provide information
about a procedure's basic blocks.
Table 9-6: Atom Basic Block Query Routines
| Routine | Description |
| BlockPC | Returns the compile-time program counter (PC) of the first instruction in the basic block. |
| GetBlockInfo | Returns the number of instructions in the basic block or the block ID. |
| IsBranchTarget | Indicates if the block is the target of a branch instruction. |
Table 9-7
lists the routines that provide information
about a basic block's instructions.
Table 9-7: Atom Instruction Query Routines
| Routine | Description |
| GetInstBinary | Returns a 32-bit binary representation of the assembly language instruction. |
| GetInstClass | Returns the instruction class (for example, floating-point load or integer store) as defined by the Alpha Architecture Reference Manual. |
| GetInstInfo | Parses the entire 32-bit instruction and obtains all or a portion of that instruction. |
| GetInstRegEnum | Returns the register type (floating-point or integer) from an instruction field as returned by the GetInstInfo routine. |
| GetInstRegUsage | Returns a bit mask with one bit set for each possible source register and one bit set for each possible destination register. |
| InstLineNo | Returns the instruction's source line number. |
| InstPC | Returns the compile-time program counter (PC) of the instruction. |
| IsInstType | Indicates whether the instruction is of the specified type (load instruction, store instruction, conditional branch, or unconditional branch). |
9.2.3.4 Resolving Names and Call Targets
Atom's symbol resolution routines, described in atom_application_symbols(5), allow an Atom tool's instrumentation routine to find and get application objects, procedures, entry points, and instructions that are either named or targets of a call site:
The
FindObj
and
FindObjDepthFirst
routines search (breadth-first and depth-first, respectively) for
the named entry point in all of the objects and return the object that contains
the named entry point.
The
FindProc
routine returns the procedure
that contains a main or alternate entry point with the specified name in the
given object.
The
FindEntry
routine returns the main
or alternate entry point with the specified name in the given object.
The
FindInst
routine returns the first
instruction at the address of the resolved symbol with the specified name
in the given object.
The
GetTargetName
routine returns the name
of the entry point that is the target of the procedure call.
The
GetTargetObj,
GetTargetProc, and
GetTargetEntry routines return, respectively,
the object, procedure, or entry point that contains the target of the procedure
call.
The
GetTargetInst
routine returns the instruction
to which the specified instruction jumps or branches.
9.2.3.5 Adding Calls to Analysis Routines
The Atom application instrumentation routines, described in
atom_application_instrumentation(5), add calls to analysis procedures at various points in the application as follows:
You must use the
AddCallProto
routine to
specify the prototype of each analysis procedure to be added to the program.
In other words, an
AddCallProto
call must define the procedural
interface for each analysis procedure used in calls to
AddCallProgram,
AddCallObj,
AddCallProc,
AddCallEntry,
AddCallBlock, and
AddCallInst.
Atom provides facilities for passing application and analysis
data into the added procedure as either constants, register contents, address
translation structures, or computed values such as effective addresses and
branch conditions.
Use the
AddCallProgram
routine in an instrumentation
routine to add a call to an analysis procedure before a program starts execution
or after it completes execution.
Typically such an analysis procedure does
something that applies to the whole program, such as opening an output file
or parsing command-line options.
Use the
AddCallObj
routine in an instrumentation
routine to add a call to an analysis procedure before an object starts execution
or after it completes execution.
Typically such an analysis procedure does
something that applies to the single object, such as initializing some data
for its procedures.
Use the
AddCallProc
routine in an instrumentation
routine to add a call to an analysis procedure before a procedure starts execution
or after it completes execution.
Use the
AddCallEntry
routine in an instrumentation
routine to add a call to an analysis procedure before a main or alternate
entry point starts execution.
Use the
AddCallBlock
routine in an instrumentation
routine to add a call to an analysis procedure before a basic block starts
execution or after it completes execution.
Use the
AddCallInst
routine in an instrumentation
routine to add a call to an analysis procedure before a given instruction
executes or after it executes.
Use the
ReplaceProcedure
routine to replace
a procedure in the instrumented program.
For example, the Third Degree Atom
tool replaces memory allocation functions such as
malloc
and
free
with its own versions to allow it to check for
invalid memory accesses and memory leaks.
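On the analysis side, the procedures that these calls target are ordinary C functions. The following sketch shows what the analysis file of a basic block counting tool might contain. The routine names InitCounts, CountBlock, and ExecutionCount are hypothetical, and the matching AddCallProto, AddCallProgram, and AddCallBlock calls in the instrumentation file are not shown.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

static long *counts;    /* one execution counter per basic block */
static int nblocks;     /* number of counters allocated */

/* Intended to run once before the program starts (the kind of call that
   AddCallProgram adds), with the number of instrumented blocks. */
void InitCounts(int n)
{
    assert(n > 0);
    counts = calloc((size_t)n, sizeof *counts);
    if (counts == NULL) {
        perror("calloc");
        exit(1);
    }
    nblocks = n;
}

/* Intended to run before each basic block executes (the kind of call that
   AddCallBlock adds), with that block's index as a constant argument. */
void CountBlock(int id)
{
    assert(id >= 0 && id < nblocks);
    counts[id]++;
}

/* Returns how many times the given block has executed so far. */
long ExecutionCount(int id)
{
    assert(id >= 0 && id < nblocks);
    return counts[id];
}
```

Passing the block index as a constant keeps the per-block analysis call down to a single array increment.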
9.2.3.6 Intercepting Calls to Entry Points
Atom's call interception routines, described in
atom_application_instrumentation(5), allow an analysis routine to be called in place of an entry point in the application as follows:
You must use the
ReplaceProto
routine to
specify the prototype of each replacement analysis routine to be called instead
of the replaced entry point.
In other words, a
ReplaceProto
call must define the procedural interface for each analysis routine referenced
in calls to the
ReplaceEntry
routine.
Atom provides facilities
for passing application and analysis data into the replacement analysis routine
as either constants, register contents, address translation structures, or
computed values such as the run-time address of the replaced entry.
Use the
ReplaceEntry
routine to replace
a main or alternate entry point in the instrumented program with a call to
a replacement analysis routine.
A
ReplaceEntry
call must
specify the entry point to replace, the replacement analysis routine, and
the procedural interface arguments prototyped in
ReplaceProto.
Only the specified entry point is replaced.
Other entry points in the same
application can be replaced by using additional
ReplaceEntry
calls.
The replacement routine performs the analysis and can emulate the replaced
entry point by calling it using a computed
ReplAddrValue
value.
For an example of this emulation, see
Section 9.2.6.
For example, the following pair of
ReplaceProto
and
ReplaceEntry
calls intercepts calls to
memcpy(3) and redirects them to the replacement analysis routine my_memcpy,
passing three application
arguments and two analysis arguments (the return address of the application's
call to
memcpy()
and the replaced entry point's run-time
address):
ReplaceProto ("my_memcpy(VALUE, VALUE, VALUE, REGV, VALUE)");
ReplaceEntry (FindEntry(obj,"memcpy"), /* entry point to replace */
"my_memcpy", /* replacement analysis routine */
ArgValue, /* application argument */
ArgValue, /* application argument */
ArgValue, /* application argument */
REG_RA, /* analysis argument */
ReplAddrValue); /* analysis argument */
The associated replacement analysis routine would be declared with three application and two analysis arguments:
void * my_memcpy (void * s1, const void * s2, size_t n, long call_address,
void * (*memcpy_ptr) (void *,const void *,size_t));
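A body for this declaration might look like the following sketch, which records some statistics and then emulates the replaced entry point by calling through the pointer it was given. The counters and what they record are assumptions for illustration; only the function signature comes from the declaration above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>   /* memcpy, for exercising the routine outside Atom */

static unsigned long calls;          /* number of intercepted calls */
static unsigned long bytes_copied;   /* total bytes requested */
static long last_call_address;       /* return address of the latest caller */

void *my_memcpy(void *s1, const void *s2, size_t n, long call_address,
                void *(*memcpy_ptr)(void *, const void *, size_t))
{
    assert(memcpy_ptr != NULL);

    /* Analysis: record who called and how much was copied. */
    calls++;
    bytes_copied += n;
    last_call_address = call_address;

    /* Emulation: perform the copy with the replaced entry point. */
    return memcpy_ptr(s1, s2, n);
}
```

Outside an instrumented program the routine can be exercised by passing the library memcpy itself as the last argument, which is how the behavior of the sketch can be checked.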
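An Atom tool's description file, as described in
atom_description_file(5), identifies the tool's instrumentation and analysis files and can specify the options to be used by the cc,
ld, and
atom
commands when it is compiled, linked,
and invoked.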
Each Atom tool must supply at least one description file.
There are two types of Atom description file:
A description file providing an environment for generalized use of the tool. A tool can provide only one general-purpose environment. The name of this type of description file has the following format:
tool.desc
A description file providing an environment for use of the tool in specific contexts, such as in a multithreaded application or in kernel mode. A tool can provide several special-purpose environments, each of which has its own description file. The name of this type of description file has the following format:
tool.environment.desc
The names supplied for the
tool
and
environment
portions of these description file names correspond
to values the user specifies to the
-tool
and
-env
options of an
atom
command when invoking
the tool.
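The naming convention above can be sketched in a few lines of C. This is only an illustration of how the two file name formats are composed. The helper desc_file_name() is hypothetical and is not part of Atom, and the environment name "threads" used in the usage note is an assumed example value.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not an Atom routine): build the description
 * file name for a tool and an optional environment.
 * Returns 0 on success, -1 if the name does not fit in buf.
 */
int desc_file_name(const char *tool, const char *env, char *buf, size_t size)
{
    int n;

    assert(tool != NULL && buf != NULL);
    if (env != NULL && strlen(env) > 0)
        n = snprintf(buf, size, "%s.%s.desc", tool, env); /* special-purpose */
    else
        n = snprintf(buf, size, "%s.desc", tool);         /* general-purpose */
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

For example, desc_file_name("iprof", NULL, buf, sizeof buf) produces iprof.desc, and desc_file_name("iprof", "threads", buf, sizeof buf) produces iprof.threads.desc.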
An Atom description file is a text file containing a series of tags
and values.
See
atom_description_file(5)
for details.
9.2.5 Writing Analysis Procedures
An instrumented application calls analysis procedures to perform the specific functions defined by an Atom tool. An analysis procedure can use system calls or library functions, even if the same call or function is instrumented within the application. The routines used by the analysis routine and the instrumented application are physically distinct. The following library routines can and cannot be called by analysis routines:
Standard C Library (libc.a) routines (including
system calls) can be called, except for:
Also, the standard I/O routines have certain differences in behavior, as described in Section 9.2.5.1.
The
pthread_atfork(3)
Math Library (libm.a) routines can be called.
Other routines related to multithreading or exception-handling
should not be called (for example,
pthread(3),
exc_*, and
libmach
routines).
Other routines that assume a particular environment (for example, X and Motif) may not be useful or correct in an Atom analysis environment.
Thread Local Storage (TLS) is not
supported in analysis routines.
9.2.5.1 Input/Output
The standard I/O library provided to analysis routines does not
automatically flush and close streams when the instrumented program terminates,
so the analysis code must flush or close them explicitly when all output has
been completed.
Also, the
stdout
and
stderr
streams that are provided to analysis routines will be closed when the application
calls
exit(), so analysis code may need to duplicate one
or both of these streams if they need to be used after application exit (for
example, in a
ProgramAfter
or
ObjAfter
analysis routine).
See the
prof
tool described in
Section 9.1.1
for an example of how to open an additional
stream for I/O.
For output to
stderr
(or a duplicate of
stderr) to appear immediately, analysis code should call
setbuf(stream,NULL)
to make the stream unbuffered or call
fflush
after each set of
fprintf
calls.
Similarly,
analysis routines using C++ streams can call
cerr.flush().
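The following is a minimal sketch of the stream handling described above, written as ordinary POSIX C so that it can be tried outside Atom. The names tool_err, tool_err_open, and tool_err_report are illustrative assumptions. In a real tool, the first routine might be called from a ProgramBefore analysis routine and the second from a ProgramAfter analysis routine.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

static FILE *tool_err;  /* private duplicate of stderr (illustrative name) */

/*
 * Duplicate the stderr descriptor and make the new stream unbuffered
 * so that each message appears immediately.
 * Returns 0 on success, -1 on failure.
 */
int tool_err_open(void)
{
    int fd = dup(fileno(stderr));

    if (fd < 0)
        return -1;
    tool_err = fdopen(fd, "w");
    if (tool_err == NULL) {
        close(fd);
        return -1;
    }
    setbuf(tool_err, NULL);
    return 0;
}

/*
 * Write a final message and close the stream explicitly, because the
 * analysis code cannot rely on streams being closed automatically.
 * Returns 0 on success, -1 on failure.
 */
int tool_err_report(const char *msg)
{
    int ok;

    assert(tool_err != NULL);
    ok = fprintf(tool_err, "%s\n", msg) >= 0;
    if (fclose(tool_err) != 0)
        ok = 0;
    tool_err = NULL;
    return ok ? 0 : -1;
}
```

Because the duplicate has its own file descriptor, it remains usable after the original stderr stream has been closed.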
9.2.5.2 fork and exec System Calls
If a process calls a
fork
function but does not call
an
exec
function, the process is cloned and the child inherits
an exact copy of the parent's state.
In many cases, an Atom tool expects this
behavior.
For example, an instruction-address tracing tool sees references
for both the parent and the child, interleaved in the order in which the references
occurred.
In the case of an instruction-profiling tool (for example, the
trace
tool referenced in
Table 9-1), the
file is opened at a
ProgramBefore
instrumentation point
and, as a result, the output file descriptor is shared between the parent
and the child processes.
If the results are printed at a
ProgramAfter
instrumentation point, the output file contains the parent's data,
followed by the child's data (assuming that the parent process finishes first).
For tools that count events (for example, the
prof
tool referenced in
Table 9-1), the data structures
that hold the counts should be returned to zero in the child process after
the
fork
call because the events occurred in the parent,
not the child.
This type of Atom tool can support correct handling of
fork
calls by instrumenting the
fork
library
procedure and calling an analysis procedure with the return value of the
fork
routine as an argument.
If the analysis procedure is passed
a return value of 0 (zero) in the argument, it knows that it was called from
a child process.
It can then reset the counts variable or other data structures
so that they tally statistics only for the child process.
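The analysis side of this technique can be sketched as follows. This is a simplified illustration, not code from an Atom tool: the names counts, CountEvent, and AfterFork and the array size are assumptions, and the instrumentation call that passes the return value of fork to AfterFork is not shown.

```c
#include <assert.h>
#include <string.h>

#define NUM_EVENTS 4             /* illustrative number of event types */

static long counts[NUM_EVENTS];  /* per-event tallies kept by the tool */

/* Called for each event that the tool counts. */
void CountEvent(int event)
{
    assert(event >= 0 && event < NUM_EVENTS);
    counts[event]++;
}

/*
 * Called with the return value of fork() as its argument.
 * A value of 0 means the caller is the child process, so the counts
 * inherited from the parent are cleared. Any other value means the
 * caller is the parent (or the fork failed), so the counts are kept.
 */
void AfterFork(long fork_result)
{
    if (fork_result == 0)
        memset(counts, 0, sizeof(counts));
}
```

With this arrangement, the parent's output reports all events up to and after the fork call, and the child's output reports only the events that occurred in the child.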
9.2.6 Calling Application Entry Points That Have Been Replaced
Entry points that have been replaced may be called from the replacement
analysis routine by using the
ReplAddrValue
parameter that
was passed in during the
ReplaceEntry
call.
The
ReplAddrValue
argument contains the address of the replaced entry
point, as described in
Section 9.2.3.6.
This address
provides a way for the replacement analysis routine to emulate the replaced
entry point by simply calling it.
In the following example, the function
memcpy()
is
replaced with the analysis function
my_memcpy(), which
in turn calls the replaced
memcpy()
function.
The following source listing contains the instrumentation code:
#include <string.h>
#include <cmplrs/atom.inst.h>
unsigned InstrumentAll (int argc, char **argv)
{
Xlate * px;
Obj * o;
Entry * e;
/*
* Prototype the replacement routine.
*/
ReplaceProto("my_memcpy(VALUE, VALUE, VALUE, REGV, VALUE)");
/*
* Resolve the object that contains memcpy().
*/
o = FindObj("memcpy");
if (o) {
/*
* Build the object containing memcpy so the memcpy entry point
* can be resolved and replaced.
*/
if (BuildObj(o)) return(1);
/*
* Resolve the memcpy entry point.
*/
e = FindEntry(o, "memcpy");
/*
* Prefix the memcpy entry point with atom-generated code to
* call the analysis routine my_memcpy instead.
*/
ReplaceEntry(e, "my_memcpy",
ArgValue, ArgValue, ArgValue, REG_RA, ReplAddrValue);
/*
* Write the instrumented object.
*/
WriteObj(o);
}
return (0);
}
The following source listing contains the analysis code:
#include <stdio.h>
#include <stdlib.h>
#include <cmplrs/atom.anal.h>
/*
* Replacement routine for memcpy();
*/
void * my_memcpy (void * s1, const void * s2, size_t n, long call_address,
void * (*memcpy_ptr) (void *, const void *, size_t))
{
void * ptr = 0;
/*
* Report the call.
*/
printf ("memcpy called from %lx\n", call_address);
/*
* Call the original memcpy().
*/
ptr = (*memcpy_ptr) (s1, s2, n);
return (ptr);
}
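Because a replacement routine is an ordinary C function, its logic can be exercised outside Atom before the tool is built. The following sketch assumes a hypothetical stand-in, simulate_replaced_call(), that supplies the two analysis arguments Atom would normally provide: a made-up return address in place of REG_RA and the address of the C library memcpy() in place of ReplAddrValue. The calls_seen counter is also an addition for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef void *(*memcpy_fn)(void *, const void *, size_t);

static long calls_seen;  /* illustrative counter, not in the original tool */

/* Replacement routine with the same shape as the listing above. */
void *my_memcpy(void *s1, const void *s2, size_t n, long call_address,
                memcpy_fn memcpy_ptr)
{
    calls_seen++;
    printf("memcpy called from %lx\n", (unsigned long)call_address);
    /* Emulate the replaced entry point by calling it through the pointer. */
    return (*memcpy_ptr)(s1, s2, n);
}

/*
 * Hypothetical stand-in for Atom-generated code: passes a fake return
 * address and the real memcpy() as the replaced entry point.
 */
void *simulate_replaced_call(void *dst, const void *src, size_t n)
{
    long fake_return_address = 0x1000;  /* assumed value for the demo */

    assert(dst != NULL && src != NULL);
    return my_memcpy(dst, src, n, fake_return_address, memcpy);
}
```

A call such as simulate_replaced_call(dst, "atom", 5) copies the string into dst, returns dst, and reports one intercepted call.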
9.2.7 Determining the Instrumented PC from an Analysis Routine
Atom's address translation routines, described in
Xlate(5),
let an analysis routine determine the instrumented PC of selected instructions.
To provide the instruction's address in an address translation buffer,
the instrumentation routine must first call the
CreateXlate
routine.
After the address translation buffer has been created, the instrumentation
routine adds an instruction's address to it by calling the
AddXlateAddress
routine.
Addresses of entry points can also be added to an address
translation buffer by calling the
AddXlateEntry
routine.
An address translation buffer can only hold addresses from a single object.
An Atom tool's instrumentation routine passes an address translation
buffer to an analysis routine by passing it as a parameter of type
XLATE *, as indicated in the analysis routine's prototype definition
in an
AddCallProto
call.
Another way to determine an instrumented PC is to specify a formal parameter
type of
REGV
in an analysis routine's prototype and pass
the
REG_IPC
value.
An Atom tool's analysis routine uses the following analysis interfaces to access an address translation buffer passed to it:
The
XlateNum
routine returns the number
of addresses in the specified address translation buffer.
The
XlateInstTextStart
routine returns
the starting address of the text segment for the instrumented object corresponding
to the specified address translation buffer.
The
XlateInstTextSize
routine returns the
size of the text segment.
The
XlateLoadShift
routine returns the
difference between the run-time addresses in the object corresponding to the
specified address translation buffer and the compile-time addresses.
The
XlateAddr
routine returns the instrumented
run-time address for the instruction in the specified position of the specified
address translation buffer.
Note that the run-time address for an instruction
in a shared library is not necessarily the same as its compile-time address.
The following example demonstrates how the instrumentation and analysis files of a tool use the
Xlate
routines.
This tool prints the target address of every jump
instruction.
To use it, enter the following command:
% atom progname xlate.inst.c xlate.anal.c -all
The following source listing (xlate.inst.c) contains
the instrumentation for the
xlate
tool:
#include <stdlib.h>
#include <stdio.h>
#include <alpha/inst.h>
#include <cmplrs/atom.inst.h>
static void address_add(unsigned long);
static unsigned address_num(void);
static unsigned long * address_paddrs(void);
static void address_free(void);
void InstrumentInit(int iargc, char **iargv)
{
/* Create analysis prototypes. */
AddCallProto("RegisterNumObjs(int)");
AddCallProto("RegisterXlate(int, XLATE *, long[0])");
AddCallProto("JmpLog(long, REGV)");
/* Pass the number of objects to the analysis routines. */
AddCallProgram(ProgramBefore, "RegisterNumObjs",
GetProgInfo(ProgNumberObjects));
}
Instrument(int iargc, char **iargv, Obj *obj)
{
Proc * p;
Block * b;
Inst * i;
Xlate * pxlt;
union alpha_instruction bin;
ProcRes pres;
unsigned long pc;
char proto[128];
/*
* Create an XLATE structure for this Obj. We use this to translate
* instrumented jump target addresses to pure jump target addresses.
*/
pxlt = CreateXlate(obj, XLATE_NOSIZE);
for (p = GetFirstObjProc(obj); p; p = GetNextProc(p)) {
for (b = GetFirstBlock(p); b; b = GetNextBlock(b)) {
/*
* If the first instruction in this basic block has had its
* address taken, it's a potential jump target. Add the
* instruction to the XLATE and keep track of the pure address
* too.
*/
i = GetFirstInst(b);
if (GetInstInfo(i, InstAddrTaken)) {
AddXlateAddress(pxlt, i);
address_add(InstPC(i));
}
for (; i; i = GetNextInst(i)) {
bin.word = GetInstInfo(i, InstBinary);
if (bin.common.opcode == op_jsr &&
bin.j_format.function == jsr_jmp)
{
/*
* This is a jump instruction. Instrument it.
*/
AddCallInst(i, InstBefore, "JmpLog", InstPC(i),
GetInstInfo(i, InstRB));
}
}
}
}
/*
* Re-prototype the RegisterXlate() analysis routine now that we
* know the size of the pure address array.
*/
sprintf(proto, "RegisterXlate(int, XLATE *, long[%d])", address_num());
AddCallProto(proto);
/*
* Pass the XLATE and the pure address array to this object.
*/
AddCallObj(obj, ObjBefore, "RegisterXlate", GetObjInfo(obj, ObjID),
pxlt, address_paddrs());
/*
* Deallocate the pure address array.
*/
address_free();
}
/*
** Maintains a dynamic array of pure addresses.
*/
static unsigned long * pAddrs;
static unsigned maxAddrs = 0;
static unsigned nAddrs = 0;
/*
** Add an address to the array.
*/
static void address_add(
unsigned long addr)
{
/*
* If there's not enough room, expand the array.
*/
if (nAddrs >= maxAddrs) {
maxAddrs = (nAddrs + 100) * 2;
pAddrs = realloc(pAddrs, maxAddrs * sizeof(*pAddrs));
if (!pAddrs) {
fprintf(stderr, "Out of memory\n");
exit(1);
}
}
/*
* Add the address to the array.
*/
pAddrs[nAddrs++] = addr;
}
/*
** Return the number of elements in the address array.
*/
static unsigned address_num(void)
{
return(nAddrs);
}
/*
** Return the array of addresses.
*/
static unsigned long *address_paddrs(void)
{
return(pAddrs);
}
/*
** Deallocate the address array.
*/
static void address_free(void)
{
free(pAddrs);
pAddrs = 0;
maxAddrs = 0;
nAddrs = 0;
}
The following source listing (xlate.anal.c) contains
the analysis routine for the
xlate
tool:
#include <stdlib.h>
#include <stdio.h>
#include <cmplrs/atom.anal.h>
/*
* Each object in the application gets one of the following data
* structures. The XLATE contains the instrumented addresses for
* all possible jump targets in the object. The array contains
* the matching pure addresses.
*/
typedef struct {
XLATE * pXlt;
unsigned long * pAddrsPure;
} ObjXlt_t;
/*
* An array with one ObjXlt_t structure for each object in the
* application.
*/
static ObjXlt_t * pAllXlts;
static unsigned nObj;
static int translate_addr(unsigned long, unsigned long *);
static int translate_addr_obj(ObjXlt_t *, unsigned long,
unsigned long *);
/*
** Called at ProgramBefore. Registers the number of objects in
** this application.
*/
void RegisterNumObjs(
unsigned nobj)
{
/*
* Allocate an array with one element for each object. The
* elements are initialized as each object is loaded.
*/
nObj = nobj;
pAllXlts = calloc(nobj, sizeof(*pAllXlts));
if (!pAllXlts) {
fprintf(stderr, "Out of Memory\n");
exit(1);
}
}
/*
** Called at ObjBefore for each object. Registers an XLATE with
** instrumented addresses for all possible jump targets. Also
** passes an array of pure addresses for all possible jump targets.
*/
void RegisterXlate(
unsigned iobj,
XLATE * pxlt,
unsigned long * paddrs_pure)
{
/*
* Initialize this object's element in the pAllXlts array.
*/
pAllXlts[iobj].pXlt = pxlt;
pAllXlts[iobj].pAddrsPure = paddrs_pure;
}
/*
** Called at InstBefore for each jump instruction. Prints the pure
** target address of the jump.
*/
void JmpLog(
unsigned long pc,
REGV targ)
{
unsigned long addr;
printf("0x%lx jumps to - ", pc);
if (translate_addr(targ, &addr))
printf("0x%lx\n", addr);
else
printf("unknown\n");
}
/*
** Attempt to translate the given instrumented address to its pure
** equivalent. Set '*paddr_pure' to the pure address and return 1
** on success. Return 0 on failure.
**
** Will always succeed for jump target addresses.
*/
static int translate_addr(
unsigned long addr_inst,
unsigned long * paddr_pure)
{
unsigned long start;
unsigned long size;
unsigned i;
/*
* Find out which object contains this instrumented address.
*/
for (i = 0; i < nObj; i++) {
start = XlateInstTextStart(pAllXlts[i].pXlt);
size = XlateInstTextSize(pAllXlts[i].pXlt);
if (addr_inst >= start && addr_inst < start + size) {
/*
* Found the object, translate the address using that
* object's data.
*/
return(translate_addr_obj(&pAllXlts[i], addr_inst,
paddr_pure));
}
}
/*
* No object contains this address.
*/
return(0);
}
/*
** Attempt to translate the given instrumented address to its
** pure equivalent using the given object's translation data.
** Set '*paddr_pure' to the pure address and return 1 on success.
** Return 0 on failure.
*/
static int translate_addr_obj(
ObjXlt_t * pObjXlt,
unsigned long addr_inst,
unsigned long * paddr_pure)
{
unsigned num;
unsigned i;
/*
* See if the instrumented address matches any element in the XLATE.
*/
num = XlateNum(pObjXlt->pXlt);
for (i = 0; i < num; i++) {
if (XlateAddr(pObjXlt->pXlt, i) == addr_inst) {
/*
* Matches this XLATE element, return the matching pure
* address.
*/
*paddr_pure = pObjXlt->pAddrsPure[i];
return(1);
}
}
/*
* No match found, must not be a possible jump target.
*/
return(0);
}
This section describes the basic tool-building interface by using three
simple examples: procedure tracing, instruction profiling, and data cache
simulation.
9.2.8.1 Procedure Tracing
The
ptrace
tool prints the names of procedures in
the order in which they are executed.
The implementation adds a call to each
procedure in the application.
By convention, the instrumentation for the
ptrace
tool is placed in the file
ptrace.inst.c.
For example:
1 #include <stdio.h>
2 #include <cmplrs/atom.inst.h> [1]
3
4 unsigned InstrumentAll(int argc, char **argv) [2]
5 {
6 Obj *o; Proc *p;
7 AddCallProto("ProcTrace(char *)"); [3]
8 for (o = GetFirstObj(); o != NULL; o = GetNextObj(o)) { [4]
9 if (BuildObj(o)) return 1; [5]
10 for (p = GetFirstObjProc(o); p != NULL; p = GetNextProc(p)) { [6]
11 const char *name = ProcName(p); [7]
12 if (name == NULL) name = "UNKNOWN"; [8]
13 AddCallProc(p,ProcBefore,"ProcTrace",name); [9]
14 }
15 WriteObj(o); [10]
16 }
17 return(0);
18 }
[1] Includes the definitions for Atom instrumentation routines and data structures.
[2] Defines the InstrumentAll procedure. This instrumentation routine defines the interface to each analysis procedure and inserts calls to those procedures at the correct locations in the applications it instruments.
[3] Calls the AddCallProto routine to define the ProcTrace analysis procedure. ProcTrace takes a single argument of type char *.
[4] Calls the GetFirstObj and GetNextObj routines to cycle through each object in the application. If the program was linked nonshared, there is only a single object. If the program was linked call-shared, it contains multiple objects: one for the main executable and one for each dynamically linked shared library. The main program is always the first object.
[5] Builds the object. Objects must be built before they can be used. In very rare circumstances, the object cannot be built. The InstrumentAll routine reports this condition to Atom by returning a nonzero value.
[6] Calls the GetFirstObjProc and GetNextProc routines to step through each procedure in the object.
[7] For each procedure, calls the ProcName procedure to find the procedure name. Depending on the amount of symbol table information that is available in the application, some procedure names, such as those defined as static, may not be available. (Compiling applications with the -g1 option provides this level of symbol information.) In these cases, Atom returns NULL.
[8] Converts the NULL procedure name string to UNKNOWN.
[9] Calls the AddCallProc routine to add a call to the procedure pointed to by p. The ProcBefore argument indicates that the analysis procedure is to be added before all other instructions in the procedure. The name of the analysis procedure to be called at this instrumentation point is ProcTrace. The final argument is to be passed to the analysis procedure. In this case, it is the procedure name obtained on line 11.
[10] Writes the instrumented object file to disk.
The instrumentation file added calls to the
ProcTrace
analysis procedure.
This procedure is defined in the analysis file
ptrace.anal.c
as shown in the following example:
1 #include <stdio.h>
2
3 void ProcTrace(char *name)
4 {
5 fprintf(stderr, "%s\n",name);
6 }
The
ProcTrace
analysis procedure prints, to
stderr, the character string passed to it as an argument.
Note that
an analysis procedure cannot return a value.
After the instrumentation and analysis files are specified, the tool is complete. To demonstrate the application of this tool, compile and link the following application:
#include <stdio.h>
main()
{
printf("Hello world!\n");
}
The following example builds a nonshared executable, applies the
ptrace
tool, and runs the instrumented executable.
This simple program
calls almost 30 procedures.
% cc -non_shared hello.c -o hello
% atom hello ptrace.inst.c ptrace.anal.c -o hello.ptrace
% hello.ptrace
__start
main
printf
_doprnt
__getmbcurmax
strchr
strlen
memcpy
.
.
.
The following example repeats this process with the application linked
call-shared.
The major difference is that the
LD_LIBRARY_PATH
environment variable must be set to the current directory because Atom creates
an instrumented version of the
libc.so
shared library in
the local directory.
% cc hello.c -o hello
% atom hello ptrace.inst.c ptrace.anal.c -o hello.ptrace -all
% setenv LD_LIBRARY_PATH `pwd`
% hello.ptrace
__start
_call_add_gp_range
__exc_add_gp_range
malloc
cartesian_alloc
cartesian_growheap2
__getpagesize
__sbrk
.
.
.
The call-shared version of the application calls almost twice the number of procedures that the nonshared version calls.
Note that only calls in the original application program are instrumented.
Because the call to the
ProcTrace
analysis procedure did
not occur in the original application, it does not appear in a trace of the
instrumented application procedures.
Likewise, the standard library calls
that print the names of each procedure are also not included.
If the application
and the analysis program both call the
printf
function,
Atom would link into the instrumented application two copies of the function.
Only the copy in the application program would be instrumented.
Atom also
correctly instruments procedures that have multiple entry points.
9.2.8.2 Profile Tool
The
iprof
example tool counts the number of instructions
a program executes.
It is useful for finding critical sections of code.
Each
time the application is executed,
iprof
creates a file
called
iprof.out
that contains a profile of the number
of instructions that are executed in each procedure and the number of times
each procedure is called.
The most efficient place to compute instruction counts is inside each
basic block.
Each time a basic block is executed, a fixed number of instructions
are executed.
The following example shows how the
iprof
tool's instrumentation procedure (iprof.inst.c) performs
these tasks:
1 #include <stdio.h>
2 #include <cmplrs/atom.inst.h>
3 static int n = 0;
4
5 static const char * SafeProcName(Proc *);
6
7 void InstrumentInit(int argc, char **argv)
8 {
9 AddCallProto("OpenFile(int)"); [1]
10 AddCallProto("ProcedureCalls(int)");
11 AddCallProto("ProcedureCount(int,int)");
12 AddCallProto("ProcedurePrint(int,char*)");
13 AddCallProto("CloseFile()");
14 AddCallProgram(ProgramAfter,"CloseFile"); [2]
15 }
16
17 Instrument(int argc, char **argv, Obj *obj)
18 {
19 Proc *p; Block *b;
20
21 for (p = GetFirstObjProc(obj); p != NULL; p = GetNextProc(p)) { [3]
22 AddCallProc(p,ProcBefore,"ProcedureCalls",n);
23 for (b = GetFirstBlock(p); b != NULL; b = GetNextBlock(b)) { [4]
24 AddCallBlock(b,BlockBefore,"ProcedureCount", [5]
25 n,GetBlockInfo(b,BlockNumberInsts));
26 }
27 AddCallObj(obj, ObjAfter,"ProcedurePrint",n,SafeProcName(p)); [6]
28 n++; [7]
29 }
30 }
31
32 void InstrumentFini(void)
33 {
34 AddCallProgram(ProgramBefore,"OpenFile",n); [8]
35 }
36
37 static const char *SafeProcName(Proc *p)
38 {
39 const char * name;
40 static char buf[128];
41
42 name = ProcName(p); [9]
43 if (name)
44 return(name);
45 sprintf(buf, "proc_at_0x%lx", ProcPC(p));
46 return(buf);
47 }
[1] Defines the interface to the analysis procedures.
[2] Adds a call to the CloseFile analysis procedure to the end of the program.
[3] Loops through each procedure in the object.
[4] Loops through each basic block in the procedure.
[5] Adds a call to the ProcedureCount analysis procedure before any of the instructions in this basic block are executed. The argument types of the ProcedureCount analysis procedure are defined in the prototype on line 11. The first argument is a procedure index of type int; the second argument, also an int, is the number of instructions in the basic block. The ProcedureCount analysis procedure adds the number of instructions in the basic block to a per-procedure data structure. Similarly, the ProcedureCalls analysis procedure increments a procedure's call count before each call begins executing the called procedure.
[6] Adds a call to the ProcedurePrint analysis procedure to the end of the program. The ProcedurePrint analysis procedure prints a line summarizing this procedure's instruction use and call count.
[7] Increments the procedure index.
[8] Adds a call to the OpenFile analysis procedure to the beginning of the program, passing it an int representing the number of procedures in the application. The OpenFile procedure allocates the per-procedure data structure that tallies instructions and opens the output file.
[9] Determines the procedure name.
The analysis procedures used by the
iprof
tool are
defined in the
iprof.anal.c
file as shown in the following
example:
1 #include <stdio.h>
2 #include <string.h>
3 #include <stdlib.h>
4 #include <unistd.h>
5
6 long instrTotal = 0;
7 long *instrPerProc;
8 long *callsPerProc;
9
10 FILE *OpenUnique(char *fileName, char *type)
11 {
12 FILE *file;
13 char Name[200];
14
15 if (getenv("ATOMUNIQUE") != NULL)
16 sprintf(Name,"%s.%d",fileName,getpid());
17 else
18 strcpy(Name,fileName);
19
20 file = fopen(Name,type);
21 if (file == NULL)
22 {
23 fprintf(stderr,"Atom: can't open %s for %s\n",Name, type);
24 exit(1);
25 }
26 return(file);
27 }
28
29 static FILE *file;
30 void OpenFile(int number)
31 {
32 file = OpenUnique("iprof.out","w");
33 fprintf(file,"%30s %15s %15s %12s\n","Procedure","Calls",
34 "Instructions","Percentage");
35 instrPerProc = (long *) calloc(sizeof(long), number); [1]
36 callsPerProc = (long *) calloc(sizeof(long), number);
37 if (instrPerProc == NULL || callsPerProc == NULL) {
38 fprintf(stderr,"Malloc failed\n");
39 exit(1);
40 }
41 }
42
43 void ProcedureCalls(int number)
44 {
45 callsPerProc[number]++;
46 }
47
48 void ProcedureCount(int number, int instructions)
49 {
50 instrTotal += instructions;
51 instrPerProc[number] += instructions;
52 }
53
54
55 void ProcedurePrint(int number, char *name)
56 {
57 if (instrPerProc[number] > 0) { [2]
58 fprintf(file,"%30s %15ld %15ld %12.3f\n",
59 name, callsPerProc[number], instrPerProc[number],
60 100.0 * instrPerProc[number] / instrTotal);
61 }
62 }
63
64 void CloseFile() [3]
65 {
66 fprintf(file,"\n%30s %15s %15ld\n", "Total", "", instrTotal);
67 fclose(file);
68 }
[1] Allocates the counts data structure. The calloc function zero-fills the counts data.
[2] Filters procedures that are never called.
[3] Closes the output file. Tools must explicitly close files that are opened in the analysis procedures.
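The counting logic of these analysis procedures can be checked without Atom by calling them directly, in the order that an instrumented program would. The following sketch keeps ProcedureCalls and ProcedureCount as in the listing above and adds two helpers that are assumptions for illustration: AllocCounts, which stands in for the allocation done by OpenFile, and Percentage, which performs the calculation used by ProcedurePrint.

```c
#include <assert.h>
#include <stdlib.h>

static long instrTotal;
static long *instrPerProc;
static long *callsPerProc;

/* Illustrative stand-in for the allocation performed by OpenFile. */
int AllocCounts(int number)
{
    assert(number > 0);
    instrPerProc = calloc((size_t)number, sizeof(long));
    callsPerProc = calloc((size_t)number, sizeof(long));
    return (instrPerProc != NULL && callsPerProc != NULL) ? 0 : -1;
}

/* Called once at the start of each execution of procedure 'number'. */
void ProcedureCalls(int number)
{
    callsPerProc[number]++;
}

/* Called once per executed basic block with that block's size. */
void ProcedureCount(int number, int instructions)
{
    instrTotal += instructions;
    instrPerProc[number] += instructions;
}

/* Illustrative helper: the percentage computed by ProcedurePrint. */
double Percentage(int number)
{
    assert(instrTotal > 0);
    return 100.0 * instrPerProc[number] / instrTotal;
}
```

For example, if procedure 0 executes one 6-instruction block and procedure 1 is called twice and executes blocks of 3, 3, and 4 instructions, the total is 16 instructions and procedure 0 accounts for 37.5 percent of them.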
After the instrumentation and analysis files are specified, the tool
is complete.
To demonstrate the application of this tool, compile and link
the
"Hello"
application as follows:
#include <stdio.h>
main()
{
printf("Hello world!\n");
}
The following example builds a call-shared executable, applies the
iprof
tool, and runs the instrumented executable.
In contrast to
the
ptrace
tool described in
Section 9.2.8.1,
the
iprof
tool sends its output to a file instead of
stderr.
% cc hello.c -o hello
% atom hello iprof.inst.c iprof.anal.c -o hello.iprof -all
% setenv LD_LIBRARY_PATH `pwd`
% hello.iprof
Hello world!
% more iprof.out
Procedure    Calls    Instructions    Percentage
__start          1              92         1.487
main             1              15         0.242
.
.
.
printf           1              81         0.926
.
.
.
Total                         8750
% unsetenv LD_LIBRARY_PATH
9.2.8.3 Data Cache Simulation Tool
Instruction and data address tracing has been used for many years as a technique to capture and analyze cache behavior. Unfortunately, current machine speeds make this increasingly difficult. For example, the Alvinn SPEC92 benchmark executes 961,082,150 loads, 260,196,942 stores, and 73,687,356 basic blocks, for a total of 2,603,010,614 Alpha instructions. Storing the address of each basic block and the effective address of all the loads and stores would take in excess of 10 GB and slow down the application by a factor of over 100.
The
cache
tool uses on-the-fly simulation to determine
the cache miss rates of an application running in an 8-KB, direct-mapped cache.
The following example shows its instrumentation routine:
1 #include <stdio.h>
2 #include <cmplrs/atom.inst.h>
3
4 unsigned InstrumentAll(int argc, char **argv)
5 {
6 Obj *o; Proc *p; Block *b; Inst *i;
7
8 AddCallProto("Reference(VALUE)");
9 AddCallProto("Print()");
10 for (o = GetFirstObj(); o != NULL; o = GetNextObj(o)) {
11 if (BuildObj(o)) return (1);
12 for (p = GetFirstObjProc(o); p != NULL; p = GetNextProc(p)) {
13 for (b = GetFirstBlock(p); b != NULL; b = GetNextBlock(b)) {
14 for (i = GetFirstInst(b); i != NULL; i = GetNextInst(i)) { [1]
15 if (IsInstType(i,InstTypeLoad) || IsInstType(i,InstTypeStore)) {
16 AddCallInst(i,InstBefore,"Reference",EffAddrValue); [2]
17 }
18 }
19 }
20 }
21 WriteObj(o);
22 }
23 AddCallProgram(ProgramAfter,"Print");
24 return (0);
25 }
[1] Examines each instruction in the current basic block.
[2] If the instruction is a load or a store, adds a call to the Reference analysis procedure, passing the effective address of the data reference.
The analysis procedures used by the
cache
tool are
defined in the
cache.anal.c
file as shown in the following
example:
1 #include <stdio.h>
2 #include <assert.h>
3 #define CACHE_SIZE 8192
4 #define BLOCK_SHIFT 5
5 long tags[CACHE_SIZE >> BLOCK_SHIFT];
6 long references, misses;
7
8 void Reference(long address) {
9 int index = (address & (CACHE_SIZE-1)) >> BLOCK_SHIFT;
10 long tag = address >> BLOCK_SHIFT;
11 if (tags[index] != tag) {
12 misses++;
13 tags[index] = tag;
14 }
15 references++;
16 }
17 void Print() {
18 FILE *file = fopen("cache.out","w");
19 assert(file != NULL);
20 fprintf(file,"References: %ld\n", references);
21 fprintf(file,"Cache Misses: %ld\n", misses);
22 fprintf(file,"Cache Miss Rate: %f\n", (100.0 * misses) / references);
23 fclose(file);
24 }
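The behavior of the Reference procedure can be explored outside Atom by feeding it synthetic addresses. The following sketch repeats the simulation logic from the listing above and adds two helpers, ResetCache and ReferenceRange, which are assumptions introduced for illustration. The addresses used are arbitrary nonzero values. Note that because the tags array starts out zero-filled, a first reference to the block at address 0 would be counted as a hit.

```c
#include <assert.h>
#include <string.h>

#define CACHE_SIZE  8192  /* 8-KB direct-mapped cache, as in the listing */
#define BLOCK_SHIFT 5     /* 32-byte cache blocks */

static long tags[CACHE_SIZE >> BLOCK_SHIFT];
static long references, misses;

/* Same logic as the Reference analysis procedure above. */
void Reference(long address)
{
    int index = (int)((address & (CACHE_SIZE - 1)) >> BLOCK_SHIFT);
    long tag = address >> BLOCK_SHIFT;

    if (tags[index] != tag) {
        misses++;
        tags[index] = tag;
    }
    references++;
}

/* Illustrative helper: return the simulated cache to its initial state. */
void ResetCache(void)
{
    memset(tags, 0, sizeof(tags));
    references = 0;
    misses = 0;
}

/* Illustrative helper: reference 'count' addresses 'stride' bytes apart. */
void ReferenceRange(long base, long stride, int count)
{
    int i;

    assert(count >= 0);
    for (i = 0; i < count; i++)
        Reference(base + i * stride);
}
```

Eight sequential 8-byte references starting at 0x10000 touch two 32-byte blocks and therefore miss twice. Two addresses that are exactly CACHE_SIZE bytes apart map to the same cache line, so alternating between them misses on every reference.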
After the instrumentation and analysis files are specified, the tool
is complete.
To demonstrate the application of this tool, compile and link
the
"Hello"
application as follows:
#include <stdio.h>
main()
{
printf("Hello world!\n");
}
The following example applies the
cache
tool to instrument
both the nonshared and call-shared versions of the application:
% cc hello.c -o hello
% atom hello cache.inst.c cache.anal.c -o hello.cache -all
% setenv LD_LIBRARY_PATH `pwd`
% hello.cache
Hello world!
% more cache.out
References: 1091
Cache Misses: 225
Cache Miss Rate: 20.623281
% cc -non_shared hello.c -o hello
% atom hello cache.inst.c cache.anal.c -o hello.cache -all
% hello.cache
Hello world!
% more cache.out
References: 382
Cache Misses: 93
Cache Miss Rate: 24.345550