From:	SMTP%"jnelson@gauche.zko.dec.com" 23-SEP-1993 16:35:09.82
To:	EVERHART
CC:	
Subj:	Re: Delineating C Functions in VMS Object Files

From: jnelson@gauche.zko.dec.com (Jeff E. Nelson)
X-Newsgroups: comp.os.vms
Subject: Re: Delineating C Functions in VMS Object Files
Date: 22 Sep 1993 13:53:36 GMT
Organization: Digital Equipment Corporation
Lines: 115
Distribution: world
Message-ID: <27pld0$eve@usenet.pa.dec.com>
Reply-To: jnelson@gauche.zko.dec.com (Jeff E. Nelson)
NNTP-Posting-Host: gauche.zko.dec.com
Summary: it's not easy, nor is it complete
X-Newsreader: mxrn 6.18-5
To: Info-VAX@KL.SRI.COM
X-Gateway-Source-Info: USENET

In article <278jhm$4ha@skates.gsfc.nasa.gov>, alex@tpocc.gsfc.nasa.gov (Alex Measday) writes:
|>I've got a program, OFLOW, that generates CFLOW-style, textual structure
|>charts by extracting symbol information from VMS object files (C object
|>files, in this case).  OFLOW reads an object file and
|>
|>    (1)  Extracts the function name (i.e., the "caller") from the
|>         module header record (MHD$C_MHD).
|>
|>    (2)  Extracts global symbol references (TIR$C_STA_GBL commands)
|>         from the text information and relocation records (OBJ$C_TIR).
|>         These global symbols are the "callees"; i.e., the functions
|>         called by the function defined in (1).
|>
|>OFLOW was derived from a similar program I wrote years ago that was
|>mostly used on FORTRAN object files.  I vaguely recollect that the scheme
|>described above worked well with FORTRAN object files: each subroutine
|>defined in the source file appeared as a separate module in the object
|>file.
|>
|>OFLOW works well if the C source files have one function defined per
|>file, but not very well if there are multiple functions defined in a
|>source file or if the external file name doesn't match the internal
|>function name.
|>The external file name appears to be used as the module
|>name in the module header record and a C object file only contains one
|>module, regardless of how many functions are defined in the source file.
|>
|>QUESTION:  Is there any way I can tell where the code for a function
|>starts and ends in a C object file?  Assume no optimization; I realize
|>(I discovered!) the optimizer can eliminate whole routines via inlining.
|>
|>I have pored over ANALYZE/OBJECT reports, but nothing obvious has
|>jumped out at me.  Is there something in the debugger information
|>I could use and, if so, how do I decode the debugger information?

DEBUG records are a possibility, but this approach is not easy. In an
.OBJ file, the debug information is recorded in OBJ$C_TBT and OBJ$C_DBG
records, and the information in each of these is further encoded into
individual TIR$C_xxx and "Store Immediate" commands. You'd have to
understand the LINKER object language first, and then understand the
format of the DEBUG symbol table records.

For example, here is part of the output of an ANALYZE/OBJ that contains
the DEBUG record defining a routine named "thread_action":

12. TRACEBACK INFORMATION (OBJ$C_TBT), 274 bytes

        1) Store Immediate, 10 bytes:
                 7  6  5  4  3  2  1  0           01234567
                ------------------------          --------
                14 00 00 01 94 00 BF 06| 0000    |.¿......|
                                  00 BE| 0008    |š.      |
        2) TIR$C_STA_PL (6, %X'06')
                stack depth: 1
                psect: 0
                value: 404 (%X'00000194')
        3) TIR$C_STO_PIDR (27, %X'1B')
                stack depth: 0
        4) Store Immediate, 17 bytes:
                 7  6  5  4  3  2  1  0           01234567
                ------------------------          --------
                5F 64 61 65 72 68 74 0D| 0000    |.thread_|
                B0 16 6E 6F 69 74 63 61| 0008    |action.°|
                                     00| 0010    |.       |

        ...more OBJ$C_TBT stuff...

A second approach is to look for the actual OBJ records that define
routines. For example, here is the information that defines the C main()
routine:

45. GLOBAL SYMBOL DIRECTORY (OBJ$C_GSD), 370 bytes

        ...intervening GSD records omitted...
        13) Entry Point Symbol and Mask Definition (GSD$C_EPM)
                data type: DSC$K_DTYPE_Z (0)
                symbol flags:
                        (0)  GSY$V_WEAK      0
                        (1)  GSY$V_DEF       1
                        (2)  GSY$V_UNI       0
                        (3)  GSY$V_REL       1
                        (4)  GSY$V_COMM      0
                psect: 0
                value: 0 (%X'00000000')
                entry mask:
                symbol: "MAIN"

        ...more GSD records...

You can tell that this is a routine because of the symbol type
(GSD$C_EPM). You can tell that this is a symbol definition as opposed to
a reference because the GSY$V_DEF bit is set.

There is one unfortunate drawback with the second approach: you won't
find any information about non-global (i.e., local) routines. If you
think about it, this makes sense, since the linker is used to resolve
external references; all of the internal references are already resolved
by the compiler. Depending upon your needs, this second approach may in
fact be what you're looking for.

For more information, you should take a good look at the Linker
Reference Manual. One entire chapter is devoted to the Linker's Object
Language. You need to be very familiar with it should you choose either
approach.

Oh yes, one more thing: the linker's object language is different on the
OpenVMS AXP platform. Not radically different, but different enough. The
DEBUG symbol table records are different, too. The above examples are
taken from the OpenVMS VAX platform.

Hope this helps.

-Jeff E. Nelson
-Digital Equipment Corporation
-Internet: jnelson@gauche.zko.dec.com
-Affiliation given for identification purposes only
-Not an official statement of Digital Equipment Corporation
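
Addendum: a rough sketch of the two checks described in the reply above,
written in Python for brevity. It is not a parser for .OBJ files. It only
illustrates (a) the test "entry is a GSD$C_EPM and its GSY$V_DEF bit is
set", using the flag bit positions shown in the ANALYZE/OBJECT listing,
and (b) reading the routine name out of the "Store Immediate" bytes, on
the assumption that the leading 0D byte is a length prefix for the name.
The function names and the idea of passing the entry type as a string are
made up for this illustration; a real tool would take the numeric codes
and record layouts from the Linker manual.

```python
# Flag bit positions exactly as listed in the ANALYZE/OBJECT output.
FLAG_NAMES = ["GSY$V_WEAK", "GSY$V_DEF", "GSY$V_UNI", "GSY$V_REL", "GSY$V_COMM"]
GSY_V_DEF = 1  # bit 1: set for a definition, clear for a reference


def decode_symbol_flags(flags: int) -> dict:
    """Return each named flag bit (0 or 1) from a symbol flags value."""
    return {name: (flags >> bit) & 1 for bit, name in enumerate(FLAG_NAMES)}


def is_routine_definition(entry_type: str, flags: int) -> bool:
    """Hypothetical helper: True for an entry-point entry that is a definition.

    entry_type is the name ANALYZE/OBJECT prints (e.g. "GSD$C_EPM"); the
    numeric type codes are not given in the post, so names are compared.
    """
    return entry_type == "GSD$C_EPM" and bool((flags >> GSY_V_DEF) & 1)


def decode_counted_string(data: bytes) -> str:
    """Assumption: first byte is the length, followed by that many ASCII bytes."""
    length = data[0]
    return data[1:1 + length].decode("ascii")


if __name__ == "__main__":
    # Flags for MAIN in the listing: GSY$V_DEF = 1 and GSY$V_REL = 1.
    main_flags = (1 << 1) | (1 << 3)
    print(decode_symbol_flags(main_flags))
    print(is_routine_definition("GSD$C_EPM", main_flags))

    # Bytes of "Store Immediate" item 4, in address order. The dump prints
    # each row right to left, so offset 0000 holds 0D.
    item4 = bytes.fromhex("0D7468726561645F616374696F6E16B000")
    print(decode_counted_string(item4))  # thread_action
```

Note that the hex columns in the dump are read right to left (byte 0 is
the rightmost column), which is why the name appears reversed in the hex
but forward in the ASCII column.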