HP OpenVMS MACRO Compiler Porting and User's Guide

2.11.5 Alignment Considerations for Atomicity

When preserving atomicity, the compiler must assume that the data being modified is aligned. An update of a field spanning a quadword boundary cannot occur atomically, since this would require two read-modify-write sequences.

On OpenVMS Alpha systems, since software cannot handle an unaligned LDx_L or STx_C instruction as it can a normal load or store instruction, a LDx_L or STx_C instruction to an unaligned address will generate a fatal reserved operand fault.

On OpenVMS I64 systems, since software cannot handle an unaligned address in the compare-exchange (cmpxchg) instruction, it will generate an exception at run time.

On OpenVMS Alpha systems, when /PRESERVE=ATOMICITY (or .PRESERVE ATOMICITY) is specified, an INCL (R1) instruction generates LDL_L and STL_C instructions, so the address in R1 must be longword aligned.

Assume the following instruction:

INCW (R1)

For this instruction, the compiler generates a code sequence such as the following on OpenVMS Alpha systems:

        BIC    R1,#^B0110,R28   ; Compute Aligned Address
Retry:  LDQ_L  R24,(R28)        ; Load the QW with the data
        EXTWL  R24,R1,R23       ; Extract out the Word
        ADDL   R23,#1,R23       ; Increment the Word
        INSWL  R23,R1,R23       ; Correctly position the Word
        MSKWL  R24,R1,R24       ; Zero the spot for the Word
        BIS    R23,R24,R23      ; Combine Original and New word
        STQ_C  R23,(R28)        ; Conditionally store result
        BEQ    fail             ; Branch ahead on failure
        .
        .
        .
fail:   BR     Retry

Note that the first BIC instruction uses #^B0110, not #^B0111. This is to ensure that the word does not cross a quadword boundary, which would result in an incomplete memory update. If the address in R1 is not pointing to an aligned word, bit 0 will be set and the bit will not be cleared by the BIC instruction. The Load Quadword Locked instruction (LDQ_L) will then generate a fatal reserved operand fault.

An INCB instruction uses #^B0111 to generate the aligned address since all bytes are aligned.
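The address masking and the word update can be checked with a small model. The following Python sketch is not compiler output; the function names are hypothetical, and it models only the arithmetic of the BIC, EXTWL, ADDL, INSWL, MSKWL, and BIS steps, not the locking performed by LDQ_L and STQ_C.

```python
QUADWORD_MASK = 0xFFFFFFFFFFFFFFFF


def locked_address(addr, bic_mask):
    """Model BIC R1,#mask,R28: clear the mask bits of the address."""
    return addr & ~bic_mask


def is_quadword_aligned(addr):
    """LDQ_L and STQ_C need the low three address bits to be zero."""
    return (addr & 0b111) == 0


def increment_word_in_quadword(quadword, addr):
    """Model EXTWL/ADDL/INSWL/MSKWL/BIS for a word held in one quadword."""
    shift = (addr & 0b111) * 8                      # bit offset of the word
    word = (quadword >> shift) & 0xFFFF             # EXTWL: extract the word
    word = (word + 1) & 0xFFFF                      # ADDL, kept to 16 bits by INSWL
    positioned = (word << shift) & QUADWORD_MASK    # INSWL: move word into place
    cleared = quadword & ~(0xFFFF << shift) & QUADWORD_MASK  # MSKWL: zero the slot
    return positioned | cleared                     # BIS: combine old and new


# Word mask #^B0110: an even address is reduced to its quadword base,
# but an odd address keeps bit 0 and so stays unaligned.
print(hex(locked_address(0x1006, 0b0110)), is_quadword_aligned(locked_address(0x1006, 0b0110)))
print(hex(locked_address(0x1007, 0b0110)), is_quadword_aligned(locked_address(0x1007, 0b0110)))
# Byte mask #^B0111: every address is reduced to its quadword base.
print(hex(locked_address(0x1007, 0b0111)))
```

In this model, a word at an odd address leaves bit 0 set in the computed address, which corresponds to the case in which LDQ_L faults rather than performing a partial update.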

For the INCW (R1) instruction, the compiler generates a code sequence such as the following on OpenVMS I64 systems:

$L5:    ld2               r19 = [r9]
        mov.m             apccv = r19
        mov               r18 = r19
        sxt2              r19 = r19
        adds              r19 = 1, r19
        cmpxchg2.acq      r19, [r9] = r19
        cmp.eq            pr0, pr8 = r18, r19
  (pr8) br.cond.dpnt.few  $L5
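The I64 sequence is a compare-exchange retry loop: it reads the old value, computes the new one, and stores it only if memory still holds the old value, retrying otherwise. The following Python sketch models that control flow only. It is not atomic and is not what the compiler generates; the dictionary standing in for memory, the function names, and the interfere hook (a stand-in for a write by another processor) are all assumptions made for illustration.

```python
def compare_exchange(memory, addr, expected, new):
    """Store new at addr only if the current value equals expected.

    Returns the value that was found, whether or not the store happened.
    """
    found = memory[addr]
    if found == expected:
        memory[addr] = new
    return found


def increment_word(memory, addr, interfere=None):
    """Retry until the compare-exchange sees the value that was read."""
    attempts = 0
    while True:
        attempts += 1
        old = memory[addr]                      # read the current word
        if interfere is not None:               # hypothetical competing writer
            interfere(memory, addr, attempts)
        new = (old + 1) & 0xFFFF                # increment, kept to 16 bits
        found = compare_exchange(memory, addr, old, new)
        if found == old:                        # nobody wrote in between
            return new, attempts
        # Otherwise another writer got in first: loop and try again.


def write_once(memory, addr, attempt):
    """Simulate one competing write during the first attempt only."""
    if attempt == 1:
        memory[addr] = 9


memory = {0x100: 5}
print(increment_word(memory, 0x100, write_once))
```

With one competing write, the first attempt fails and the second succeeds, which is the behavior the branch back to $L5 provides.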

2.11.6 Interlocked Instructions and Atomicity

The compiler's methods of preserving atomicity have an interesting side effect in compiled VAX MACRO code.

On OpenVMS VAX systems, only the interlocked instructions will work correctly to synchronize access to shared data in multiprocessor systems. On OpenVMS Alpha multiprocessing systems, the code resulting from a compilation of modify instructions (with atomicity preserved) and interlocked instructions would both work correctly, because the LDx_L and STx_C instructions that the compiler generates for both sets of instructions operate correctly across multiple processors. Likewise, on OpenVMS I64 systems, the compare-exchange (cmpxchg) instruction provides interlocking across processors.

Because this compiler side effect is specific to OpenVMS Alpha and OpenVMS I64 systems and does not port back to OpenVMS VAX systems, you should avoid relying on it when porting VAX MACRO code to OpenVMS Alpha or OpenVMS I64 if you intend to run the code on both systems.

However, interlocked instructions must still be used if the memory modification is being used as an interlock for other instructions for which atomicity is not preserved. This is because the Alpha and Itanium architectures do not guarantee strict write ordering.

For example, consider the following VAX MACRO code sequence:

        .PRESERVE ATOMICITY
        INCL (R1)
        .NOPRESERVE ATOMICITY
        MOVL (R2),R3

This code sequence will generate the following Alpha code sequence:

Retry:  LDL_L  R28,(R1)
        ADDL   R28,#1,R28
        STL_C  R28,(R1)
        BEQ    R28, fail
        LDL    R3, (R2)
        .
        .
        .
fail:   BR     Retry

Because of the data prefetching of the Alpha and Itanium architectures, the data from (R2) may be read before the store to (R1) is processed. If the INCL (R1) instruction is being used as a lock to prevent the data at (R2) from being accessed before the lock is set, the read of (R2) may occur before the increment of (R1) and thus is not protected.

The VAX interlocked instructions generate Alpha MB (memory barrier) or Itanium mf (memory fence) instructions before and after the interlocked sequence. This prevents memory loads from being moved across the interlocked instruction.

On OpenVMS I64, the code sequence would be similar to the following:

$L7:    ld4               r16 = [r9]
        mov.m             apccv = r16
        mov               r15 = r16
        sxt4              r16 = r16
        adds              r16 = 1, r16
        cmpxchg4.acq      r16, [r9] = r16
        cmp.eq            pr0, pr10 = r15, r16
 (pr10) br.cond.dpnt.few  $L7
        ld4               r3 = [r28]
        sxt4              r3 = r3

Consider the following code sequence:

        ADAWI #1,(R1)
        MOVL (R2),R3

This code sequence will generate the following Alpha code sequence:

        MB
Retry:  LDL_L  R28,(R1)
        ADDL   R28,#1,R28
        STL_C  R28,(R1)
        BEQ    R28, Fail
        MB
        LDL    R3, (R2)
        .
        .
        .
Fail:   BR     Retry

On OpenVMS I64, a code sequence similar to the following would be generated:

        mf
$L8:    ld2               r23 = [r9]
        mov.m             apccv = r23
        adds              r24 = 1, r23
        cmpxchg2.acq      r14, [r9] = r24
        cmp.eq            pr0, pr11 = r23, r14
 (pr11) br.cond.dpnt.few  $L8
        mf
        ld4               r3 = [r28]
        sxt4              r3 = r3

The MB or mf instructions cause all memory operations before the MB or mf instruction to complete before any memory operations after the MB or mf instruction are allowed to begin.

2.12 Compiling and Linking

The compiler requires the following files:

SYS$LIBRARY:STARLET.MLB
This is a macro library that defines the compiler directives. When you compile your code, the compiler automatically checks STARLET.MLB for definitions of compiler directives.
SYS$LIBRARY:STARLET.OLB
This is an object library containing emulation routines and other routines used by the compiler. When you link your code, the linker links against STARLET.OLB to resolve undefined symbols.

For information about compiler qualifiers, see Appendix A.

2.12.1 Line Numbering in Listing File

The macro expansion line numbering scheme in the listing file is Xnn/mmm, where Xnn shows the nesting depth and mmm is the line number relative to the outermost macro.

Example 2-1 shows an OpenVMS I64 listing file. The source portion of an OpenVMS Alpha listing file is essentially the same.

Example 2-1 Example of Line Numbering in an OpenVMS I64 Listing File

          00000000    1 ;
          00000000    2 ; This is the Itanium (previously called "IA-64") version of
          00000000    3 ; ARCH_DEFS.MAR, which contains architectural definitions for
          00000000    4 ; compiling VMS sources for VAX, Alpha, and I64 systems.
          00000000    5 ;
          00000000    6 ; Note: VAX, VAXPAGE, and IA64 should be left undefined,
          00000000    7 ;       a lot of code checks for whether a symbol is
          00000000    8 ;       defined (e.g. .IF DF VAX) vs. whether the value
          00000000    9 ;       is of a expected value (e.g. .IF NE VAX).
          00000000   10 ;
          00000000   11 ;VAX = 0
          00000000   12 ;EVAX = 0
          00000000   13 ;ALPHA = 0
00000001  00000000   14 IA64 = 1
          00000000   15 ;
          00000000   16 ;VAXPAGE = 0
00000001  00000000   17 BIGPAGE = 1
          00000000   18 ;
00000020  00000000   19 ADDRESSBITS = 32
          00000000   20         .TITLE ug_ex_listing /line numbering in the listing file/
          00000000   21 ;
          00000000   22         .MACRO test1
          00000000   23         clrl r1
          00000000   24         clrl r2
          00000000   25         tstl 48(sp) ; generate uplevel stack error
          00000000   26         clrl r3
          00000000   27         .ENDM test1
          00000000   28         .MACRO test2
          00000000   29         clrl r4
          00000000   30         clrl r5
          00000000   31         test1
          00000000   32         clrl r6
          00000000   33         .ENDM test2
          00000000   34
          00000000   35 foo:    .jsb_entry
          00000000   56         .show expansions
          00000000   57         clrl r0
          00000011   58         test2
                                1.......
%IMAC-E-UPLEVSTK, (1) up-level stack reference in routine FOO
X01/001   00000002              clrl r4
X01/002   00000004              clrl r5
X01/003   00000006              test1
X02/004   00000006              clrl r1
X02/005   00000008              clrl r2
X02/006   0000000A              tstl 48(sp) ; generate uplevel stack error
X02/007   0000000D              clrl r3
X02/008   0000000F
X01/009   0000000F              clrl r6
X01/010   00000011
          00000011   59         rsb
          00000012   60         .noshow expansions
          00000012   61
          00000012   62         .END
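When scanning a listing with a script, the Xnn/mmm prefixes can be split into their two parts. The following Python sketch is one way to do that; the function name is hypothetical, and the pattern accepts any number of digits in each field as a precaution, since the text specifies only the Xnn/mmm shape.

```python
import re

# "X", nesting depth, "/", line number relative to the outermost macro.
_EXPANSION_PREFIX = re.compile(r"^X(\d+)/(\d+)$")


def parse_expansion_prefix(prefix):
    """Return (nesting_depth, relative_line), or None if not an Xnn/mmm prefix."""
    match = _EXPANSION_PREFIX.match(prefix)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# X02/006 is the tstl line: depth 2 (inside test1, which is inside test2),
# and the sixth line counted from the start of the outermost macro, test2.
print(parse_expansion_prefix("X02/006"))
```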

2.13 Debugging

The compiler provides full debugger support. The debug session for compiled VAX MACRO code is similar to that for assembled VAX MACRO code. However, there are some important differences that are described in this section. For a complete description of debugging, see the HP OpenVMS Debugger Manual.

2.13.1 Code Relocation

One major difference is that the code is compiled rather than assembled. On an OpenVMS VAX system, each VAX MACRO instruction is a single machine instruction. On an OpenVMS Alpha or OpenVMS I64 system, each VAX MACRO instruction may be compiled into many Alpha or Itanium machine instructions. A major side effect of this difference is the relocation and rescheduling of code if you do not specify /NOOPTIMIZE in your compile command.

By default, several optimizations are performed that cause the movement of generated code across source boundaries (see Section 1.2, Section 4.3, and Appendix A). For most code modules, debugging is simplified if you compile with /NOOPTIMIZE, which prevents this relocation from happening. After you have debugged your code, you can recompile without /NOOPTIMIZE to improve performance.

2.13.2 Symbolic Variables for Routine Arguments

Another major difference between debugging compiled code and debugging assembled code is a new concept to VAX MACRO, the definition of symbolic variables for examining routine arguments. On OpenVMS VAX systems, when you are debugging a routine and want to examine the arguments, you typically do something like the following:

DBG> EXAMINE @AP          ; to see the argument count
DBG> EXAMINE @AP+4        ; to examine the first arg

or

DBG> EXAMINE @AP          ; to see arg count
DBG> EXAMINE .+4:.+20     ; to see first 5 args

On OpenVMS Alpha and OpenVMS I64 systems, the arguments do not reside in a vector in memory as they do on OpenVMS VAX systems. Furthermore, there is no AP register on OpenVMS Alpha and OpenVMS I64 systems. If you type EXAMINE @AP when debugging VAX MACRO compiled code, the debugger reports that AP is an undefined symbol.

In the compiled code, the arguments can reside in some combination of:

Registers
On the stack above the routine's stack frame
In the stack frame, if the argument list was homed (see Section 2.4) or if there are calls out of the routine that would require the register arguments to be saved

The compiler does not require that you figure out where the arguments are by reading the generated code. Instead, it provides $ARGn symbols that point to the correct argument locations. The $ARG0 symbol is the same as @AP+0 is on VAX systems, that is, the argument count. The $ARG1 symbol is the first argument, $ARG2 is the second argument, and so forth. These symbols are defined in CALL_ENTRY and JSB_ENTRY directives, but not in EXCEPTION_ENTRY directives.

2.13.3 Locating Arguments Without $ARGn Symbols

There may be additional arguments in your code for which the compiler did not generate a $ARGn symbol. The number of $ARGn symbols defined for a .CALL_ENTRY routine is the maximum number detected by the compiler (either by automatic detection or as specified by MAX_ARGS). For a .JSB_ENTRY routine, since the arguments are homed in the caller's stack frame and the compiler cannot detect the actual number, it always creates eight $ARGn symbols.

In most cases, you can easily find any additional arguments, but in some cases you cannot.

2.13.3.1 Additional Arguments That Are Easy to Locate

You can easily find additional arguments if:

The argument list is not homed, and $ARGn symbols are defined to $ARG7 or higher on OpenVMS Alpha and $ARG9 or higher on OpenVMS I64. If the argument list is not homed, the $ARGn symbols $ARG7 and above on OpenVMS Alpha and $ARG9 and above on OpenVMS I64 always point into the list of parameters passed as quadwords on the stack. Subsequent arguments will be in quadwords following the last defined $ARGn symbol.
The argument list is homed, and you want to examine an argument that is less than or equal to the maximum number detected by the compiler (either by automatic detection or as specified by MAX_ARGS). If the argument list is homed, $ARGn symbols always point into the homed argument list. Subsequent arguments will be in longwords following the last defined $ARGn symbol.

For example, you can examine arguments beyond the eighth argument in a JSB routine (where the argument list must be homed in the caller), as follows:

DBG> EX $ARG8   ; highest defined $ARGn
.
.
.
DBG> EX .+4     ; next arg is in next longword
.
.
.
DBG> EX .+4     ; and so on

This example assumes that the caller detected at least 10 arguments when homing the argument list.

To find arguments beyond the last $ARGn symbol in a routine that did not home the arguments, proceed exactly as in the previous example except substitute EX .+8 for EX .+4.
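The difference between the two cases is only the step size: 4 bytes for a homed argument list and 8 bytes otherwise. The following Python sketch works through that arithmetic. The function name and the sample address are hypothetical; a real address would come from examining the last defined $ARGn symbol in the debugger.

```python
LONGWORD_BYTES = 4   # homed argument list: arguments stored as longwords
QUADWORD_BYTES = 8   # not homed: arguments passed as quadwords on the stack


def later_argument_address(last_symbol_address, steps, homed):
    """Address of the argument `steps` positions past the last $ARGn symbol."""
    stride = LONGWORD_BYTES if homed else QUADWORD_BYTES
    return last_symbol_address + steps * stride


# Two arguments past the last defined symbol, assumed here to be at 0x7AE0.
print(hex(later_argument_address(0x7AE0, 2, homed=True)))    # two EX .+4 steps
print(hex(later_argument_address(0x7AE0, 2, homed=False)))   # two EX .+8 steps
```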

2.13.3.2 Additional Arguments That Are Not Easy to Locate

You cannot easily find additional arguments if:

The argument list is not homed, and $ARGn symbols are defined only as high as $ARG6 on OpenVMS Alpha and $ARG8 on OpenVMS I64. In this case, the existing $ARGn symbols will either point to registers or to quadword locations in the stack frame. In both cases, subsequent arguments cannot be examined by looking at quadword locations beyond the defined $ARGn symbols.
The argument list is homed, and you want to examine arguments beyond the number detected by the compiler. The $ARGn symbols point to the longwords that are stored in the homed argument list. The compiler only moves as many arguments as it can detect into this list. Examining longwords beyond the last argument that was homed will result in examining various other stack context.

The only way to find the additional arguments in these cases is to examine the compiled machine code to determine where the arguments reside. Both of these problems are eliminated if MAX_ARGS is specified correctly for the maximum argument that you want to examine.

2.13.4 Using VAX and Alpha Register Names on OpenVMS I64

For convenience, the MACRO compiler on OpenVMS I64 defines symbols named R0, R1, ... R31 to refer to the Itanium registers where those Alpha register values reside. You can still use the debugger's names %R0, %R1, ... %R31 to refer to registers by the native machine's numbering.

2.13.5 Debugging Code with Packed Decimal Data

Keep this information in mind when debugging compiled VAX MACRO code with packed decimal data on an OpenVMS Alpha or OpenVMS I64 system:

When using the EXAMINE command to examine a location that was declared with a .PACKED directive, the debugger automatically displays the value as a packed decimal data type.
You can deposit packed decimal data. The syntax is the same as it is on VAX.

2.13.6 Debugging Code with Floating-Point Data

Keep this information in mind when debugging compiled VAX MACRO code with floating-point data on an OpenVMS Alpha or OpenVMS I64 system:

You can use the EXAMINE/FLOAT command to examine an Alpha or Itanium integer register for a floating-point value.
Even though there is a set of registers for floating-point operations on OpenVMS Alpha and OpenVMS I64 systems, those registers are not used by compiled VAX MACRO code that contains floating-point operations. Only the integer registers are used.
Floating-point operations in compiled VAX MACRO code are performed by emulation routines that operate outside the compiler. Therefore, performing VAX MACRO floating-point operations on, say, R7, has no effect on floating-point Register 7.
When using the EXAMINE command to examine a location that was declared with a .FLOAT directive or other floating-point storage directives, the debugger automatically displays the value as floating-point data.

When using the EXAMINE command to examine the G_FLOAT data type, the debugger does not use the contents of two registers to build the value for VAX data.
Consider the following example:

EXAMINE/G_FLOAT R4

In this example, the lower longwords of R4 and R5 are not used to build the value as is the case on VAX. Instead, the quadword contents of R4 are used.
The code the compiler generates for D_FLOAT and G_FLOAT operations preserves the VAX format of the data in the low longwords of two consecutive registers. Therefore, using EXAMINE/G_FLOAT on either of these two registers will not give the true floating-point value, and issuing DEPOSIT/G_FLOAT to one of these registers will not give the desired results. You can manually combine the two halves of such a value, however. For example, assume you executed the following instruction:

MOVG DATA, R6

You could then read the G_FLOAT value which now resides in R6 and R7 with a sequence like the following:

DBG> EX R6
.MAIN.\%LINE 100\%R6:  0FFFFFFFF D8E640D1
DBG> EX R7
.MAIN.\%LINE 100\%R7:  00000000 2F1B24DD
DBG> DEP R0 = 2F1B24DDD8E640D1
DBG> EX/G_FLOAT R0
.MAIN.\%LINE 100\%R0:  4568.89900000000

You can deposit floating-point data in an Alpha or Itanium integer register with the DEPOSIT command. The syntax is the same as it is on a VAX system.
H_FLOAT is unsupported.
On OpenVMS I64 systems, incoming parameters are in R32 through R39, not in R16 through R21. Outgoing parameters are in higher numbered registers chosen by the compiler.
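The manual combination of R6 and R7 shown earlier in this section can be checked outside the debugger. The following Python sketch joins the low longwords of the two registers and then decodes the result as a VAX G_floating value. The function names are hypothetical, and the decoder is a simplified model that assumes the standard G_floating layout (sign and 11-bit excess-1024 exponent in the first word, fraction words following in decreasing significance); it does not model reserved operands.

```python
def combine_g_float_halves(low_register, high_register):
    """Join the low longwords of two consecutive registers into one quadword."""
    low = low_register & 0xFFFFFFFF      # upper half of each register is ignored
    high = high_register & 0xFFFFFFFF
    return (high << 32) | low


def decode_g_float(quadword):
    """Decode a VAX G_floating value held as a 64-bit integer."""
    words = [(quadword >> (16 * i)) & 0xFFFF for i in range(4)]
    sign = -1.0 if words[0] & 0x8000 else 1.0
    exponent = (words[0] >> 4) & 0x7FF
    if exponent == 0:
        return 0.0                       # zero; reserved operands not modelled
    fraction = ((words[0] & 0xF) << 48) | (words[1] << 32) | (words[2] << 16) | words[3]
    # The value is 0.1f (binary) times 2**(exponent - 1024),
    # which equals (1 + f) times 2**(exponent - 1025).
    return sign * (1 + fraction / 2**52) * 2.0 ** (exponent - 1025)


r6 = 0xFFFFFFFFD8E640D1   # contents of R6 in the debugger session
r7 = 0x000000002F1B24DD   # contents of R7 in the debugger session
combined = combine_g_float_halves(r6, r7)
print(hex(combined))
print(decode_g_float(combined))
```

The combined quadword matches the value deposited into R0 in the debugger session, and the decoded result agrees with the value that EXAMINE/G_FLOAT displays there.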
