HP OpenVMS MACRO Compiler Porting and User's Guide


Note

If the code is being locked because the IPL will be raised above 2, where page faults cannot occur, make sure that the delimited code does not call run-time library routines or other procedures. The VAX MACRO compiler generates calls to routines to emulate certain VAX instructions. An image that uses these macros must link against the system base image so that references to these routines are resolved by code in a nonpageable executive image.

Image Initialization-Time Lockdown

For image initialization-time lockdown, three macros are used:

The macros $LOCKED_PAGE_START and $LOCKED_PAGE_END mark the beginning and end of a code segment that may be locked. The code delineated by these macros must contain complete routines: execution cannot fall through either macro, and the locked code cannot be branched into or out of. Any attempt to branch into or out of the locked code section, or to fall through the macros, is flagged by the compiler with the following error message:


%AMAC-E-MULTLKSEC, Routines which share code must use the same linkage psect. 

$LOCKED_PAGE_END has an optional parameter, LINK_SECT, which is used to specify the linkage psect to return to after the routine is executed. It is only used if the linkage psect in effect when the $LOCKED_PAGE_START macro was executed was not the default linkage psect, $LINKAGE.

The macro $LOCK_PAGE_INIT must be executed in the initialization routines of an image which is using $LOCKED_PAGE_START and $LOCKED_PAGE_END to delineate areas to be locked. It creates the necessary psects and issues the $LKWSET calls to lock the code and linkage sections into the working set. R0 and R1 are destroyed by this macro.

$LOCK_PAGE_INIT has an optional parameter, ERROR, which is the address to branch to if one of the $LKWSET calls fails. If this address is reached, R0 reflects the status of the failed call, and R1 contains 0 if the call to lock the code failed, or 1 if that call succeeded but the call to lock the linkage section failed.
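
The error convention just described can be modeled in a few lines. The following is a minimal Python sketch, not the macro's actual expansion: lock_code and lock_linkage are hypothetical callables standing in for the two $LKWSET calls, and the failure value used in the usage note is arbitrary. The sketch relies only on the OpenVMS convention that a status value with its low bit clear indicates failure.

```python
def lock_page_init_model(lock_code, lock_linkage):
    """Model of the ERROR convention: return None on success,
    or the pair (R0, R1) that the error address would observe."""
    status = lock_code()            # hypothetical stand-in for the first $LKWSET call
    if not status & 1:              # low bit clear means failure
        return (status, 0)          # R1 = 0: locking the code failed
    status = lock_linkage()         # hypothetical stand-in for the second $LKWSET call
    if not status & 1:
        return (status, 1)          # R1 = 1: code locked, linkage section failed
    return None                     # both calls succeeded; no branch to ERROR
```

For example, calling lock_page_init_model(lambda: 1, lambda: 0x124) models a successful code lock followed by a failed linkage-section lock, so the result pairs the failing status with R1 = 1.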

Note that since psects are used to identify code to be locked, the $LOCK_PAGE_INIT macro need not be in the same module as the code delineated by the $LOCKED_PAGE_START and $LOCKED_PAGE_END macros. The invocation of $LOCK_PAGE_INIT locks all delineated code in the entire image.

Table 3-1 shows the code changes necessary for using these macros. The delineating labels are replaced by the $LOCKED_PAGE_START and $LOCKED_PAGE_END macros. The descriptor is eliminated, and the $LKWSET call in the initialization code is replaced by $LOCK_PAGE_INIT.

Table 3-1 Image Initialization-Time Lockdown

Data declaration
  On OpenVMS VAX Systems:
        LOCK_DESCRIPTOR:
                .ADDRESS LOCK_START
                .ADDRESS LOCK_END
  On OpenVMS Alpha Systems:
        Nothing. Eliminate the descriptor altogether.

Initialization
  On OpenVMS VAX Systems:
                $LKWSET_S LOCK_DESCRIPTOR
                BLBC R0,ERROR
  On OpenVMS Alpha Systems:
                $LOCK_PAGE_INIT ERROR

Main code
  On OpenVMS VAX Systems:
        LOCK_START:
        Routine_A:
                .
                .
                .
                RSB
        LOCK_END:
  On OpenVMS Alpha Systems:
                $LOCKED_PAGE_START
        Routine_A:
                .
                .
                .
                RSB
                $LOCKED_PAGE_END

Locking Code Written in Other Languages

Code written in other programming languages can also be locked down by using the $LOCK_PAGE_INIT macro in a VAX MACRO module. Any code in any module written in any language will be locked by this macro if the psect $LOCK_PAGE_2 is used for the generated code and the psect $LOCK_LINKAGE_2 is used for the generated linkage section.

On-the-Fly Lockdown

For on-the-fly lockdown, $LOCK_PAGE and $UNLOCK_PAGE, respectively, mark the beginning and end of a section of code to be locked. The marked code becomes a separate routine in the locked psect, where all code locked anywhere in the image is placed.

$LOCK_PAGE locks the pages and linkage section of the locked routine into the working set and JSRs to it. This macro is placed inline in executable code. All code between this macro and the matching $UNLOCK_PAGE macro is included in the locked routine and is locked down.

$UNLOCK_PAGE returns from the locked routine and then unlocks the pages and linkage section from the working set. The macro is placed inline in executable code at some point after a $LOCK_PAGE macro.

$LOCK_PAGE and $UNLOCK_PAGE both have an optional parameter, ERROR, which is an error address to which to branch if the $LKWSET or $ULWSET calls fail. $UNLOCK_PAGE has a second optional parameter, LINK_SECT. LINK_SECT is a linkage psect to which to return if the linkage psect in effect when the $LOCK_PAGE macro was executed was not the default linkage psect, $LINKAGE.

All registers are preserved by both macros unless the error address parameter is present and one of the calls fails, in which case R0 reflects the status of the failed call. R1 then contains 0 if the call to lock or unlock the code failed, and 1 if that call succeeded but the call to lock or unlock the linkage section failed.

Control must enter the code through the $LOCK_PAGE macro, and must leave through the $UNLOCK_PAGE macro. The local symbol block that is in effect when the $LOCK_PAGE macro is executed is restored when the $UNLOCK_PAGE macro is executed, but since the locked code becomes a separate routine, the locked code itself is a separate local symbol block. Even if named symbols are used, branches into or out of the locked code section are not allowed, and will be flagged by the compiler with the following error:


%AMAC-E-MULTLKSEC, Routines which share code must use the same linkage psect. 

Note that since the locked code is made into a separate routine, any references to local stack storage within the routine will have to be changed, as the stack context is no longer the same.

Note

Because on-the-fly lockdown incurs the overhead of four system service calls plus an extra subroutine call every time it is executed, change it to initialization-time lockdown if the lockdown protects any performance-critical code. If other routines in the image use initialization-time lockdown, you must change the on-the-fly lockdown to initialization-time lockdown.

Table 3-2 shows the code changes required to use these macros for on-the-fly lockdown. Note that the $UNLOCK_PAGE macro precedes the RSB, so that it is executed. Any status being passed by the routine in R0 and R1 remains intact because $UNLOCK_PAGE preserves these registers.

Table 3-2 On-the-Fly Lockdown

Main code
  On VAX Systems:
        Routine_A:
                .
                .
                .
                SETIPL 100$
                .
                .
                .
                RSB
        100$:   .LONG IPL$SYNCH
  On Alpha Systems:
        Routine_A:
                .JSB_ENTRY
                .
                .
                .
                $LOCK_PAGE
                .
                .
                .
                $UNLOCK_PAGE
                RSB

Table 3-3 shows the same original code and the changes necessary for initialization-time lockdown.

Table 3-3 Image Initialization-Time Lockdown with the Same Code

Initialization
  On VAX Systems:
        Nothing.
  On Alpha Systems:
                $LOCK_PAGE_INIT

Main code
  On VAX Systems:
        Routine_A:
                .
                .
                .
                SETIPL 100$
                .
                .
                .
                RSB
        100$:   .LONG IPL$SYNCH
  On Alpha Systems:
                $LOCKED_PAGE_START
        Routine_A:
                .JSB_ENTRY
                .
                .
                .
                RSB
                $LOCKED_PAGE_END

3.11 Synchronization

The following information about synchronization is relevant when porting code from OpenVMS VAX to OpenVMS Alpha or OpenVMS I64 systems:


Chapter 4
Improving the Performance of Ported Code

This chapter describes how you can improve the performance of your ported code.

This chapter contains the following topics:

4.1 Aligning Data

An unaligned data reference will work but will be slow on OpenVMS Alpha or OpenVMS I64, because the system must take an unaligned address fault to complete the unaligned reference. If it is known that a data reference is unaligned, the compiler can generate unaligned quadword loads and masks to manually extract the data. This is slower than an aligned load but much faster than taking an alignment fault. Global data labels that are not longword or quadword aligned are flagged with information-level messages.
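
The idea of extracting unaligned data from aligned quadword loads can be illustrated with a small model. The following Python sketch is an illustration only and is not the code the compiler generates; the function names and the byte-buffer representation of memory are assumptions made for this example. It reads a little-endian longword at any byte address using only aligned 8-byte loads, a shift, and a mask.

```python
import struct

def load_aligned_quad(mem: bytes, addr: int) -> int:
    """Aligned 8-byte little-endian load; the low three address bits are ignored."""
    base = addr & ~7
    return struct.unpack_from("<Q", mem, base)[0]

def load_unaligned_long(mem: bytes, addr: int) -> int:
    """Read a 32-bit value at any byte address using only aligned quadword loads."""
    low = load_aligned_quad(mem, addr)        # quadword holding the first byte
    high = load_aligned_quad(mem, addr + 3)   # quadword holding the last byte
    shift = (addr & 7) * 8                    # bit position of the first byte in 'low'
    combined = low | (high << 64)             # if both loads hit the same quadword,
                                              # the upper copy is never reached
    return (combined >> shift) & 0xFFFFFFFF   # mask down to one longword
```

With mem = bytes(range(16)), load_unaligned_long(mem, 6) spans two quadwords and returns the same value as a direct 4-byte read at offset 6. The cost is two loads plus shift and mask work instead of one load, which is consistent with the text: slower than an aligned load, but with no fault taken.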

In addition, unaligned memory modification references cannot be made atomic with /PRESERVE=ATOMICITY or .PRESERVE ATOMICITY. If this is attempted, it will cause a fatal reserved operand fault.

4.1.1 Alignment Assumptions

By default, the compiler assumes the following:

Every time a register is changed, the compiler determines whether the base address in the register is still aligned. If the register and specified offset result in an aligned address, the compiler uses an aligned load or store for a memory reference. The compiler attempts to track register usage in terms of whether the base address remains aligned. When a stored memory address is loaded (for instance, MOVL 4(R1),R0) or used indirectly (for instance, MOVL @4(R1),R0), the compiler assumes the resulting address is aligned.

For quadword memory references such as MOVQ instructions, the compiler assumes the base address is quadword aligned, unless it has determined by means of its register tracking code that the address may not be longword aligned. In other words, quadword register alignment is not tracked; only longword alignment is.
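
The tracking rule in the two preceding paragraphs can be summarized as a simplified decision function. This Python sketch is a model under stated assumptions, not the compiler's algorithm: the function name and its two inputs are hypothetical, and the real tracking logic considers more than is shown here.

```python
def uses_aligned_access(base_longword_aligned: bool, offset: int) -> bool:
    """Simplified model: choose an aligned load/store only when the tracked
    base register is longword aligned and the offset keeps it that way."""
    if not base_longword_aligned:
        return False          # tracking says the address may be unaligned
    return offset % 4 == 0    # only longword alignment is tracked
```

In this model the same test applies to longword and quadword references, because no separate quadword alignment state exists. A quadword reference whose address passes the longword test is therefore treated as aligned.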

Quadword references in OpenVMS Alpha or OpenVMS I64 built-ins, such as those in the following example, will be in new code, where alignment should be correct. Therefore all memory references in the following example will use aligned quadword load/stores:


EVAX_LDQ  R1, (R2) 
EVAX_ADDQ  (R1), #1, (R3) 

If an OpenVMS Alpha or OpenVMS I64 built-in (other than EVAX_LDQU or EVAX_STQU) is used on an address that is not quadword aligned, an alignment fault will occur at run time.

4.1.2 Directives and Qualifier for Changing Alignment Assumptions

The compiler provides two directives and one qualifier for changing the compiler's alignment assumptions. Both directives enable the compiler to produce more efficient code. The .SET_REGISTERS directive allows you to specify whether a register is aligned or unaligned. This directive should be used when the result of an operation is the reverse of what the compiler expects. It also allows you to declare registers that the compiler would not otherwise detect as input or output registers.

The .SYMBOL_ALIGNMENT directive allows you to specify the alignment of any memory reference that uses a symbolic offset. This directive should be used when you know the data will be aligned for every use of the symbolic offset.

These directives are described in Appendix B. The examples in each description show how to use them.

The /UNALIGN qualifier to the MACRO/MIGRATION command tells the compiler to treat all register-based memory references as unaligned rather than trying to track their alignment. It does not affect stack-based or static references, where the compiler knows the alignment.

This qualifier is described in Appendix A.

4.1.3 Precedence of Alignment Controls

The order of precedence of the compiler's alignment controls, from strongest (.SYMBOL_ALIGNMENT) to weakest (built-in assumptions and tracking mechanisms), follows:

  1. .SYMBOL_ALIGNMENT directive
  2. .SET_REGISTERS directive
  3. /UNALIGN qualifier
  4. Built-in assumptions and tracking mechanisms
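
The precedence order can be read as a first-match lookup. The following Python sketch is a hypothetical model for illustration: the dictionary of active controls and the "aligned"/"unaligned" strings are assumptions of this example, not compiler data structures.

```python
# Controls listed from strongest to weakest, as in the list above.
PRECEDENCE = [".SYMBOL_ALIGNMENT", ".SET_REGISTERS", "/UNALIGN", "built-in"]

def effective_alignment(active_controls: dict) -> tuple:
    """Return (control, assumption) for the strongest control that applies.

    active_controls maps a control name to the assumption it makes for
    one memory reference, for example {"/UNALIGN": "unaligned"}.
    """
    for control in PRECEDENCE:
        if control in active_controls:
            return (control, active_controls[control])
    raise ValueError("no alignment control applies")
```

For example, if /UNALIGN is in effect but a .SYMBOL_ALIGNMENT directive covers the symbolic offset used by a reference, the model reports the .SYMBOL_ALIGNMENT assumption for that reference.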

4.1.4 Recommendations for Aligning Data

The following recommendations are provided for aligning data:

4.2 Code Flow and Branch Prediction

The Alpha and Itanium architectures are pipelined, which means that before completing the current instruction, they start to execute several instructions beyond it. By tailoring the code to keep the pipeline filled, you can make the code run significantly faster.

On each conditional branch, the Alpha and Itanium architectures attempt to predict whether or not the branch is taken so that they can correctly fill the instruction pipeline with the next instruction to be executed. The Alpha architecture predicts that forward conditional branches will not be taken and backward conditional branches will be taken. The Itanium architecture has branch-prediction hints in the branch instructions. A mispredicted branch costs extra time because the pipeline must be flushed, and, in addition, the instruction at the branch destination may not be in the instruction cache.

The compiler tries to follow the flow of the VAX MACRO code to generate Alpha and Itanium code that has the most common code path in a contiguous block, so that the pipelined Alpha and Itanium architectures can process the code with the greatest efficiency. However, in some situations, the compiler's default rules do not generate the most efficient code. In performance-sensitive code sections, you can often improve the efficiency of the generated code by giving the compiler information about which code paths are most likely to be taken.
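
The Alpha prediction rule described above (forward conditional branches predicted not taken, backward ones predicted taken) can be modeled to show why code layout matters. This Python sketch is illustrative only; the function names and the addresses in the usage note are hypothetical.

```python
def alpha_predicts_taken(branch_addr: int, target_addr: int) -> bool:
    """Static rule from the text: backward branches are predicted taken,
    forward branches are predicted not taken."""
    return target_addr <= branch_addr

def count_mispredictions(branch_addr: int, target_addr: int, outcomes) -> int:
    """Count how often the static prediction disagrees with actual outcomes."""
    predicted = alpha_predicts_taken(branch_addr, target_addr)
    return sum(1 for taken in outcomes if taken != predicted)
```

A loop-closing backward branch that is taken nine times and then falls through is mispredicted once. A forward branch to a rarely used error path that is never taken is mispredicted zero times. A forward branch that is taken on every pass is mispredicted every time, which is the case that placing the common path in a contiguous block avoids.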

