Compiler Conventions

To write assembly language procedures that can both call and be called from high-level language procedures, you must understand the standard procedure-calling convention and other compiler conventions.

On many computer systems, each high-level language has its own calling convention. Consequently, calls from one language to another are sometimes difficult to arrange, except through assembly code. On such systems, the architecture generally prescribes very few operations that must be done to effect a procedure call, and there is often a pair of machine-language instructions to call a procedure and return from one. The PA-RISC architecture provides no special procedure call or return instructions.

There is, however, a standard procedure-calling convention for all high-level languages as well as the Assembler. It is tuned for the architecture, and is designed to make a procedure call with as few instructions as possible.

Besides defining a uniform call and return sequence for all languages, the calling convention is important for other reasons. In order to streamline the calling sequence, the return link is not saved on the stack unless necessary and the previous stack pointer is rarely saved on the stack. Therefore, it is not usually possible to obtain a stack trace at an arbitrary point in the program without some additional static information about each procedure's stack frame size and usage.

Without that information, you could not obtain a stack trace while debugging, while analyzing a core dump, or when using the TRY/RECOVER feature in HP Pascal/HP-UX. The stack unwind mechanism makes stack traces possible. It uses special unwind descriptors that contain the exact static information needed for each procedure. These descriptors are generated automatically by the linker, based on information provided by all high-level compilers as well as the Assembler.

Each descriptor contains the starting and ending addresses of a procedure's object code, that procedure's stack frame size, and a few flags indicating, among other things, whether the return link is saved on the stack. Given the current program counter and stack pointer, the stack unwind mechanism can determine the calling procedure by finding the return link either in a register or on the stack. It can also determine the previous stack pointer by subtracting the current procedure's stack frame size from the current stack pointer.
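The lookup described above can be modeled in a few lines of C. This is only an illustrative sketch: the structure layout, field names, and function names are hypothetical and do not reflect the actual unwind table format used by the linker.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified unwind descriptor. The field names and layout
   are illustrative only, not the real HP-UX unwind table format. */
typedef struct {
    uint32_t start;       /* address of the procedure's first instruction */
    uint32_t end;         /* address of the procedure's last instruction  */
    uint32_t frame_size;  /* stack frame size in bytes                    */
    int      rp_saved;    /* nonzero if the return link is on the stack   */
} unwind_desc;

/* Return the descriptor whose code range contains pc, or NULL if none does. */
const unwind_desc *find_desc(const unwind_desc *tab, size_t n, uint32_t pc)
{
    for (size_t i = 0; i < n; i++) {
        if (pc >= tab[i].start && pc <= tab[i].end)
            return &tab[i];
    }
    return NULL;
}

/* The caller's stack pointer is the current stack pointer minus the
   current procedure's frame size. */
uint32_t previous_sp(const unwind_desc *d, uint32_t sp)
{
    return sp - d->frame_size;
}
```

Repeating the two steps (find the descriptor for the current program counter, then recover the caller's stack pointer and return link) walks the stack one frame at a time. The `rp_saved` flag tells the walker whether to read the return link from the frame or from a register.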

The Assembler requires that you follow certain programming conventions so that unwind descriptors can be generated. The beginning and end of each procedure must be marked with the .PROC and .PROCEND directives. The .CALLINFO directive supplies additional information about the procedure, including the stack frame size. The Assembler passes this information to the linker, which creates the unwind descriptor. The Assembler can also generate the standard entry and exit code to create and destroy the stack frame, save and restore the return link (if necessary), and save and restore any necessary registers. These code sequences are generated at the points indicated by the .ENTER and .LEAVE pseudo-operations. For a more thorough discussion of programming conventions, refer to the 64-bit Runtime Architecture for PA-RISC 2.0, at URL: http://www.software.hp.com/STK/.

Arguments to procedures are loaded into general registers 26, 25, 24, and 23; these registers are named, respectively, %arg0, %arg1, %arg2, and %arg3. If more than four words of arguments are required, the remaining arguments are stored in the caller's stack frame in the variable argument list. The return value should be returned in general register 28, called %ret0. General register 29, called %ret1, holds the low-order bits of a double-word return value, while %ret0 holds the high-order bits. In addition to the argument and return registers, the procedure can use registers 19 through 22 and registers 1 and 31 as scratch registers. Any other general registers must be saved at entry before use and restored before exit.
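The register assignments in the preceding paragraph can be summarized as a small C model. The function names are hypothetical; the sketch only encodes the register numbers stated above and says nothing about how multi-word arguments are aligned.

```c
/* General register that carries argument word `index` (0-based), or -1
   when the word is passed in the caller's stack frame instead. */
int arg_register(int index)
{
    if (index < 0 || index > 3)
        return -1;
    return 26 - index;   /* %arg0 = r26, %arg1 = r25, %arg2 = r24, %arg3 = r23 */
}

/* Nonzero if a procedure may use general register `reg` without saving it
   first: argument registers, return registers, r19 through r22, r1, r31. */
int is_scratch_register(int reg)
{
    if (reg >= 23 && reg <= 26) return 1;   /* %arg3 .. %arg0 */
    if (reg == 28 || reg == 29) return 1;   /* %ret0, %ret1   */
    if (reg >= 19 && reg <= 22) return 1;
    return reg == 1 || reg == 31;
}
```

For example, `arg_register(2)` yields 24 (%arg2), while `arg_register(4)` yields -1 because the fifth argument word is passed on the stack.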

Chapter 4 “Assembler Directives and Pseudo-Operations” contains detailed descriptions of the Assembler directives described above. For a more thorough discussion of the procedure calling conventions, refer to the topic PA-RISC Architecture at URL: http://www.software.hp.com/STK/.

In order for an assembly language procedure to be callable from another language or another assembly language module, the name of the procedure must be exported. The .EXPORT directive does this. It also allows you to declare the symbol type. For procedure entry points, the symbol type should be ENTRY.

The Assembler and linker treat all symbols as case-sensitive, while some compilers do not. By convention, compilers that are case-insensitive uniformly convert all exported names to lower case. Therefore, you can declare an assembly language procedure that cannot conflict with HP Pascal/HP-UX procedure names by using uppercase letters in its name. However, some compilers provide an aliasing mechanism that allows you to declare a case-sensitive name for external use. See the appropriate language reference manual for more information.
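The effect of the lower-case convention can be illustrated with a short C sketch. The helper names are hypothetical, and the 64-byte buffer is an arbitrary limit chosen for the example, not a limit of any HP tool.

```c
#include <ctype.h>
#include <string.h>

/* Copy `name` into `out`, folded to lower case, the way a case-insensitive
   compiler is described as exporting its names. Truncates to fit outsz. */
void fold_name(const char *name, char *out, size_t outsz)
{
    size_t i = 0;
    if (outsz == 0)
        return;
    for (; name[i] != '\0' && i + 1 < outsz; i++)
        out[i] = (char)tolower((unsigned char)name[i]);
    out[i] = '\0';
}

/* Nonzero if an Assembler symbol would match (case-sensitively, as the
   linker compares names) the exported form of a compiler source name. */
int names_collide(const char *assembler_name, const char *source_name)
{
    char folded[64];   /* arbitrary limit for this sketch */
    fold_name(source_name, folded, sizeof folded);
    return strcmp(assembler_name, folded) == 0;
}
```

An Assembler symbol such as `READ_PORT` never collides, because every folded name is entirely lower case, whereas `read_port` collides with a source-level `Read_Port`.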

Conversely, the .IMPORT directive allows you to reference a procedure name that is exported from another module, either from the Assembler or the compiler. Once a procedure name has been imported, it can be referenced exactly as if it were declared in the same module.

Data symbols can be exported and imported just like procedure names. However, not all compilers export the names of global variables, or provide a mechanism to reference data symbols exported from an assembly language module. For example, the HP Pascal/HP-UX compiler does not normally do this, while the HP C/HP-UX compiler does. HP FORTRAN 77/HP-UX named common blocks are exported, but the names of the variables within the common blocks are not.

As mentioned earlier, data is allocated beginning at virtual space offset 0x40000000. For convenience, as well as for compatibility with future releases of HP-UX systems, all data in the $PRIVATE$ space must be accessed relative to general register 27, called %dp. Standard run-time start-up code, from the file /usr/ccs/lib/crt0.o, must be linked with every program. This start-up code declares a global symbol called $global$ in the $GLOBAL$ subspace and loads the address of this symbol into the %dp register before beginning program execution. This register must not be changed during the execution of a program. Because the %dp register is known to contain the address of $global$, the following single instruction loads the variable var as long as the displacement from $global$ to the desired location is less than 8 kilobytes:

LDW var-$global$(%dp),%r3

If the desired location is not known to be close enough to $global$, use the following sequence:

Example 3-1 Global Symbol Usage

ADDIL    L'var-$global$,%dp            ;result in r1
LDW      R'var-$global$(%r1),%r3
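The arithmetic behind the two forms can be sketched in C. The sketch assumes the usual definitions of the field selectors (L' keeps the left 21 bits of a 32-bit value, R' keeps the right 11 bits) and assumes that the short form is limited by a signed 14-bit displacement field. The function names are hypothetical.

```c
#include <stdint.h>

/* L' selector (assumed definition): the value with its low-order 11 bits
   cleared, as added to %dp by ADDIL. */
uint32_t sel_left(uint32_t x)
{
    return x & ~UINT32_C(0x7FF);
}

/* R' selector (assumed definition): the low-order 11 bits, used as the
   displacement in the following LDW. */
uint32_t sel_right(uint32_t x)
{
    return x & UINT32_C(0x7FF);
}

/* Nonzero if the displacement from $global$ fits the single-instruction
   form, assuming a signed 14-bit displacement (-8192 .. 8191). */
int fits_short_form(int32_t disp)
{
    return disp >= -8192 && disp <= 8191;
}
```

Because the two parts do not overlap, `sel_left(x) + sel_right(x)` always equals `x`. The ADDIL and LDW pair therefore reaches the same address that a single full-width displacement would.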

For convenience, the $SHORTDATA$ and $SHORTBSS$ subspaces can be used for small scalar variables. Most scalar variables are close enough to $GLOBAL$ so that the shorter form can be used. Arrays and large structures should be defined in $DATA$ and the long form used.

To access items in the $PRIVATE$ space (global data), the following does not work:

LDIL      L'var,%r1                   ;wrong
LDW       R'var(%r1),%r3              ;wrong

This sequence is wrong because it uses the absolute address of var, and therefore assumes that the operating system always allocates data at the same virtual space offset, 0x40000000.

Thread local storage (TLS) data is accessed relative to control register 27 (%cr27). The contents of %cr27 must first be moved to a general register by using the MFCTL instruction. A symbol, __tp, is defined that plays a role similar to $global$. The following code loads a TLS variable. Note the similarities between this example and Example 3-1 “Global Symbol Usage”.

MFCTL    %cr27,%rx
ADDIL    L'var-__tp,%rx            ;result in r1
LDW      R'var-__tp(%r1),%r3

Uninitialized areas in the data space can be requested with the .COMM (common) request. These requests are always made in the $BSS$ subspace in the $PRIVATE$ space. The $BSS$ subspace should not be used for any initialized data. Common requests are passed on to the linker, which matches up all requests with the same name and allocates a block of storage equal in size to the largest request. If, however, an exported data symbol is found with the same name, the linker treats the common requests as if they were imports.
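The matching rule described above can be modeled in C as follows. This is a sketch of the described behavior only, not of the linker's implementation; the type and function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* One .COMM request as seen by the modeled linker. */
typedef struct {
    const char *name;   /* symbol named in the request */
    size_t      size;   /* number of bytes requested   */
} comm_request;

/* Bytes the modeled linker allocates in $BSS$ for `name`: the largest of
   all requests with that name, or 0 when an exported data symbol with the
   same name exists, because the requests are then treated as imports. */
size_t bss_bytes_for(const comm_request *reqs, size_t n,
                     const char *name, int has_exported_definition)
{
    size_t largest = 0;

    if (has_exported_definition)
        return 0;
    for (size_t i = 0; i < n; i++) {
        if (strcmp(reqs[i].name, name) == 0 && reqs[i].size > largest)
            largest = reqs[i].size;
    }
    return largest;
}
```

With requests of 40 and 400 bytes for the same name, the model allocates a single 400-byte block. If another module exports an initialized symbol of that name, no $BSS$ storage is allocated and both requests resolve to the exported symbol.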

HP FORTRAN 77/HP-UX common blocks are naturally allocated in this way: if a BLOCK DATA subprogram initializes the common block, all common requests are linked to that initialized block. Otherwise, the linker allocates enough storage in $BSS$ for the common block. The HP C/HP-UX compiler also allocates uninitialized global variables this way. In C, however, each uninitialized global is a separate common request.
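As a small illustration in C, the file-scope declarations below differ only in whether they carry an initializer. According to the description above, the HP C/HP-UX compiler would emit a separate common request for each of the first two. The assumption here is that the third, being initialized, is emitted as ordinary initialized data instead. The variable names are hypothetical.

```c
/* No initializer: each of these would be a separate common request,
   resolved by the linker into zero-filled storage. */
int    hits;
double samples[16];

/* Initialized: assumed to be emitted as initialized data rather than
   as a common request. */
int    limit = 10;
```

In either case the C language guarantees that `hits` and every element of `samples` read as zero at program start.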
