As mentioned earlier, there are three types of procedure calls that can be
executed, and they can be classified into two groups: intra-module (local) and
inter-module (external) calls. Basically, a local call is one in which the
caller and the callee reside in the same load module, and an external call is
one in which that is not necessarily the case. There is one exception to this
definition, however: calls which cross privilege level boundaries are always
treated as external calls, even if the caller and callee reside in the same
load module. Although external calls are closely related to local calls,
several notable differences exist in storage and access conventions; these
differences are explained in the following material.
The inter-module (external) calling sequence is distinguished from the
intra-module (local) calling sequence to take advantage of the system-wide
global virtual addressing of the PA-RISC processor and to implement code
sharing. Unlike most conventional systems where each process has a private
virtual address space, all processes in a PA-RISC system share a finite,
although large, number of spaces (2^16 currently, 2^32 potentially). Therefore,
it is undesirable to assign virtual spaces to inactive program images (i.e.
those on disk). Generally, the assignments of virtual spaces to a program will
be delayed until the program is activated.
In order to avoid extensive linking at process creation time, it is desirable
to have a central data structure (for each process) that will hold the SIDs
assigned to the load modules used by the process. This data structure is the
XRT (Inter-Module Cross Reference Table). Only the entries in the XRT need to
be updated when virtual spaces are finally assigned to load modules.
Another use for the XRT is the location of the global data area of a load
module. The offset of the global data area in the private data space of a
process depends on the combination of load modules called by the process.
Although each load module's code is assigned a unique SID, its data is placed
at a process-dependent offset within the process' single private data space.
In order to share a load module between multiple processes, all references to
the global data area must be relative to the base register, DP
(data pointer). The value of DP is stored in a load module's entry in
the XRT. The XRT will be discussed and diagrammed in detail below.
An external call uses the same code sequence as a local call, with the addition
of a "calling stub" (a.k.a. Import Stub) attached to the calling code, and a
"called stub" (a.k.a. Export Stub) attached to the called code. Unlike a local
call, execution is not transferred directly from the caller to the callee; in
an external call, a millicode sequence (CALLX) is employed to locate and
branch to the target procedure. These stubs and the millicode sequence are
discussed in greater detail below.
5.1.1 Requirements of an External Call on MPE XL
An external procedure call requires several extra steps in addition to those
necessary for a local call, as outlined below:
1. The base register pointing to the global data area (DP), the SID
   contained in sr4, and the value in gr2 (RP) must all be saved in the
   caller's allocated frame area. The saved RP value is referred to as
   RP' to distinguish it from the usual RP value associated with a local
   call. The difference between RP and RP' is as follows: RP is renamed
   RP' during the calling stub execution, at which time RP becomes the
   location in the called stub to which the target procedure (callee)
   must return. On return, RP' is renamed back to RP, and DP and the
   SID are restored.
2. The called load module's entry in the XRT must be located.
3. The short pointer from the XRT entry must be loaded into DP (this
   is the DP value for the called load module).
4. The SID from the XRT entry must be loaded into sr4, and the offset
   of the callee must be obtained from the XRT.
5. An external branch and link to the called procedure must be
   performed. (In practice, this is an external branch to the CALLX
   millicode routine, which locates and branches to the called stub of
   the callee; the stub then does a local branch and link to the
   callee. Stubs, CALLX, and linking are discussed further in Section
   5.1.6, Import and Export Stubs.)
5.1.2 Requirements of an External Return
In order to return from an external call, it is required that DP,
sr4, and RP be restored. The saved value of RP' will be
loaded into RP (gr2), and the return to the caller will be via that
register. The DP and SID of the caller will be restored from the
caller-save area.
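The save and restore requirements of Sections 5.1.1 and 5.1.2 can be modelled in a few lines. The following is only an illustrative sketch, not the millicode itself: the `Machine` class, the dictionary form of the XRT entry, and the function names are hypothetical, and the frame slots (SP-32, SP-28, SP-24) are taken from the stub listings in Sections 5.1.7 and 5.1.9.

```python
from dataclasses import dataclass, field

# SP-relative caller-frame slots, as used by the stub listings in this chapter.
SLOT_DP, SLOT_SR4, SLOT_RP = -32, -28, -24


@dataclass
class Machine:
    """Hypothetical model of the registers involved in an external call."""
    dp: int
    sr4: int
    rp: int
    sp: int
    memory: dict = field(default_factory=dict)  # byte address -> 32-bit word


def external_call(m, xrt_entry, stub_return):
    """Save the caller's DP, sr4 and RP', then switch to the callee's DP and SID."""
    m.memory[m.sp + SLOT_DP] = m.dp
    m.memory[m.sp + SLOT_SR4] = m.sr4
    m.memory[m.sp + SLOT_RP] = m.rp      # the saved value is RP'
    m.dp = xrt_entry["dp"]               # DP of the called load module
    m.sr4 = xrt_entry["sid"]             # SID of the called load module
    m.rp = stub_return                   # the callee returns into its called stub
    return xrt_entry["offset"]           # entry offset obtained from the XRT


def external_return(m):
    """Restore DP, sr4 and RP from the caller's frame; return (space, offset)."""
    m.dp = m.memory[m.sp + SLOT_DP]
    m.sr4 = m.memory[m.sp + SLOT_SR4]
    m.rp = m.memory[m.sp + SLOT_RP]      # RP' becomes RP again
    return m.sr4, m.rp
```

The model assumes SP has the same value at the call and at the return, which holds once the callee has deallocated its own frame.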
5.1.3 Control Flow of an MPE XL External Call
Figure 5-1 shows a simplified external procedure call. It uses the same code
sequence as the local call, but a "calling stub" is attached to the calling code
(the caller) and a "called stub" is attached to the entry point of the called
code (the callee). Execution is transferred from the calling code to the called
code by executing a millicode sequence (CALLX) which uses an XRT to locate and
branch to the target procedure. Note: All of these elements of an external call
are covered in more detail in the following subsections.
To read the diagram, follow the arrows and numbers, beginning with number 1.
Figure 5-1 Simplified External Procedure Call
5.1.4 Calling Code
The calling code generated by the compiler to perform a procedure call will be
the same regardless of whether the call is local or external. If the linker
locates the procedure being called within the current executable load module,
it will make the call local by patching the BL instruction to directly
reference the entry point of the procedure. If the linker is unable to locate
the procedure, it will make the call external by attaching a calling stub to
the calling code, and patch the BL instruction to branch to the stub.
Before the call, the calling code must save any caller-saves registers that
contain active data. The parameter list for the callee is stored in the current
stack frame between the register spill area and the frame marker. As in a local
call, the parameter list is stored in reverse order, such that the first word
is at SP-36, the second is at SP-40, etc. Note: the first four words of the
parameter list are passed in registers, but the space for the argument list is
allocated in the stack frame, even if the first four words are unused. Also, by
default all parameters are passed in general registers, with the linker
including any necessary relocation stubs. See Chapter 3, Parameter Relocation.
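The placement of parameter words can be written as a simple offset calculation. This is a minimal sketch that extrapolates the SP-36, SP-40 pattern stated above; the function names are hypothetical.

```python
WORD_SIZE = 4             # bytes per 32-bit word
FIRST_ARG_OFFSET = -36    # first parameter word is at SP-36 (from the text)
REGISTER_ARG_WORDS = 4    # the first four words are passed in registers


def arg_word_offset(index):
    """Return the SP-relative byte offset of parameter word `index` (0-based)."""
    if index < 0:
        raise ValueError("parameter word index must be non-negative")
    return FIRST_ARG_OFFSET - WORD_SIZE * index


def passed_in_register(index):
    """True if the word travels in a register; its stack slot is still reserved."""
    return 0 <= index < REGISTER_ARG_WORDS
```

Under this model the fifth parameter word (index 4) is the first one that is actually stored in memory, at SP-52.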
5.1.5 Called Code
The called code is responsible for allocating a new stack frame on the top of
the stack (the frame must be 64-byte aligned); the actual size of the frame
will be determined by the compiler, and will be the summation of:
The amount of space needed by the register allocator for the
register spill area;
The amount of space needed for the local variables of the current
procedure;
The amount of space needed to store the longest parameter list of
any procedure called by this procedure; and
The 32-byte frame marker.
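The frame size computation can be sketched as follows. The sketch assumes that keeping every frame 64-byte aligned is achieved by rounding the total up to a multiple of 64 bytes; the function name and parameter names are hypothetical.

```python
FRAME_ALIGNMENT = 64     # bytes; frames must be 64-byte aligned (from the text)
FRAME_MARKER_SIZE = 32   # bytes; fixed-size frame marker (from the text)


def frame_size(spill_bytes, locals_bytes, max_outgoing_args_bytes):
    """Sum the four frame components and round up to the frame alignment."""
    raw = (spill_bytes + locals_bytes
           + max_outgoing_args_bytes + FRAME_MARKER_SIZE)
    # Round up to the next multiple of FRAME_ALIGNMENT (assumed policy).
    return (raw + FRAME_ALIGNMENT - 1) // FRAME_ALIGNMENT * FRAME_ALIGNMENT
```

For example, 8 bytes of spill area, 20 bytes of locals and a 16-byte outgoing argument area total 76 bytes with the marker, which this model rounds up to 128.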
If this procedure is callable by a less-privileged procedure, each page of the
stack frame must be PROBEd (a privilege-checking mechanism) before any
information is stored into the frame. The PROBE instructions must be generated
by the compiler. (This is not currently implemented.)
When a procedure is entered, gr2 (RP) will contain the offset portion of the
return address. Whether the procedure was called locally or externally, the two
low order bits of RP will always contain the execution level of the caller.
From the source code level it is difficult, if not impossible, to determine if
RP is valid or if it has been stored into the stack frame. Therefore, compilers
that support multiple privilege levels need to provide a mechanism for
returning the execution level of the caller.
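Because the caller's execution level occupies the two low-order bits of RP, a mechanism for returning it reduces to a mask. A minimal sketch, with hypothetical function names:

```python
PRIV_MASK = 0x3  # the two low-order bits of RP hold the execution level


def caller_execution_level(rp):
    """Extract the caller's execution level (0-3) from a saved RP value."""
    return rp & PRIV_MASK


def return_offset(rp):
    """Return the word-aligned offset portion of RP with the level bits cleared."""
    return rp & ~PRIV_MASK & 0xFFFFFFFF
```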
If the current procedure calls another procedure, RP must be saved sometime
before the call, usually in the procedure entry sequence. The called code is
responsible for ensuring the validity of its own input. sr5, sr6, and sr7 can
only be set by privileged code and, therefore, can be assumed to be correct at
all times. In addition, DP, LP (linkage pointer), and sr4 are either left
unchanged (during a local call) or set by the procedure call millicode (during
an external call), and therefore may also be assumed to be correct. The value
of SP and the parameters passed must be validated by the called code.
Only those fields necessary in the frame marker of the current procedure will
contain valid data; others will be undefined. For example, during a local call,
the contents of the external return link pointer field will be undefined.
5.1.6 Import and Export Stubs
As previously mentioned, the compiler only generates a single type of procedure
call (local), a characteristic that is made possible by the use of stubs. Stubs
are pieces of code that are attached to the caller and/or callee that enable
the original calling and called code to remain unchanged through the external
call process. There are two types of stubs used in this procedure calling
convention: Import (calling) and Export (called). These are defined here, and
explained in detail in the following sections.
Import Stub (Calling Stub) – a locally-linked stub that
enables inter-module and OS/Subsystem calls to appear (to the
compiler) as local calls. If the linker determines that a call is
external, it will attach a calling stub to the procedure and will
patch the BL (Branch and Link) instruction to branch to the stub.
There is usually one calling stub for each procedure (and privilege
level combination) referenced in the module (which can accommodate
all calls to a specific procedure), but it is possible to have a
separate stub for each CALL to a procedure.
Export Stub (Called Stub) – enables a called routine to
avoid the problem of having multiple return sequences (i.e. different
for local and external calls). There is one called stub for each
external procedure of a load module. Inter-module calls will enter
the called stub, which in turn will enter the called procedure
(callee). Thus, the callee can return to its called stub (which is
local) rather than being concerned with the external return. The
calling stubs can be generated by the linker (or obtained from a
"stub library") and then linked to their respective routines.
5.1.7 Import Stubs
The import stub will load gr1 with a pointer to the procedure XRT table entry
(XRT pointer) of the called procedure and then branch to an external procedure
call millicode sequence. Since the location of the XRT for a load module may be
different for separate executions of the load module, the XRT entry pointer
will be computed in the import stub. The XRT entry pointer is computed by
adding the XRT entry offset to the value of the LP, which is stored at DP-4,
pointing to the base of the XRT for the current load module.
For permanently bound calls to the operating system, an import stub is not
necessary; instead, the BL instruction in the call is replaced with a BLE
instruction that branches to a system entry point branch table. This eliminates
much of the linking that is normally performed when a load module is loaded.
(This is currently not implemented.)
Although the external procedure call diagram (Figure 5-5) shows that DP, RP',
and sr4 are saved by the CALLX millicode (see next section for discussion of
CALLX), DP and RP' will actually be saved in the import stub, and sr4 will be
copied to a general register in the import stub. This is done to eliminate two
interlocks and fill a branch delay slot that would otherwise be left unused.
The code sequence of the calling stub will be similar to that shown below:
        LDW     -4(DP),gr1          ; Load LP
        STW     DP,-32(SP)          ; Save DP
      * ADDIL   L'XRToff,gr1        ; Add XRT offset to LP
      * LDO     R'XRToff(gr1),gr1   ;
        LDW     16(gr1),gr20        ; Load address of CALLX
        STW     RP,-24(SP)          ; Save RP'
        BE      0(sr7,gr20)         ; Branch to CALLX
        MFSP    sr4,gr21            ; Move sr4 to gr21

* Instructions marked with an asterisk can be eliminated by the linker in cases
where they would effectively be NOPs. In these cases the total size of the stub
is padded to 8 words because unwind descriptors assume fixed-length stubs.
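The address arithmetic performed by the ADDIL/LDO pair can be modelled as follows. This sketch assumes the usual PA-RISC assembler meaning of the field selectors (L' yields the value with its low 11 bits cleared, R' yields the low 11 bits); the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF
RIGHT_BITS = 11                       # R' selects the low 11 bits
RIGHT_MASK = (1 << RIGHT_BITS) - 1


def left_part(value):
    """L' field selector: the value with its low 11 bits cleared."""
    return value & ~RIGHT_MASK & MASK32


def right_part(value):
    """R' field selector: the low 11 bits of the value."""
    return value & RIGHT_MASK


def xrt_entry_pointer(lp, xrt_offset):
    """Model ADDIL L'XRToff,gr1 followed by LDO R'XRToff(gr1),gr1."""
    gr1 = (lp + left_part(xrt_offset)) & MASK32    # ADDIL
    gr1 = (gr1 + right_part(xrt_offset)) & MASK32  # LDO
    return gr1


def addil_is_nop(xrt_offset):
    """ADDIL adds nothing when the left part of the offset is zero."""
    return left_part(xrt_offset) == 0


def ldo_is_nop(xrt_offset):
    """LDO adds nothing when the right part of the offset is zero."""
    return right_part(xrt_offset) == 0
```

In this model an XRT offset below 2048 makes the ADDIL redundant, which is one plausible case in which the linker could drop an instruction.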
5.1.8 External Procedure Call Millicode (CALLX)
The CALLX millicode sequence is primarily a transition mechanism that
facilitates the successful location and access of the desired external routine.
The address of the CALLX routine is obtained from the XRT entry, and is
assigned by the loader. Seven variations of CALLX are available, depending on
the possible privilege promotions. CALLX is called from the calling stub, and
operates as follows:
1. Saves DP, RP', and sr4 (if necessary).
2. Alters the privilege level if necessary (Gateway).
3. Checks the XRT pointer to ensure that it points to a valid XRT
   entry.
4. Loads the LP, DP, Offset and sr4 (of new procedure) values.
5. Branches to the called stub in the external module.
5.1.9 Export Stubs
An export stub is used to allow the compiler to generate the same exit code
sequence regardless of whether the procedure will be called locally or
externally. If the linker determines that a procedure can be called from
another load module, it will attach a called stub to the procedure. The stub
will be the external entry and return point for the procedure. Local calls to
the procedure will be unaffected.
Although the stub is the external entry point, its primary purpose is to be
executed during an external procedure call exit/return. The stub is entered
before the procedure so that RP can be set to the address of the stub, which
will cause the local return in the procedure to exit to the stub. When the stub
is executed during the return, it will restore DP and sr4, and return to the
caller.
The stub executes at the caller's execution level.
The code sequence for the called stub will be similar to the following:
        BL      disp,gr2            ; Branch to local entry point
        DEP     gr31,31,2,gr2       ; Deposit caller's Exec. Level in link
        LDW     -28(SP),gr21        ; Restore sr4 (part 1)
        LDW     -24(SP),gr2         ; Restore return address (RP')
        MTSP    gr21,sr4            ; Restore sr4 (part 2)
        BE      0(sr4,gr2)          ; Branch back to caller
        LDW     -32(SP),gr27        ; Restore DP
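The effect of the DEP instruction in the delay slot of the BL can be illustrated with plain bit operations. The sketch assumes, as the listing's comment indicates, that gr31 carries the caller's execution level in its two low-order bits; the function name is hypothetical.

```python
MASK32 = 0xFFFFFFFF
LEVEL_MASK = 0x3  # two low-order bits (bits 30 and 31 in PA-RISC numbering)


def deposit_exec_level(link, gr31):
    """Model DEP gr31,31,2,gr2: copy the low two bits of gr31 into the link."""
    return (link & ~LEVEL_MASK & MASK32) | (gr31 & LEVEL_MASK)
```

The link value keeps its address bits, so the callee's local return still lands in the stub, while the low two bits report the caller's level as they would for a local call.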
5.1.10 Inter-Module Cross Reference Table (XRT)
The Cross Reference Table (XRT) is used to link the external procedure calls of
a load module. Every process has an XRT area reserved from its process space
(pointed to by sr5). The table contains a sub-table for each load module
referenced during process execution. Each sub-table for a load module contains
entries for all the procedures called by that load module. A sample XRT is
shown in Figure 5-2 (in this example, the process has two load modules: 'A'
and 'B'. 'A' calls procedures B1, B2 and B3; 'B' calls procedures A1 and A2).
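The size and position of each sub-table in this example can be worked out from the eight-word header and eight-word entries described in Sections 5.1.12 and following. The sketch below assumes that sub-tables are laid out contiguously and that entries immediately follow the header; neither detail is stated in the text, and the function name is hypothetical.

```python
WORD = 4            # bytes per word
HEADER_WORDS = 8    # sub-table header
ENTRY_WORDS = 8     # one entry per called procedure


def layout_xrt(calls, base=0):
    """Return {module: (sub-table start, {procedure: entry address})}."""
    layout = {}
    cursor = base
    for module, procedures in calls.items():
        start = cursor
        cursor += HEADER_WORDS * WORD        # skip the header
        entries = {}
        for name in procedures:
            entries[name] = cursor           # this procedure's entry
            cursor += ENTRY_WORDS * WORD
        layout[module] = (start, entries)
    return layout


# The example from the text: A calls B1, B2, B3; B calls A1, A2.
xrt = layout_xrt({"A": ["B1", "B2", "B3"], "B": ["A1", "A2"]})
```

Under these assumptions, A's sub-table occupies 128 bytes starting at offset 0 (entries at 32, 64 and 96), and B's sub-table starts at offset 128 (entries at 160 and 192).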
5.1.11 The Layout of the XRT
One XRT might be visualized as shown in Figure 5-2. (This diagram corresponds
to the calling situation described in the last sentence of the previous
section.)
Figure 5-2 Layout of the XRT
5.1.12 Sub-table Header
Each sub-table of a load module in the XRT contains an eight word header used
to locate unwind tables for the module. See Chapter 7, Stack Unwinding. The
sub-table header, as shown in Figure 5-3, contains the following information:
1. 2. 3. 4. (The first four words are presently undefined and are
   reserved for future use.)
5. The starting address of the Unwind table.
6. The starting address of the Linker Stub Unwind table.
7. The starting address of the Recover table.
8. The starting address of the Auxiliary Unwind table.
Figure 5-3 Sub-table Header
If a load module contains no external references, its sub-table in the XRT will
contain only the header.
An entry for a procedure within a sub-table of a load module in the XRT (e.g.
the entry for B1) is eight words long, and contains the following information:
1. The SID of the module to which it belongs.
2. The entry offset for the procedure. This is a 32-bit offset, and
   is the address of the entry point (relative to the base of the SID of
   its load module) of the procedure's called stub. The last two bits
   (30 and 31) of this word must be zero in order to ensure word
   alignment of the address.
3. The DP value for the load module to which it belongs (the value of
   the base register pointing to the load module's global data
   area).
4. The LP value of the module in which the called procedure is
   contained. This is a pointer to the beginning of the XRT sub-table of
   that load module.
5. The address of the CALLX millicode routine.
6. 7. 8. (These three words are presently undefined and are reserved
   for future use.)
The XRT entry for procedure B1 (which is called from A and would appear in A's
sub-table of the XRT) is shown in Figure 5-4.
Figure 5-4 One XRT Entry
** The last two bits of the Entry Point Offset must be set to zero in order to
ensure word alignment.
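The eight-word entry can be expressed as a fixed binary layout. This is a sketch only: it assumes the five defined words appear in the order listed above (which agrees with the import stub loading the CALLX address from byte offset 16) and that the reserved words are zero. The function names are hypothetical.

```python
import struct

# Eight big-endian 32-bit words; the last three are reserved.
XRT_ENTRY = struct.Struct(">8I")
CALLX_BYTE_OFFSET = 16  # matches LDW 16(gr1),gr20 in the import stub


def pack_entry(sid, entry_offset, dp, lp, callx):
    """Build one 32-byte XRT entry, rejecting a misaligned entry offset."""
    if entry_offset & 0x3:
        raise ValueError("entry offset must be word aligned (bits 30-31 zero)")
    return XRT_ENTRY.pack(sid, entry_offset, dp, lp, callx, 0, 0, 0)


def unpack_entry(raw):
    """Decode a 32-byte XRT entry into its five defined fields."""
    sid, entry_offset, dp, lp, callx, *_reserved = XRT_ENTRY.unpack(raw)
    return {"sid": sid, "entry_offset": entry_offset,
            "dp": dp, "lp": lp, "callx": callx}
```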
5.1.13 Linkage Pointer
A single value, the Linkage Pointer, resides in the word directly below the
global data area of a load module, at the location pointed to by DP-4. This
pointer is private to the load module, and is a short pointer to the beginning
of the load module's XRT sub-table. An entry for a called procedure in the XRT
is pointed to as follows:
1. The LP points to the beginning of the sub-table in the XRT of the
   load module containing the calling procedure.
2. The import stub for the caller has the offset to the called
   procedure's entry relative to the XRT sub-table of the caller's load
   module. This offset, added to the LP value, provides a pointer to the
   called procedure's entry. This LP-relative XRT offset is assigned by
   the linker.
3. The reason for the indirection employed by using the LP is that
   load modules can be shared by different processes whose XRTs may also
   be different. To allow the same code to reference the same load
   module in different processes' XRTs, it is necessary to provide a
   uniform interface to the XRT entries; this is provided by the
   LP.
In addition to the XRT area in the process space, there is an XRT area in the
system space (pointed to by sr7) that is reserved for the XRTs of system load
modules. Like any other load module, a system load module also uses LP to
locate its XRT. The system XRT area can also contain a special XRT that is used
for calling system procedures by intrinsic number.
5.1.14 System Security and the XRT
The XRT will be set up by the loader. The values in the XRT will be supplied by
the loader, based on mapping the files relevant to the process into virtual
memory (i.e. SID allocation, the data offsets in sr5 space, etc.). The linker
may provide some of the values that are to be contained in sr5, based on the
information it may have at link time concerning the specific load modules that
are involved in the process' executable image.
When a process is loaded, the loader will protect all the pages in the XRT to
read level 3, write level 0. Although it is not necessary, the process
protection ID will be assigned to the pages of the process XRT area. Since the
LP is located with the process' global data area, its protection is the same as
that load module's global data area. It is not necessary to validate LP because
it will be set by CALLX at every external call or privilege level change.
The XRT of every process and the system XRT must be at the same offset of their
corresponding quadrant, and every XRT must be the same length. These two
restrictions allow the procedure call millicode (CALLX) to use a very simple
masking algorithm to perform bounds checking on any XRT pointer used with an
external call. The location and size of the area can be changed when the system
is restarted, but the new values must be reflected in the procedure call
millicode (because it uses constant values to do bounds checking on the XRT
entry pointer).
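A masking check of the kind described above might look like the following. Every constant here is hypothetical: the sketch assumes 1 GB quadrants selected by the top two bits of a 32-bit short pointer, and an XRT area whose length is a power of two and whose offset is a multiple of that length. The actual CALLX check is not given in this document.

```python
QUADRANT_BITS = 30                          # assumed: 1 GB quadrants
QUADRANT_OFFSET_MASK = (1 << QUADRANT_BITS) - 1
XRT_AREA_OFFSET = 0x00100000                # hypothetical offset within a quadrant
XRT_AREA_SIZE = 0x00010000                  # hypothetical length (power of two)


def xrt_pointer_in_bounds(pointer):
    """Check that a short pointer falls inside the fixed XRT area of its quadrant."""
    offset = pointer & QUADRANT_OFFSET_MASK     # discard the quadrant selector
    area = offset & ~(XRT_AREA_SIZE - 1)        # discard the offset within the area
    word_aligned = (pointer & 0x3) == 0
    return area == XRT_AREA_OFFSET and word_aligned
```

Because the process and system XRT areas share the same offset and length, one pair of constants serves both quadrants, which is why the restrictions above keep the check cheap.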
5.1.15 Interface Between Import and Export Stubs
The exact distribution of all operations between the import and export stubs,
and whether the export stub uses a centralized system routine to accomplish
these tasks, is not architected. Much more important, and specified in detail,
is the work that the export stub expects to have been done before it is entered.
Adhering to these requirements facilitates the loading of DP, LP, and SID
(if desired) of the called load module. These requirements are as follows:
gr1 must contain a pointer to the called procedure's XRT entry.
(This is actually the called procedure's XRT entry in the sub-table
of the calling load module.)
Recall that the caller's LP points to the caller's XRT sub-table,
which contains entries for all of the routines that may be called.
The offset into that sub-table, which indexes to the called routine's
entry, is bound as an immediate in the import stub. The pointer to
the specific XRT entry is calculated as follows:
(caller's LP value) + (offset to called procedure's entry) = (pointer
to callee's entry in XRT sub-table of caller)
This pointer is the value that should be found in gr1 when the export
stub is entered.
The SID of the called load module must be loaded into sr4. (The
export stub is free to check the validity of that SID, to reload it,
or to leave it as is. The important point is the assumption that it
has already been loaded, and DOES NOT need to be checked.)
5.1.16 Summary of an External Procedure Call
Figure 5-5 shows a detailed picture of the flow of control associated with an
external procedure call.
To read the diagram, begin at the upper left-hand corner ("Calling Code"), and
read downward; whenever an arrow extends from a line, follow it, and continue
downward in the box from the point where the arrow ends.
Figure 5-5 Summary of External Procedure Call
5.1.17 Dynamic Linking
Dynamic linking is the process of run-time linking to routines. Dynamic linking
is required when the target procedure is unknown at compile time, or the target
of a procedure call can change while the code is executing. Dynamic linking is
carried out through explicit protocols (e.g. the HPGETPROCPLABEL intrinsic on
MPE XL).
If the dynamically linked routine resides in a load module that has not yet
been loaded, the load module is loaded dynamically. In order to dynamically
load a load module, a global data area for it may need to be allocated in sr5
space. This data space is allocated by the loader, and may be allocated from
any unused virtual space in sr5.
5.1.18 Procedure Labels
A procedure label is a specially-formatted variable that is used to link
dynamic procedure calls. The format of a procedure label is shown below:
Procedure Label
The X field in the address section of the procedure label is the XRT flag,
which is used by compilers to determine if the procedure label is local (off)
or external (on). In the case of a local procedure label, the address part will
be a pointer to the entry point of the procedure, while in the external case,
the address part will be a pointer to an XRT entry for the procedure.
For external procedure labels, the linker supplies an LP offset. Generated code
must conditionally add LP to this value (only when the X field is on) to produce
an XRT address.
The L field in the address part of the procedure label is the shared library
flag (used only on HP-UX).
The C field is mentioned for completeness. It is only relevant when the X field
is on and it is used to indicate a Compatibility Mode procedure label. CM
plabels are never called directly from native code.
The following is an example of the code generated by compilers to obtain a
procedure label (plabel) for a procedure:
        LDIL    L'func,1        ; get the address of the function
        LDO     R'func(1),31    ;
        EXTRU,= 31,31,1,19      ; check the X-field to determine
                                ; if XRT address
        LDW     -4(0,27),19     ; if it's an XRT address
        ADD     31,19,20        ; add in LP
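The nullification trick in this sequence is easier to follow in a higher-level model. The sketch below mirrors the listing instruction by instruction; it takes the X flag to be the least significant bit because the EXTRU extracts bit 31, and the function name is hypothetical.

```python
MASK32 = 0xFFFFFFFF
X_FLAG = 0x1  # bit 31 in PA-RISC numbering is the least significant bit


def materialize_plabel(func_word, lp):
    """Model the LDIL/LDO/EXTRU,=/LDW/ADD sequence from the listing."""
    gr31 = func_word & MASK32        # LDIL + LDO: linker-supplied value
    gr19 = gr31 & X_FLAG             # EXTRU,= 31,31,1,19
    if gr19 != 0:                    # the LDW is nullified when the X flag is 0
        gr19 = lp & MASK32           # LDW -4(0,27),19: load LP from DP-4
    return (gr31 + gr19) & MASK32    # ADD 31,19,20
```

When the X flag is off, gr19 holds zero and the ADD leaves the local address unchanged; when it is on, LP is added and the X flag remains set in the result.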
In the current implementation on MPE XL, the L-field is never turned "on" and
is, therefore, effectively unused. In the future, either the specification or
the implementation may change to use this field.
The dynamic procedure call millicode, $$dyncall, determines if a
procedure label is local or external, and takes the appropriate action. (A
local procedure label can only be used to call procedures within the current
load module.) The following pseudo-code sequence demonstrates the process used
for dynamic calls. Note the similarity between this sequence and the import
stub sequence:
    IF (X-field in Plabel) = 0 THEN
        Branch Vectored using Plabel
    ELSE BEGIN
        Clear X-field;
        Save DP;
        Load address of CALLX;
        Save RP';
        Move sr4 to gr21;
        Branch to CALLX;
    END.
The X and L flags must be zero during an external call, or they will cause a
misaligned data reference trap when accessing the XRT. (As mentioned earlier,
the L flag is currently unused on MPE XL, so it is assumed to be zero.)
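The decision made by the dynamic call sequence, including the alignment requirement on the flags, can be sketched as follows. The bit positions are assumptions (X as the least significant bit, L as its neighbour), the register saves are omitted, and the function name is hypothetical.

```python
MASK32 = 0xFFFFFFFF
X_FLAG = 0x1   # assumed position, per the EXTRU in the plabel listing
L_FLAG = 0x2   # assumed to be the neighbouring low-order bit


def dyncall_target(plabel):
    """Return ("local", address) or ("external", XRT entry pointer)."""
    if (plabel & X_FLAG) == 0:
        return ("local", plabel)                 # Branch Vectored using Plabel
    xrt_pointer = plabel & ~X_FLAG & MASK32      # Clear X-field
    if xrt_pointer & (X_FLAG | L_FLAG):
        # A set L flag would leave the XRT pointer misaligned.
        raise ValueError("misaligned XRT pointer: L flag must be zero")
    return ("external", xrt_pointer)             # continue via CALLX
```

The explicit error stands in for the misaligned data reference trap that the hardware would raise when the XRT is accessed.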
An external procedure label can be used in conjunction with the external
procedure call millicode to call any procedure within the process or the
operating system (subject to XLeast checking to ensure adequate execution
level). The procedure call millicode only uses the address part of the
procedure label, but it may point to either the process or system XRT.
The intrinsic HPGETPROCPLABEL may be used to get an external procedure label
for any level 1 procedure in a process. If the compiler or linker determines
the need for an external label, it is communicated to the loader by a normal
import request or an explicit call to HPGETPROCPLABEL.
Although a procedure label pointing to a system XRT entry is valid for all
processes, it will be unloaded when its reference count drops to zero.
Therefore, these procedure labels should not be considered as global procedure
labels. The procedure HPGETSYSPLABEL will return a global procedure label for
any procedure in a system load module, but it requires privilege level 1 to be
called.