Modern programming technique encourages programmers to practice well-structured
decomposition, which entails the use of a greater number of smaller, more
specialized procedures rather than larger, more complex routines. While this
creates more adaptable and understandable programs, it also increases the
frequency of procedure calls, thus making the efficiency of the procedure
calling convention crucial to overall system performance.
Many modern machines provide instructions that perform much of the work
needed to make a procedure call, but this is not the case in Precision
Architecture RISC (PA-RISC). Instead of using an architected mechanism, the
procedure call is accomplished through a software convention that uses the
machine's simple hardwired instructions, a solution that ultimately provides
more flexibility and efficiency than adding more complex (microcoded)
instructions to the instruction set.
Besides the obvious branch-and-return interruption that occurs in the flow of
control as a result of a procedure call, many other provisions must be made in
order to achieve an effective calling convention. The call mechanism must also
pass parameters, save the caller's environment, and establish an environment
for the called procedure (the callee). The procedure return
mechanism must restore the caller's previous environment and save any return
values.
Although PA-RISC machines are essentially register-based, by convention a stack
is necessary for data storage. As a basis for discussion of the Procedure
Calling Convention, we will first examine a straightforward calling mechanism
in this environment, one in which the calling procedure (caller)
assumes the responsibility for preserving its own state. This simplified model
employs the following steps for each call:
NOTE: These steps are NOT the exact implementation used in PA-RISC, but are
given as a general basis for the discussion of the actual Procedure Calling
Convention that follows.
1. Save all registers whose contents must be preserved across the
   procedure call. This prevents the callee, which will also use and
   modify registers, from affecting the caller's state. On return, those
   register values are restored.
2. Evaluate parameters in order and push them onto the stack. This
   makes them available to the callee, which, by convention, knows how
   to access them.
3. Push a frame marker, which is a fixed-size area containing several
   pieces of information. It includes the static link, which provides
   the information the callee needs in order to address the local
   variables and parameters of the caller, and the return address of
   the caller.
4. Branch to the entry point of the callee.
To return from a call in this model, two further steps are necessary:
1. The callee extracts the return address from the frame marker and
   branches to it.
2. The caller then removes the parameters from the stack and restores
   all saved registers before the program flow continues.
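The call and return steps of this simplified model can be sketched as a small
simulation. This is only an illustration under stated assumptions, not PA-RISC
code: Machine, call, add_callee, and FRAME_MARKER_WORDS are hypothetical names,
the frame marker is assumed to hold just the static link and the return
address, and the caller is assumed to discard the frame marker together with
the parameters.

```python
# Toy model of the simplified call sequence. Machine, call, add_callee and
# FRAME_MARKER_WORDS are hypothetical names, not part of any PA-RISC tool.

FRAME_MARKER_WORDS = 2  # assumed contents: static link, return address


class Machine:
    def __init__(self, num_regs=4):
        self.regs = [0] * num_regs
        self.stack = []
        self.stores = 0  # memory writes
        self.loads = 0   # memory reads

    def push(self, value):
        self.stack.append(value)
        self.stores += 1

    def load(self, offset_from_top):
        self.loads += 1
        return self.stack[-1 - offset_from_top]

    def discard(self, count):
        # Moving the stack pointer touches no memory, so nothing is counted.
        del self.stack[len(self.stack) - count:]


def call(m, live_regs, args, callee, return_address, static_link=0):
    # Save every register whose contents must survive the call.
    for r in live_regs:
        m.push(m.regs[r])
    # Evaluate the parameters in order and push them.
    for a in args:
        m.push(a)
    # Push the frame marker: static link, then return address.
    m.push(static_link)
    m.push(return_address)
    # Branch to the callee; it hands back the address it returns to.
    resume_at, value = callee(m)
    # Caller side of the return: drop marker and parameters, restore registers.
    m.discard(FRAME_MARKER_WORDS + len(args))
    for r in reversed(live_regs):
        m.regs[r] = m.load(0)
        m.discard(1)
    return resume_at, value


def add_callee(m):
    # The two parameters sit just below the two-word frame marker.
    a = m.load(FRAME_MARKER_WORDS + 1)
    b = m.load(FRAME_MARKER_WORDS)
    m.regs[0] = m.regs[1] = -1      # the callee may clobber registers freely
    return_address = m.load(0)      # extract the return address and "branch"
    return return_address, a + b


m = Machine()
m.regs[0], m.regs[1] = 10, 20
resume_at, value = call(m, [0, 1], [3, 4], add_callee, return_address=0x100)
print(value, m.regs[:2], m.stores, m.loads)
```

In this run a call with two parameters and two live registers costs six stores
and five loads, which illustrates the memory traffic that makes the simple
model expensive.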
This model correctly implements the basic steps needed to execute a procedure
call, but is relatively expensive. The caller is forced to assume all
responsibility for preserving its state, which is a conservative and safe
approach, but causes an excessive number of register saves to occur. To
optimize the program's execution, the compiler makes extensive use of registers
to hold local variables and temporary values; these registers must all be saved
at a procedure call and restored at the return. A high overhead is also
incurred by the loading and storing of parameters and linkage information. The
procedure call convention implemented in PA-RISC focuses on the need to reduce
this expense by maximizing register usage and minimizing direct memory
references.
PA-RISC compilers attempt to alleviate this problem by introducing a procedure
call mechanism that divides the register sets into partitions.
The registers are partitioned into caller-saves (the caller is
responsible for saving and restoring them), callee-saves (the
callee must save them at entry and restore them at exit), and
linkage registers. In the general-purpose register set, sixteen
of the registers make up the callee-saves partition and thirteen are available
for use as caller-saves registers.
Thus the responsibility for saving registers is divided between the caller and
the callee, and some registers are also available for linkage. The
floating-point registers and space registers are partitioned in a similar
manner.
The register allocator avoids unnecessary register saves by using caller-saves
registers for values that need not be preserved across a call, while values
that must be preserved are placed into registers from the callee-saves
partition. At procedure entry, only those callee-saves registers used in the
procedure are saved; this minimizes the number and frequency of register loads
and stores during the course of a call. If more registers are needed from a
particular partition than are available, registers can be borrowed from the
other partition. The penalty for using these additional registers is that they
must be saved and restored, but this overhead is incurred only in the special
circumstance where excess registers are needed, which happens relatively
infrequently.
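The allocation policy described above can be illustrated with a short sketch.
It is a simplification, not the algorithm used by any PA-RISC compiler: the
function name assign_partitions and its inputs are hypothetical, each value is
reduced to a single flag saying whether it must survive a call, and the
partition sizes are the general-register counts given in the text.

```python
# Hypothetical sketch of partition selection; not an actual compiler algorithm.

CALLEE_SAVES = 16  # general registers in the callee-saves partition (from the text)
CALLER_SAVES = 13  # general registers usable as caller-saves (from the text)


def assign_partitions(live_across_call):
    """live_across_call holds one bool per value that needs a register.

    Returns (assignment, saved). assignment[i] is 'callee-saves',
    'caller-saves', or None when no register is left. saved counts the
    registers that need a save/restore pair.
    """
    free = {"callee-saves": CALLEE_SAVES, "caller-saves": CALLER_SAVES}
    assignment = []
    saved = 0
    for must_survive in live_across_call:
        preferred = "callee-saves" if must_survive else "caller-saves"
        other = "caller-saves" if must_survive else "callee-saves"
        if free[preferred] > 0:
            choice = preferred
        elif free[other] > 0:
            choice = other          # borrow from the other partition
        else:
            choice = None           # out of registers in this toy model
        if choice is not None:
            free[choice] -= 1
            # Only a caller-saves register holding a value that need not
            # survive a call escapes the save/restore cost.
            if not (choice == "caller-saves" and not must_survive):
                saved += 1
        assignment.append(choice)
    return assignment, saved


# Three values live across a call, two temporaries.
typical, typical_saved = assign_partitions([True, True, True, False, False])

# Fourteen temporaries: one more than the caller-saves partition holds.
crowded, crowded_saved = assign_partitions([False] * 14)
```

In the first case only the three callee-saves registers are saved. In the
second, the fourteenth temporary borrows a callee-saves register and incurs
the single extra save that the text describes as the penalty for borrowing.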
In the simple model outlined above, all parameters are passed by being placed
on the stack, which is expensive because direct memory references are needed in
order to push each parameter. The PA-RISC procedure calling convention
lessens this problem in two ways. First, the compilers allocate a permanent
parameter area (in memory) large enough to hold the parameters for all calls
performed by the procedure. Second, they minimize memory references when
storing parameters by using a combination of registers and memory to pass
them. Four registers from the
caller-saves partition are used to pass user parameters, each holding a single
32-bit value or half of a 64-bit value. Since procedures frequently have few
parameters, the four registers are usually enough to accommodate them all. This
removes the necessity of storing parameter values in the parameter area before
the call. If more than four 32-bit parameters are passed, the additional ones
are stored in the preallocated parameter area. If a parameter is larger than
64 bits, its address is passed and the callee copies it to a temporary area.
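The placement rules in this paragraph can be summarized in a brief sketch. It
rests on several assumptions: place_parameters and the slot labels reg0..reg3
and mem0, mem1, ... are hypothetical names, parameters are described only by
their size in bits, and alignment or pairing rules for 64-bit values are not
modeled because the text above does not cover them.

```python
# Hypothetical sketch of parameter placement; alignment rules are not modeled.

ARG_REGISTERS = 4   # parameter registers, each holding 32 bits (from the text)
WORD_BITS = 32


def place_parameters(sizes_in_bits):
    """Return one (kind, slots) tuple per parameter.

    kind is 'value' when the parameter itself is passed and 'address' when
    it is larger than 64 bits and only its address is passed. slots lists
    where each 32-bit word goes: 'regN' or 'memN' (parameter area word N).
    """
    placements = []
    next_word = 0
    for size in sizes_in_bits:
        if size > 64:
            kind, words = "address", 1     # pass the address in one word
        elif size > WORD_BITS:
            kind, words = "value", 2       # two halves of a 64-bit value
        else:
            kind, words = "value", 1
        slots = []
        for _ in range(words):
            if next_word < ARG_REGISTERS:
                slots.append(f"reg{next_word}")
            else:
                slots.append(f"mem{next_word - ARG_REGISTERS}")
            next_word += 1
        placements.append((kind, slots))
    return placements


layout = place_parameters([32, 64, 32, 128, 32])
for placement in layout:
    print(placement)
```

Here the first three parameters fill all four registers, so the address of the
128-bit parameter and the final 32-bit parameter fall into the preallocated
parameter area.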
Additional savings on memory access are gained when the callee is a leaf
procedure (one that does not make any other calls). In this situation, the
register allocator uses the caller-saves registers to hold variable values,
thus eliminating the need to save callee-saves registers that it might have
used in a non-leaf procedure. Furthermore, since a leaf procedure will not make
subsequent procedure calls, there is no need to allocate a stack frame for it,
because the return address and other values can remain in registers during the
entire life of the call. (Actually, there are rare exceptions to this; a stack
frame may be necessary for a leaf procedure if more local space is needed than
is available in registers.)
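The leaf-procedure rule, including its exception, reduces to a simple decision
that can be sketched as follows. The function needs_stack_frame and its
parameters are hypothetical, and the number of registers actually available to
a leaf procedure is left as an input because the text does not state it.

```python
# Hypothetical sketch of the frame decision for leaf and non-leaf procedures.

def needs_stack_frame(makes_calls, values_needed, registers_available):
    """Return True if the procedure needs a stack frame under the rule above."""
    if makes_calls:
        # Assumed: a non-leaf procedure always allocates a frame.
        return True
    # A leaf needs a frame only when its values do not fit in registers.
    return values_needed > registers_available


small_leaf = needs_stack_frame(False, values_needed=5, registers_available=13)
large_leaf = needs_stack_frame(False, values_needed=20, registers_available=13)
non_leaf = needs_stack_frame(True, values_needed=0, registers_available=13)
```

The value 13 used in the example is only the caller-saves count quoted
earlier; a real leaf procedure would have fewer registers free once parameters
and linkage are accounted for.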