In a complex instruction set computer, it is relatively easy at system design
time to make frequent additions to the instruction set based almost solely on
the desire to achieve a specific performance enhancement, and the presence of
microcode easily facilitates such developments. In a reduced instruction set
computer, however, this microcode has been eliminated because it has been shown
to be potentially detrimental to overall system performance (not only is
instruction decode complicated, but the basic cycle time of the machine may be
lengthened).
So while the functionality of these complex microcoded instructions (e.g.,
string moves and decimal arithmetic) is still necessary, a RISC-based system is
confronted with a classic space-time dilemma: if the compilers are given sole
responsibility for generating the necessary sequences, the resulting in-line
code expansion becomes a problem; but if procedure calls to library routines
are used for each operation, the overhead incurred (parameter passing, stack
usage, etc.) is unacceptable.
In an effort to retain the advantages associated with each approach, the
alternative concept of "millicode" was developed. Millicode is PA-RISC's
simulation of complex microcoded instructions, accomplished through the
creation of assembly-level subroutines that perform the desired tasks. While
these subroutines perform comparably to their microcoded counterparts, they are
architecturally similar to any other standard library routine, differing only
in the manner in which they are accessed. As a result, millicode is portable
across the entire family of PA-RISC machines, rather than being unique to a
single machine (as is usually the case with traditional microcode).
There are many advantages to implementing complex functionality in millicode,
most notably cost reduction and increased flexibility. Because millicode
routines reside in system space like other library routines, the addition of
millicode has no hardware cost, and consequently no direct influence on system
cost. It is relatively easy and inexpensive to upgrade or modify millicode, and
it can be continually improved in the future. Eventually, it may be possible
for individual users to create their own millicode routines to fit specific
needs.
Because it is costly to architect many variations of an instruction, most fixed
instruction sets contain complex instructions that are overly general. Examples
of this are the MVB (move bytes) and MOVE (move words)
instructions on the HP3000, which are capable of moving any number of items
from any arbitrary source location to any target location. Although the desired
functionality is achieved with such generalized complex instructions, the code
that is produced often lacks the optimization that could have been achieved if
all information available at compile time had been utilized. On microcoded
machines, this information (concerning operands, alignment, etc.) is lost after
code generation and must be recreated by the microcode during each execution;
but on PA-RISC machines, the code generators can apply such information to
select a specialized millicode routine that will produce a faster run-time
execution of the operation than would be possible using a generalized routine.
For example, the move routines can execute much faster if they can assume a
specified alignment and therefore omit any run-time alignment checking.
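As a minimal sketch of this trade-off, the C fragment below contrasts a general
move routine with one specialized for word-aligned operands. Both function
names are hypothetical and the code is illustrative only: actual millicode is
written in PA-RISC assembly and is reached through the millicode calling
convention rather than an ordinary C call. The assert calls merely document the
precondition and are compiled out when NDEBUG is defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* General routine (hypothetical name): makes no assumption about alignment
 * or length, so it must copy one byte at a time. */
void move_bytes_general(void *dst, const void *src, size_t n_bytes)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n_bytes--)
        *d++ = *s++;
}

/* Specialized routine (hypothetical name): the caller guarantees that both
 * operands are word-aligned and that the length is a whole number of words,
 * so the loop moves a full word per iteration with no run-time checks. */
void move_words_aligned(uint32_t *dst, const uint32_t *src, size_t n_words)
{
    /* Debug-only statement of the precondition; removed under NDEBUG. */
    assert(((uintptr_t)dst % sizeof(uint32_t)) == 0);
    assert(((uintptr_t)src % sizeof(uint32_t)) == 0);

    while (n_words--)
        *dst++ = *src++;
}
```

A code generator that knows at compile time that both operands are word-aligned
and that the length is a whole number of words could emit a call to the second
routine; otherwise it would fall back to the first.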
The size and number of millicode routines are not constrained by the
architecture or hardware considerations. This is because millicode resides in
code libraries that can be managed in the same way as other run-time libraries.
A consequence of not being bound by restrictive space considerations is that
compilers can be developed with many more specialized functions in millicode
than would be possible in a microcoded architecture, and can thus generate
better-optimized code for specific source constructs.
Millicode routines are accessed through a mechanism similar to a procedure
call, but with several significant differences. In general terms, the millicode
calling convention stresses simplicity and speed, utilizing registers for all
temporary argument storage and eliminating the need for the creation of excess
stack frames. Thus, most of the overhead associated with a standard procedure
call is avoided, thereby reducing the cost of execution.
(However, there are exceptions to these conventions, which are discussed in
more detail throughout this chapter.)
The guidelines for the inclusion of a routine in the millicode library are not
completely determined, but the general considerations are frequency of usage,
processor expense (number of cycles necessary for execution), and size. Most
routines perform common, specific tasks (such as integer multiply or divide),
and require very little or no memory access.
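For illustration, an unsigned integer divide of the kind described can be
sketched in C using the shift-and-subtract method. This is a sketch under
stated assumptions, not the actual PA-RISC millicode: the name divide_unsigned
and the result structure are hypothetical, and rejecting a zero divisor with
assert is a choice made here for the example. The loop operates only on local
values that a compiler can keep in registers, which mirrors the "very little or
no memory access" property noted above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: quotient and remainder of an unsigned divide. */
typedef struct {
    uint32_t quotient;
    uint32_t remainder;
} divu_result;

/* Shift-and-subtract (restoring) division, one quotient bit per iteration.
 * Assumption for this sketch: a zero divisor is rejected with assert. */
divu_result divide_unsigned(uint32_t dividend, uint32_t divisor)
{
    assert(divisor != 0);

    uint32_t quotient = 0;
    uint64_t remainder = 0;  /* 64 bits so the shifted value cannot overflow */

    for (int bit = 31; bit >= 0; bit--) {
        /* Bring down the next dividend bit, most significant first. */
        remainder = (remainder << 1) | ((dividend >> bit) & 1u);
        quotient <<= 1;
        if (remainder >= divisor) {
            remainder -= divisor;
            quotient |= 1u;
        }
    }

    divu_result result = { quotient, (uint32_t)remainder };
    return result;
}
```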
5.3 PIC Requirements for Compilers and Assembly Code