An Introduction to the IA-64 Architecture
Hewlett-Packard Company
11000 Wolfe Road
Cupertino, CA 95014
The IA-64 processor architecture is the result of joint work by Intel and HP
to specify a new 64-bit architecture. Designed to expose, enhance, and exploit
instruction-level parallelism, it is the first architecture to specifically
target parallel hardware implementations. IA-64 also exploits the knowledge of
parallelism that the compiler discovers through analysis. This is in contrast
to traditional RISC processors, which must rediscover the parallelism
dynamically in hardware.
This talk will introduce the three key concepts that distinguish IA-64 from
RISC designs. The first is the ability to communicate the parallelism available
in the instruction stream directly to the hardware. The instruction set has
bits in each instruction group that specify which instructions in the stream
are independent. This simplifies the instruction fetch and dispatch units of
the processor. What the compiler knows to be parallel (independent) can be
communicated explicitly to the processor, avoiding the hardware that would
otherwise be needed to rediscover it.
Second, the architecture defines a very large number of processor registers to
ensure that the machine is not starved for resources. Its 128 integer registers
and 128 floating-point registers form the largest register set among
general-purpose computer architectures today. Insufficient resources limit the
parallelism that a compiler can expose. For example, keeping multiple memory
requests and multiple arithmetic operations in flight simultaneously requires
more registers.
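The link between register count and exposed parallelism can be sketched in C.
This is only an illustration, not IA-64 code: the function names sum_single and
sum_four are hypothetical, and the choice of four accumulators is an assumption
made for the example. The second version keeps four independent partial sums
live at once, so four loads and four additions per iteration have no
dependences on one another, at the cost of needing more registers.

```c
#include <stddef.h>

/* One accumulator: every addition depends on the previous one,
   so the additions form a single serial chain. */
long sum_single(const long *x, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += x[i];
    return s;
}

/* Four accumulators (an illustrative choice): the four additions in
   each iteration are independent of one another, so a compiler may
   schedule them, and the four loads feeding them, in parallel.
   Each extra accumulator and in-flight load needs its own register. */
long sum_four(const long *x, size_t n)
{
    long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += x[i];
        s1 += x[i + 1];
        s2 += x[i + 2];
        s3 += x[i + 3];
    }
    /* Handle the remaining 0 to 3 elements. */
    for (; i < n; i++)
        s0 += x[i];

    return s0 + s1 + s2 + s3;
}
```

Both functions return the same result for the same input; only the amount of
independent work visible to the scheduler, and the number of live values,
differs.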
Third is a set of features that enhance the instruction-level parallelism (ILP)
in the instruction set. Predication is one example of an instruction set
feature that removes branches and allows more instructions to execute
simultaneously. Each instruction specifies one of 64 ‘predicate’ registers that
controls whether the instruction takes effect. Once instructions are
predicated, it is no longer necessary to branch around blocks of code. It is
something like computing all the possible answers and then keeping only the one
you need.
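The idea of computing all the possible answers and keeping one can be sketched
in C. This is a source-level analogy only, not IA-64 assembly: the function
names are hypothetical, and the variable p merely stands in for a predicate
register. What a particular compiler generates for either form is not implied.

```c
/* Branching form: control flow skips one of the two arms, so the
   hardware must resolve or predict the branch before proceeding. */
int abs_diff_branch(int a, int b)
{
    if (a > b)
        return a - b;
    else
        return b - a;
}

/* If-converted form: both arms are computed with no branch between
   them, and the predicate decides which result is kept. */
int abs_diff_predicated(int a, int b)
{
    int p = (a > b);      /* stands in for a predicate register   */
    int taken = a - b;    /* would be guarded by p                */
    int not_taken = b - a; /* would be guarded by the complement  */
    return p ? taken : not_taken;
}
```

The two functions compute the same value for inputs whose difference fits in
an int. In the second form the two subtractions do not depend on the
comparison, so all three operations could be issued together.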
Data and control speculation, register rotation, register stacking, the
register stack engine, the floating-point architecture, and the multimedia
architecture are all features that enhance the ability of the instruction set
to expose parallelism. The compiler is an integral part of this process: it
unlocks the performance potential of these features and maximizes the
efficiency of the hardware.