An Introduction to the IA-64 Architecture

 

Jerry Huck

Hewlett-Packard Company
11000 Wolfe Road
Cupertino, CA 95014

 

Extended Abstract

 

The IA-64 processor architecture is the result of joint work by Intel and HP to specify a new 64-bit architecture. Designed to expose, enhance, and exploit instruction-level parallelism, it is the first architecture to specifically target parallel implementations.  IA-64 also exploits the knowledge of parallelism that the compiler discovers through analysis.  This contrasts with traditional RISC processors, which must rediscover that parallelism dynamically in hardware.

 

This talk will introduce the three key concepts that distinguish IA-64 from RISC designs. The first is the ability to communicate the parallelism available in the instruction stream directly to the hardware.  The instruction set has bits in each instruction group that specify which instructions in the stream are independent. This simplifies the instruction fetch and dispatch units of the processor.  What the compiler knows to be parallel (independent) is communicated explicitly to the processor, which avoids unnecessary hardware.
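As a rough illustration of the idea, the following Python sketch plays the role of the compiler: it splits a straight-line instruction sequence into groups of mutually independent instructions, so that a simple processor could issue each group without checking dependences itself. The `Instr` type, the `mark_groups` function, and the register names are hypothetical, and the model is simplified: it considers only read-after-write and write-after-write register dependences and says nothing about the actual IA-64 encoding.

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    """A toy instruction: one destination register and some source registers."""
    dest: str
    srcs: Tuple[str, ...]


def mark_groups(instrs):
    """Split instructions into groups whose members are mutually independent.

    A new group starts whenever an instruction reads or writes a register
    that was already written earlier in the current group.
    """
    groups = [[]]
    written = set()  # registers written so far in the current group
    for ins in instrs:
        depends = ins.dest in written or any(s in written for s in ins.srcs)
        if depends:
            groups.append([])  # the boundary the compiler would mark
            written = set()
        groups[-1].append(ins)
        written.add(ins.dest)
    return groups


program = [
    Instr("r1", ("r2", "r3")),
    Instr("r4", ("r5", "r6")),  # independent of the first instruction
    Instr("r7", ("r1", "r4")),  # reads r1 and r4, so it starts a new group
    Instr("r8", ("r9",)),       # independent of r7, joins its group
]
group_sizes = [len(g) for g in mark_groups(program)]
```

In this toy program the first two instructions form one group and the last two form another. The point of the sketch is that the grouping is computed once, ahead of time, rather than by hardware on every execution.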

 

Second, the architecture defines a large number of processor registers to ensure that the machine is not starved for resources.  Its 128 integer registers and 128 floating-point registers form the largest general register set in general-purpose computer architectures today.  Insufficient resources limit the parallelism that a compiler can expose.  For example, keeping multiple memory requests and multiple arithmetic operations in flight simultaneously requires more registers.
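A back-of-the-envelope calculation shows why overlapping operations consume registers. The sketch below assumes that each operation in flight holds one destination register for the duration of its latency; the function name, issue rates, and latencies are hypothetical values chosen for illustration and do not describe any particular IA-64 implementation.

```python
def registers_in_flight(ops_per_cycle, latency_cycles):
    """Destination registers tied up by operations that have been issued
    but have not yet produced their results."""
    return ops_per_cycle * latency_cycles


# Hypothetical machine: 2 loads per cycle with a 10-cycle memory latency,
# plus 2 floating-point operations per cycle with a 4-cycle latency.
load_regs = registers_in_flight(ops_per_cycle=2, latency_cycles=10)
fp_regs = registers_in_flight(ops_per_cycle=2, latency_cycles=4)
total_regs = load_regs + fp_regs
```

Under these assumed numbers the pending results alone tie up 28 registers, before counting any values that the program keeps live for later use.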

 

The third is a set of features that enhance the instruction-level parallelism (ILP) in the instruction set. Predication is one example of an instruction set feature that removes branches and allows more instructions to be executed simultaneously.  Each instruction specifies one of 64 ‘predicate’ registers that controls the execution of the instruction. Once the instructions are predicated, it is no longer necessary to branch around blocks of code.  It is something like computing all the possible answers and then keeping only the one you need.
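The effect of predication can be sketched in Python by comparing a branching routine with an "if-converted" one. This is only a model of the idea: `guarded`, `abs_diff_branching`, and `abs_diff_predicated` are hypothetical names, and the boolean variables stand in for predicate registers.

```python
def guarded(pred, new_value, old_value):
    """Model one predicated instruction: its result takes effect only
    when the guarding predicate is true."""
    return new_value if pred else old_value


def abs_diff_branching(a, b):
    """Conventional form: a branch selects one of two code blocks."""
    if a > b:
        return a - b
    return b - a


def abs_diff_predicated(a, b):
    """If-converted form: no branch around either block."""
    p_true = a > b        # a compare sets a predicate...
    p_false = not p_true  # ...and its complement
    result = 0
    # Both subtractions are issued; only the one whose predicate
    # is true updates the result.
    result = guarded(p_true, a - b, result)
    result = guarded(p_false, b - a, result)
    return result


sample = abs_diff_predicated(3, 7)
```

Because the two guarded operations do not depend on each other, a machine with enough functional units could execute them in the same cycle, which is the extra parallelism that predication exposes.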

 

Data and control speculation, register rotation, register stacking, the register save engine, the floating-point architecture and the multimedia architecture are all features that enhance the ability of the instruction set to expose parallelism.  The compiler is an integral part of this process: it unlocks the performance potential of these features and maximizes the efficiency of the hardware.

©Copyright 1999 Interex. All rights reserved.