Dedicated adder calculates effective addresses (EAs)
Supports store gathering
Performs alignment, normalization, and precision conversion for floating-point data
Executes cache control and TLB instructions
Performs alignment, zero padding, and sign extension for integer data
Supports hits under misses (multiple outstanding misses)
Supports both big- and little-endian modes, including misaligned little-endian
accesses
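The integer-data items above (zero padding, sign extension, and the two byte orderings) can be illustrated with a small software sketch. It is only a model of the visible result, not of the load/store hardware; the function name `load_integer` and the 32-bit register width are assumptions made for illustration.

```python
def load_integer(data, signed, little_endian=False, reg_bits=32):
    """Interpret raw memory bytes as an integer and widen it to reg_bits.

    data          -- bytes read from memory (1, 2, or 4 bytes in this sketch)
    signed        -- True for sign extension, False for zero padding
    little_endian -- byte ordering used to interpret the data
    reg_bits      -- assumed destination register width (hypothetical default)
    """
    byteorder = "little" if little_endian else "big"
    # from_bytes performs the sign extension when signed=True.
    value = int.from_bytes(data, byteorder, signed=signed)
    # Masking gives the two's-complement bit pattern held in the register.
    return value & ((1 << reg_bits) - 1)


# Halfword 0xFFFE loaded with and without sign extension.
sign_extended = load_integer(b"\xff\xfe", signed=True)    # 0xFFFFFFFE
zero_padded = load_integer(b"\xff\xfe", signed=False)     # 0x0000FFFE
```

The same bytes in the opposite order, read with `little_endian=True`, give the same register values.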
• Three issue queues (FIQ, VIQ, and GIQ) can accept as many as one, two, and three
instructions, respectively, in a cycle. Instruction dispatch requires the following:
– Instructions can be dispatched only from the three lowest IQ entries – IQ0, IQ1, and
IQ2
– A maximum of three instructions can be dispatched to the issue queues per clock
cycle
– Space must be available in the CQ for an instruction to dispatch (this includes
instructions that are assigned a space in the CQ but not in an issue queue)
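The dispatch requirements above can be read as a per-cycle check. The following is a minimal sketch under stated assumptions, not a description of the actual dispatch logic: the function `dispatch_cycle` and its data layout are hypothetical, issue-queue occupancy is ignored, and in-order dispatch (a blocked instruction stalls those behind it) is assumed rather than taken from the text.

```python
# Per-cycle acceptance limits of the issue queues, taken from the text above.
ACCEPT_PER_CYCLE = {"FIQ": 1, "VIQ": 2, "GIQ": 3}
# Only IQ0, IQ1, and IQ2 are eligible, and at most three dispatch per cycle.
MAX_DISPATCH_PER_CYCLE = 3


def dispatch_cycle(iq, cq_free):
    """Return the list of instructions dispatched in one cycle.

    iq      -- target queue names, oldest first (iq[0] is IQ0); None marks an
               instruction that needs a CQ entry but no issue-queue entry
    cq_free -- number of free completion-queue entries
    """
    accepted = {name: 0 for name in ACCEPT_PER_CYCLE}
    dispatched = []
    for target in iq[:MAX_DISPATCH_PER_CYCLE]:
        if cq_free == 0:
            break  # every dispatched instruction needs a CQ entry
        if target is not None:
            if accepted[target] == ACCEPT_PER_CYCLE[target]:
                break  # target issue queue cannot accept more this cycle
            accepted[target] += 1
        cq_free -= 1
        dispatched.append(target)
    return dispatched


# Two floating-point instructions at the head: only one fits in the FIQ.
print(dispatch_cycle(["FIQ", "FIQ", "GIQ"], cq_free=16))
```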
• Rename buffers
– 16 GPR rename buffers
– 16 FPR rename buffers
– 16 VR rename buffers
• Dispatch unit
– Decode/dispatch stage fully decodes each instruction
• Completion unit
– The completion unit retires an instruction from the 16-entry completion queue (CQ)
when all instructions ahead of it have been completed, the instruction has finished
execution, and no exceptions are pending
– Guarantees sequential programming model (precise exception model)
– Monitors all dispatched instructions and retires them in order
– Tracks unresolved branches and flushes instructions after a mispredicted branch
– Retires as many as three instructions per clock cycle
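The retirement rules above (in order, only finished instructions, no pending exception, at most three per clock cycle) can be sketched as a simple queue model. This is an illustration only; the `retire_cycle` function and the dictionary fields `finished` and `exception` are hypothetical names, not part of the device documentation.

```python
from collections import deque

CQ_ENTRIES = 16            # completion queue depth, from the text above
MAX_RETIRE_PER_CYCLE = 3   # retirement bandwidth, from the text above


def retire_cycle(cq):
    """Retire instructions from the head of the completion queue for one cycle.

    cq -- deque of dicts, oldest first, each with boolean 'finished' and
          'exception' fields (hypothetical layout for this sketch)
    """
    if len(cq) > CQ_ENTRIES:
        raise ValueError("completion queue holds at most 16 entries")
    retired = []
    while cq and len(retired) < MAX_RETIRE_PER_CYCLE:
        head = cq[0]
        # Retirement is strictly in order: an unfinished or excepting
        # instruction at the head blocks everything behind it.
        if not head["finished"] or head["exception"]:
            break
        retired.append(cq.popleft())
    return retired
```

Because only the head of the queue is ever examined, a finished instruction behind an unfinished one stays in the queue, which is what preserves the sequential programming model.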
• Separate on-chip L1 instruction and data caches (Harvard architecture)
– 32 Kbyte, eight-way set-associative instruction and data caches
– Pseudo least-recently-used (PLRU) replacement algorithm
– 32-byte (eight-word) L1 cache block
– Physically indexed/physical tags
– Cache write-back or write-through operation programmable on a per-page or
per-block basis
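The cache parameters above fix the geometry of each L1 cache: 32 Kbytes divided by (8 ways × 32 bytes per block) gives 128 sets, so 5 address bits select the byte within a block and 7 bits select the set. The sketch below works this out; the function `split_address` is a hypothetical helper, and the use of the low-order address bits for offset and index is the conventional arrangement, assumed here rather than stated in the text.

```python
CACHE_BYTES = 32 * 1024   # 32 Kbyte cache
WAYS = 8                  # eight-way set-associative
BLOCK_BYTES = 32          # 32-byte (eight-word) block

SETS = CACHE_BYTES // (WAYS * BLOCK_BYTES)    # 128 sets
OFFSET_BITS = BLOCK_BYTES.bit_length() - 1    # 5 bits of byte offset
INDEX_BITS = SETS.bit_length() - 1            # 7 bits of set index


def split_address(paddr):
    """Split a physical address into (tag, set index, byte offset)."""
    offset = paddr & (BLOCK_BYTES - 1)
    index = (paddr >> OFFSET_BITS) & (SETS - 1)
    tag = paddr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


tag, index, offset = split_address(0x12345678)
print(hex(tag), index, offset)   # prints: 0x12345 51 24
```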