ADSP-21369
Preliminary Technical Data
GENERAL DESCRIPTION
The ADSP-21369 SHARC processor is a member of the SIMD SHARC
family of DSPs that feature Analog Devices' Super Harvard
Architecture. The ADSP-21369 is source code compatible with the
ADSP-2126x and ADSP-2116x DSPs, as well as with first generation
ADSP-2106x SHARC processors in SISD (Single-Instruction,
Single-Data) mode. The ADSP-21369 is a 32-bit/40-bit floating
point processor optimized for high performance automotive audio
applications, with a large on-chip SRAM and mask-programmable
ROM, multiple internal buses to eliminate I/O bottlenecks, and an
innovative Digital Audio Interface (DAI).
As shown in the functional block diagram on Page 1, the
ADSP-21369 uses two computational units to deliver a significant
performance increase over the previous SHARC processors on a
range of DSP algorithms. Fabricated in a state-of-the-art, high
speed, CMOS process, the ADSP-21369 processor achieves an
instruction cycle time of 2.5 ns at 400 MHz. With its SIMD
computational hardware, the ADSP-21369 can perform 2.4 GFLOPS
running at 400 MHz.
The block diagram of the ADSP-21369 on Page 1 also illustrates
the following architectural features:
• On-Chip mask-programmable ROM (6M bit)
• JTAG test access port
• DMA controller
• Eight full duplex serial ports
• Digital audio interface that includes four precision clock
generators (PCG), an input data port (IDP), an S/PDIF
receiver/transmitter, eight channels of asynchronous sample
rate converters, eight serial ports, eight serial interfaces, a
16-bit parallel input port (PDAP), and a flexible signal routing
unit (DAI SRU).
• Digital peripheral interface that includes three timers, an
I2C interface, two UARTs, two serial peripheral interfaces
(SPI), and a flexible signal routing unit (DPI SRU).
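The cycle time and peak throughput quoted above (2.5 ns and 2.4
GFLOPS at 400 MHz) are related by simple arithmetic. The C sketch
below works through it. The function names are hypothetical and
chosen for illustration. The per-element split in the last helper
is an assumption: the text states only the totals, so dividing the
per-cycle figure evenly between the two processing elements is one
plausible reading, not a statement from this data sheet.

```c
#include <assert.h>

/* Figures quoted in the text: 400 MHz core clock, 2.4 GFLOPS peak. */
#define CORE_CLOCK_HZ 400e6
#define PEAK_FLOPS    2.4e9

/* Instruction cycle time in nanoseconds for a given core clock. */
static double cycle_time_ns(double clock_hz)
{
    assert(clock_hz > 0.0);
    return 1e9 / clock_hz;
}

/* Floating-point operations per cycle implied by a peak FLOPS rating. */
static double flops_per_cycle(double peak_flops, double clock_hz)
{
    assert(clock_hz > 0.0);
    return peak_flops / clock_hz;
}

/* ASSUMPTION (not stated in the text): the per-cycle total is split
   evenly across the processing elements. */
static double flops_per_cycle_per_element(double peak_flops,
                                          double clock_hz,
                                          int elements)
{
    assert(elements > 0);
    return flops_per_cycle(peak_flops, clock_hz) / elements;
}
```

With the quoted figures, cycle_time_ns(CORE_CLOCK_HZ) evaluates to
2.5 and flops_per_cycle(PEAK_FLOPS, CORE_CLOCK_HZ) to 6, which
under the even-split assumption is three floating-point operations
per processing element per cycle.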
ADSP-21369 FAMILY CORE ARCHITECTURE
The ADSP-21369 is code compatible at the assembly level with
the ADSP-2126x, ADSP-21160 and ADSP-21161, and with the
first generation ADSP-2106x SHARC processors. The ADSP-
21369 shares architectural features with the ADSP-2126x and
ADSP-2116x SIMD SHARC processors, as detailed in the fol-
lowing sections.
Table 1 shows performance benchmarks for the ADSP-21369.
Table 1. ADSP-21369 Benchmarks (at 400 MHz)
Benchmark Algorithm                                Speed (at 400 MHz)
1024 Point Complex FFT (Radix 4, with reversal)    23.25 µs
FIR Filter (per tap)¹                              1.25 ns
IIR Filter (per biquad)¹                           5.0 ns
Matrix Multiply (pipelined)
  [3x3] × [3x1]                                    11.25 ns
  [4x4] × [4x1]                                    20.0 ns
Divide (y/x)                                       8.75 ns
Inverse Square Root                                13.5 ns
¹ Assumes two files in multichannel SIMD mode
The ADSP-21369 continues SHARC’s industry leading standards of
integration for DSPs, combining a high performance 32-bit DSP
core with integrated, on-chip system features.
SIMD Computational Engine
The ADSP-21369 contains two computational processing elements
that operate as a Single-Instruction Multiple-Data (SIMD) engine.
The processing elements are referred to as PEX and PEY and each
contains an ALU, multiplier, shifter and register file. PEX is
always active, and PEY may be enabled by setting the PEYEN mode
bit in the MODE1 register. When this mode is enabled, the same
instruction is executed in both processing elements, but each
processing element operates on different data. This architecture
is efficient at executing math intensive DSP algorithms.
Entering SIMD mode also has an effect on the way data is
transferred between memory and the processing elements. When in
SIMD mode, twice the data bandwidth is required to sustain
computational operation in the processing elements. Because of
this requirement, entering SIMD mode also doubles the bandwidth
between memory and the processing elements. When using the DAGs
to transfer data in SIMD mode, two data values are transferred
with each access of memory or the register file.
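The lockstep behavior described above can be illustrated with a
small host-side C model. This is a sketch under stated
assumptions, not SHARC code: the pe_t type, the function names,
and the peyen flag (standing in for the PEYEN mode bit) are all
hypothetical, and the model assumes one multiply-accumulate per
processing element per 2.5 ns cycle with no loop or pipeline
overhead. Under those assumptions it filters two channels with the
same coefficients, once with PEY enabled and once with PEX alone,
and counts cycles.

```c
#include <assert.h>
#include <stddef.h>

#define CYCLE_NS 2.5 /* instruction cycle time at 400 MHz, from the text */

/* Hypothetical model of one processing element: a single accumulator. */
typedef struct { float acc; } pe_t;

/* One multiply-accumulate "instruction". PEX always executes it; PEY
   executes the same operation on its own operand only when peyen is
   nonzero. Both operands arrive in the same access, mirroring the two
   values per access described for SIMD mode. Returns cycles consumed. */
static unsigned simd_mac(pe_t *pex, pe_t *pey, int peyen,
                         float coeff, float x_sample, float y_sample)
{
    pex->acc += coeff * x_sample;
    if (peyen)
        pey->acc += coeff * y_sample;
    return 1; /* assumed: one cycle regardless of mode */
}

/* Computes one FIR output sample for each of two channels and returns
   the modeled cycle count. simd != 0 uses both elements in lockstep;
   simd == 0 runs the channels back to back on PEX only. */
static unsigned fir_two_channels(const float *coeffs,
                                 const float *ch0, const float *ch1,
                                 size_t taps, int simd, float out[2])
{
    pe_t pex = {0.0f}, pey = {0.0f};
    unsigned cycles = 0;
    size_t i;

    assert(coeffs && ch0 && ch1 && out);
    if (simd) {
        for (i = 0; i < taps; i++)
            cycles += simd_mac(&pex, &pey, 1, coeffs[i], ch0[i], ch1[i]);
        out[0] = pex.acc;
        out[1] = pey.acc;
    } else {
        pe_t second = {0.0f}; /* PEX reused for the second channel */
        for (i = 0; i < taps; i++)
            cycles += simd_mac(&pex, &pey, 0, coeffs[i], ch0[i], 0.0f);
        for (i = 0; i < taps; i++)
            cycles += simd_mac(&second, &pey, 0, coeffs[i], ch1[i], 0.0f);
        out[0] = pex.acc;
        out[1] = second.acc;
    }
    return cycles;
}

/* Modeled time per tap per channel, in nanoseconds. */
static double ns_per_tap_per_channel(unsigned cycles, size_t taps)
{
    assert(taps > 0);
    return cycles * CYCLE_NS / (2.0 * (double)taps);
}
```

In this model the SIMD path needs one cycle per tap for both
channels together, so the time per tap per channel is half a cycle,
or 1.25 ns. That matches the per-tap FIR figure in Table 1, though
the data sheet does not state how that figure was derived.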
The block diagram of the ADSP-21369 on Page 1 illustrates the
following architectural features:
• Two processing elements, each of which comprises an
ALU, Multiplier, Shifter and Data Register File
• Data Address Generators (DAG1, DAG2)
• Program sequencer with instruction cache
• PM and DM buses capable of supporting four 32-bit data
transfers between memory and the core at every core
processor cycle
• Three Programmable Interval Timers with PWM Generation,
PWM Capture/Pulse width Measurement, and External Event
Counter Capabilities
• On-Chip SRAM (2M bit)
Independent, Parallel Computation Units
Within each processing element is a set of computational units.
The computational units consist of an arithmetic/logic unit
(ALU), multiplier, and shifter. These units perform all
operations in a single cycle. The three units within each
processing element are arranged in parallel, maximizing
computational throughput. Single multifunction instructions
execute parallel ALU and multiplier operations. In SIMD mode, the
parallel ALU and multiplier operations occur in both processing ele-
Rev. PrB | Page 4 of 52 | June 2005