Proceedings IEEE International Conference on Application-Specific Systems, Architectures, and Processors
DOI: 10.1109/asap.2002.1030716

Implementation of a 32-bit RISC processor for the data-intensive architecture processing-in-memory chip

Cited by 20 publications (15 citation statements)
References 11 publications
“…The Field Programmable Compute Array (FPCA), the internal computation engine of HPPS, was developed to perform efficient stream processing. By modifying basic computational structures in the FPCA to support Wide Word [8] functionality, the arithmetic and memory clusters of FPCA can "morph" between stream and thread modes. MONARCH has also adapted HPPS high bandwidth I/O and fault tolerance features to facilitate sensor input as well as to enable tiling of multiple chips.…”
Section: Monarch Overview (mentioning)
confidence: 99%
“…The control signals required for mapping of WideWord ALU functionality onto an FPCA core-tile for thread level parallelism are provided by the RISC processor on the node. The MONARCH node thread processor is largely derived from the DIVA [6] PIM processor model [7] and thus supports single-issue, in-order execution, with 32-bit instructions and 32-bit addresses. In contrast to the dedicated WideWord Unit implemented in DIVA [8], the arithmetic cluster is a morphable unit that can be configured to operate independently as a streaming engine or under control of the threaded execution unit as a wide threaded processor.…”
Section: Monarch Overview (mentioning)
confidence: 99%
“…Similarly, logic for converting to/from the internal number format and rounding logic are shared between both datapaths. DIVA execution control is a simple in-order single-issue instruction pipeline [4][8], therefore combining common datapaths does not suffer any performance penalty. The pipeline registers for the ALU and the Mul/Div blocks are controlled by separate enable signals so that only one of the datapaths is active for each instruction.…”
Section: B. MONARCH FPU (Add-Multiply Configuration) (mentioning)
confidence: 99%
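
The enable-gating idea in the statement above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the MONARCH or DIVA design: the class `PipelineRegister`, the function `issue`, and the opcode names are hypothetical, and the model captures only the property the statement describes, namely that each instruction clocks exactly one of the two pipeline registers.

```python
class PipelineRegister:
    """Hypothetical register that latches its input only when enabled."""

    def __init__(self):
        self.value = None

    def clock(self, next_value, enable):
        # With enable low the register holds its previous contents,
        # so the datapath behind it sees no new activity.
        if enable:
            self.value = next_value


def issue(op, a, b, alu_reg, muldiv_reg):
    """Route one instruction to exactly one datapath (assumed opcode names)."""
    is_muldiv = op in ("mul", "div")
    alu_reg.clock((op, a, b), enable=not is_muldiv)
    muldiv_reg.clock((op, a, b), enable=is_muldiv)


alu_reg, muldiv_reg = PipelineRegister(), PipelineRegister()
issue("add", 1, 2, alu_reg, muldiv_reg)   # only the ALU register latches
issue("mul", 3, 4, alu_reg, muldiv_reg)   # only the Mul/Div register latches
```

Because the pipeline is single-issue and in-order, at most one instruction enters the shared stage per cycle, which is why a single pair of mutually exclusive enables is enough in this model.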
“…At an architectural level, the MONARCH chip contains functional units that may serve as the central elements in a dataflow architecture for highly efficient stream computing or through morphing they may become the basis of vector extension units controlled by embedded threaded processors, such as a simple RISC design. In the latter mode, the configuration of the computational elements strongly resembles the WideWord operation of DIVA [4]. To achieve high-performance stream processing capability in MONARCH, FPU throughput should be maximized, even at the expense of area.…”
Section: Introduction (mentioning)
confidence: 99%
“…The DIVA WideWord Processor speeds up multimedia applications by use of data parallelism. It treats a 256-bit WideWord operand as a packed array of objects of 8, 16, or 32 bits in size [7]. DIVA PIMs support standard memory accesses and have been recently fabricated with SRAM.…”
Section: Introduction (mentioning)
confidence: 99%
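
The packed-operand view described in the statement above can be sketched in software. This is a minimal illustration, not the DIVA instruction set: the function names are hypothetical, lane 0 is assumed to be the least significant, and wraparound on overflow is an assumption (the statement does not say how overflow is handled). It models a 256-bit WideWord as a Python integer split into lanes of 8, 16, or 32 bits.

```python
WIDEWORD_BITS = 256


def unpack_lanes(word, lane_bits):
    """Split a 256-bit integer into lanes, least significant lane first."""
    if lane_bits not in (8, 16, 32):
        raise ValueError("lane width must be 8, 16, or 32 bits")
    mask = (1 << lane_bits) - 1
    count = WIDEWORD_BITS // lane_bits
    return [(word >> (i * lane_bits)) & mask for i in range(count)]


def pack_lanes(lanes, lane_bits):
    """Reassemble lanes into one integer, truncating each to lane_bits."""
    mask = (1 << lane_bits) - 1
    word = 0
    for i, lane in enumerate(lanes):
        word |= (lane & mask) << (i * lane_bits)
    return word


def packed_add(a, b, lane_bits):
    """Lane-wise add; each lane wraps modulo 2**lane_bits (assumed)."""
    pairs = zip(unpack_lanes(a, lane_bits), unpack_lanes(b, lane_bits))
    return pack_lanes([x + y for x, y in pairs], lane_bits)


# The same two bit patterns give different results at different lane widths.
a = pack_lanes([0xFF, 0x01] + [0] * 30, 8)
b = pack_lanes([0x01, 0x02] + [0] * 30, 8)
as_bytes = unpack_lanes(packed_add(a, b, 8), 8)    # no carry between byte lanes
as_words = unpack_lanes(packed_add(a, b, 32), 32)  # carry stays inside one lane
```

A 256-bit operand holds 32, 16, or 8 elements at the three widths, so one wide operation stands in for that many scalar operations.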