This paper presents the block data flow architecture (BDFA), an alternative to the systolic array and the wavefront array for locally recursive algorithms such as those used in digital signal processing, image processing, and matrix operations. The BDFA retains the advantageous features of systolic and wavefront arrays, such as regularity, modularity, and local interconnection. However, it uses a data partitioning strategy in addition to an algorithm partitioning strategy to reduce data dependencies, interprocessor communication requirements, and hardware complexity compared to systolic and wavefront arrays. The BDFA also uses block data processing and the block data flow paradigm to minimize the system management overhead due to communication protocols. Thus, the processors in a BDFA operate efficiently, yielding a system with high throughput and high efficiency.