Advancements in CMOS technology enable the integration of a huge number of resources on the same system-on-chip. Managing the resulting growth in complexity, including fault tolerance issues in deep submicron technologies, is a hard challenge for hardware designers. Self-organization may represent a viable path toward the development of massively parallel architectures in current and future technologies. This approach is increasingly studied in multiprocessor architectures, where, however, a further shift in mind-set regarding the programming paradigm is required. In this article, self-organization and self-adaptiveness are exploited for the design of a coprocessing unit for array computations, supporting floating-point arithmetic. Building on the experience of previous explorations, an architecture embodying some principles of swarm intelligence to pursue adaptability, scalability, and fault tolerance is proposed. The architecture realizes a loosely structured collection of hardware agents implementing fixed behavioral rules aimed at the best exploitation of the available resources in any context, without any hardware reconfiguration. Comparisons with off-the-shelf very long instruction word (VLIW) digital signal processors (DSPs) on specific tasks reveal similar performance, so the improved robustness does not come at the cost of performance. The multitasking capabilities, together with the intrinsic scalability, make this approach valuable for future extensions as well, especially in the field of neuronal network simulators.