Abstract-Personalized healthcare devices enable low-cost, unobtrusive and long-term acquisition of clinically-relevant biosignals. These appliances, termed Wireless Body Sensor Nodes (WBSNs), are fostering a revolution in health monitoring for patients affected by chronic ailments. Nowadays, WBSNs often embed complex digital processing routines, which must be performed within an extremely tight energy budget. Addressing this challenge, in this paper we introduce a novel computing architecture devoted to the ultra-low power analysis of biosignals. Its heterogeneous structure comprises multiple processors interfaced with a shared acceleration resource, implemented as a Coarse-Grained Reconfigurable Array (CGRA). The CGRA mesh effectively supports the execution of the intensive loops that characterize bio-signal analysis applications, while requiring a low reconfiguration overhead. Moreover, both the processors and the reconfigurable fabric feature Single-Instruction / MultipleData (SIMD) execution modes, which increase efficiency when multiple data streams are concurrently processed. The run-time behavior on the system is orchestrated by a light-weight hardware mechanism, which concurrently synchronizes processors for SIMD execution and regulates access to the reconfigurable accelerator. By jointly leveraging run-time reconfiguration and SIMD execution, the illustrated heterogeneous system achieves, when executing complex bio-signal analysis applications, speedups of up to 11.3x on the considered kernels and up to 37.2% overall energy savings, with respect to an ultra-low power multicore platform which does not feature CGRA acceleration.