Abstract. Embedded devices have hard performance targets and severe power and area constraints that depart significantly from our design intuitions derived from general-purpose microprocessor design. This paper describes our initial experiences in designing Synchroscalar, a tile-based embedded architecture targeted for multi-rate signal processing applications. We present a preliminary design of the Synchroscalar architecture and some design space exploration in the context of important signal processing kernels. In particular, we find that synchronous design and substantial global interconnect are desirable in the low-frequency, low-power domain. This global interconnect enables parallelization and reduces processor idle time, which are critical to energy-efficient implementations of high-bandwidth signal processing. Furthermore, statically-scheduled communication and SIMD computation keep control overheads low and energy efficiency high.