“…Additionally, multiple concurrent memory accesses by the accelerator become more difficult to implement, since there is no clear way to interface with data memories or to allow concurrent accesses. Alternatively, accelerators may be loosely coupled as peripherals [34,57,72,83], using interfaces such as buses, dedicated links, or shared memory schemes [44,56,58,71,94], as shown in Figure 6(b). Although these interfaces introduce larger overheads, they avoid intrusive modifications to the host processor.…”
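The loosely coupled arrangement described in the excerpt can be illustrated with a minimal software model. This is only a sketch under stated assumptions: the class names (`SharedMemory`, `LooselyCoupledAccelerator`, `Host`), the register layout (`src`, `length`, `result`, `done`), and the summation workload are all hypothetical and do not come from the cited work. The model only shows where the interface overhead arises: every interaction between host and accelerator is an explicit bus transaction, while the host processor itself needs no internal changes.

```python
from dataclasses import dataclass, field


@dataclass
class SharedMemory:
    """Word-addressable memory visible to both host and accelerator (hypothetical)."""
    words: list = field(default_factory=lambda: [0] * 64)


@dataclass
class LooselyCoupledAccelerator:
    """Hypothetical peripheral that sums `length` words starting at address `src`."""
    mem: SharedMemory
    src: int = 0        # register: start address of the operands
    length: int = 0     # register: number of words to process
    result: int = 0     # register: computed output
    done: bool = False  # status register polled by the host

    def start(self):
        # The accelerator fetches its operands from shared memory on its own,
        # so the host does not have to forward each data word.
        self.done = False
        self.result = sum(self.mem.words[self.src:self.src + self.length])
        self.done = True


class Host:
    """Unmodified host that reaches the accelerator only through bus accesses."""

    def __init__(self, accel):
        self.accel = accel
        self.bus_transactions = 0  # counts every register access over the bus

    def write_reg(self, name, value):
        self.bus_transactions += 1
        setattr(self.accel, name, value)

    def read_reg(self, name):
        self.bus_transactions += 1
        return getattr(self.accel, name)

    def offload_sum(self, src, length):
        self.write_reg("src", src)
        self.write_reg("length", length)
        self.bus_transactions += 1          # control write that starts the accelerator
        self.accel.start()
        while not self.read_reg("done"):    # poll the status register
            pass
        return self.read_reg("result")


mem = SharedMemory()
mem.words[4:8] = [1, 2, 3, 4]
host = Host(LooselyCoupledAccelerator(mem))
total = host.offload_sum(4, 4)
print(total, host.bus_transactions)
```

In this model the fixed per-invocation cost is the handful of register accesses needed to configure, start, poll, and read back the accelerator. That is the kind of overhead the excerpt attributes to loosely coupled interfaces, accepted in exchange for leaving the host processor untouched.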