Abstract-We propose a technique to localize computation in Instruction Set Extensions (ISEs) that are clocked at very high speed relative to the processor. To save power, data transfers to and from the Custom Instruction Units (CIUs) are synchronized by an optical signal detected through a Single-Photon Avalanche Diode (SPAD) with a timing uncertainty as low as 50 ps. The CIUs comprise a free-standing local oscillator serving a computing area of a few tens of square micrometers, which greatly reduces power dissipation because a high-frequency clock need not be distributed over long distances. This approach is based on the globally asynchronous locally synchronous (GALS) concept, with the granularity of the local domains reduced to a minimum, enabling extremely high local clock frequencies and low power while minimizing substrate noise injection and intra-chip interference. It avoids expensive synchronization techniques such as FIFOs, delays, or flip-flop based synchronizers by creating fixed synchronization points in time at which data can be exchanged. The paradigm is demonstrated on a chip designed and fabricated in a standard 90 nm CMOS technology. A full characterization demonstrates the suitability of the approach.

Index Terms-Clock distribution, embedded systems, globally asynchronous locally synchronous (GALS), instruction set extensions (ISEs), optical clocking, optically clocked ISEs, single-photon avalanche diodes (SPADs).