A Compact Butterfly-Style Silicon Photonic–Electronic Neural Chip for Hardware-Efficient Deep Learning

Feng, Chenghao; Gu, Jiaqi; Zhu, Hanqing; Ying, Zhoufeng; Zhao, Zheng; Pan, David Z.; Chen, Ray T.

doi:10.1021/acsphotonics.2c01188

Cited by 27 publications

(9 citation statements)

References 64 publications

“…To model the non-idealities such as fabrication errors, programming errors, crosstalk and noise, we employ an AI-assisted, hardware-aware training framework for OSNN training, with details disclosed in our previous work. [7]…”

Section: Design Of Electronic-photonic Chip For Block-circulant OSNN

mentioning

confidence: 99%

A hardware-efficient silicon electronic-photonic chip for optical structured neural networks

Ning, Gu, Feng et al. 2024

Optical Interconnects XXIV

Self Cite

Optical neural networks (ONNs) have gained significant attention as a promising neuromorphic framework due to their high parallelism, ultrahigh inference speeds, and low latency. However, the hardware implementation of ONN architectures has been limited by their high area overhead. These architectures have primarily focused on general matrix multiplication (GEMMs), resulting in unnecessarily large area costs and high control complexity. To address these challenges, we propose a hardware-efficient architecture for optical structured neural networks (OSNNs). Through experimental validation using an FPGA-based photonic-electronic testing platform, our neural chip demonstrates its effectiveness in on-chip convolution operations and image recognition tasks, which exhibits lower active component usage, reduced control complexity, and improved energy efficiency.

“…The current development of artificial neuromorphic devices mainly includes two technical routes. One is based on the traditional mature CMOS technology of SRAM or DRAM build (Asghar et al, 2021), and the prototype device is volatile in terms of information storage; the other is built based on non-volatile Flexible FLASH devices or new memory devices and new materials (Feng et al, 2021; He et al, 2021). Non-volatile neuromorphic devices are memristors with artificial neuromorphic characteristics and unique nonlinear properties that have become new basic information processing units that mimic biological neurons and synapses (Yang et al, 2013; Prezioso et al, 2015).…”

Section: Prospects

mentioning

confidence: 99%

An overview of brain-like computing: Architecture, applications, and future trends

Xiao, Zhu et al. 2022

Front. Neurorobot.

With the development of technology, Moore's law will come to an end, and scientists are trying to find a new way out in brain-like computing. But we still know very little about how the brain works. At the present stage of research, brain-like models are all structured to mimic the brain in order to achieve some of the brain's functions, and then continue to improve the theories and models. This article summarizes the important progress and status of brain-like computing, summarizes the generally accepted and feasible brain-like computing models, introduces, analyzes, and compares the more mature brain-like computing chips, outlines the attempts and challenges of brain-like computing applications at this stage, and looks forward to the future development of brain-like computing. It is hoped that the summarized results will help relevant researchers and practitioners to quickly grasp the research progress in the field of brain-like computing and acquire the application methods and related knowledge in this field.

“…Various integrated photonic tensor core (PTC) designs have been introduced and demonstrated for ultra-fast photonic analog linear operation acceleration. Coherent PTCs that leverage interference and diffraction include MZI arrays [1], butterfly-style meshes [2,3], auto-designed photonic circuits [4], coupler-crossbar array [5], star-coupler-based design [6], and metalens-based diffractive PTCs [7], etc. Besides, to leverage the wavelength-division multiplexing (WDM) technique, there are incoherent multi-wavelength PTCs, e.g., MRR weight bank [8][9][10][11], PCM crossbar arrays [12], micro-comb-based computing engine.…”

mentioning

confidence: 99%

“…A versatile/generic photonic accelerator based on universal optical linear units is capable of realizing general matrix multiplication (GEMM) and thus directly implementing a wide spectrum of pretrained digital DNNs. Many specialized linear units are not applicable to generic tensor computation since they restrict their matrix expressivity to a subspace of specialized matrices for higher hardware efficiency, e.g., butterfly meshes [3] and tensorized MZI arrays [15]. Besides versatility, photonic computing requires real-time, efficient input tensor encoding with low reconfiguration costs.…”

mentioning

confidence: 99%
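
The statement above says butterfly meshes trade matrix expressivity for hardware efficiency. The sketch below illustrates that trade-off with a generic real-valued butterfly factorization. It is a toy model under stated assumptions, not the optical transfer function of the cited chip: the function names, the random coefficient range, and the size `n = 16` are all hypothetical choices made for illustration.

```python
from functools import reduce

import numpy as np


def butterfly_factor(n, stage, rng):
    """Build one sparse butterfly factor with exactly 2 nonzeros per row.

    Row i mixes element i with its partner i XOR 2**stage.
    Coefficients are drawn from [0.5, 1.5) (an arbitrary illustrative
    range) so that none of them is accidentally zero.
    """
    stride = 1 << stage
    factor = np.zeros((n, n))
    for i in range(n):
        partner = i ^ stride
        factor[i, i] = rng.uniform(0.5, 1.5)
        factor[i, partner] = rng.uniform(0.5, 1.5)
    return factor


def butterfly_factors(n, rng):
    """Return the log2(n) sparse factors of an n x n butterfly matrix."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two, at least 2")
    stages = n.bit_length() - 1
    return [butterfly_factor(n, s, rng) for s in range(stages)]


def butterfly_param_count(n):
    """Free coefficients in the factorization: 2 per row, n rows, log2(n) stages."""
    return 2 * n * (n.bit_length() - 1)


n = 16  # hypothetical matrix size
rng = np.random.default_rng(0)
factors = butterfly_factors(n, rng)
W = reduce(np.matmul, factors)  # dense-looking product of sparse factors

print("butterfly coefficients:", butterfly_param_count(n))
print("dense matrix entries:  ", n * n)
```

For `n = 16` the factorization has 128 free coefficients against 256 entries in a general matrix, so the product `W` has no zero entries yet cannot represent an arbitrary 16 x 16 matrix. That is the sense in which such structured units cover only a restricted family of matrices.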

“…Similarly, many subspace linear unit designs can approximate GEMM operations by cascading more programmable devices but require an even more costly optimization-based approach to map the weight matrix [3,6]. Such a property restricts those designs to only support weight-static linear operations, e.g., fully connected (FC) layers and convolutional (CONV) layers, where weights are pretrained and pre-encoded into the device/circuit transmissions. However, advanced AI models, e.g., Transformer [16][17][18][19][20][21] based on attention operations where both matrix multiplication operands are dynamic, full-range, and general tensors, cannot be efficiently mapped to those weight-static PTCs.…”

mentioning

confidence: 99%

TeMPO: Efficient time-multiplexed dynamic photonic tensor core for edge AI with compact slow-light electro-optic modulator

Zhang, Yin, Gangi et al. 2024

Journal of Applied Physics

Self Cite

Electronic–photonic computing systems offer immense potential in energy-efficient artificial intelligence (AI) acceleration tasks due to the superior computing speed and efficiency of optics, especially for real-time, low-energy deep neural network inference tasks on resource-restricted edge platforms. However, current optical neural accelerators based on foundry-available devices and conventional system architecture still encounter a performance gap compared to highly customized electronic counterparts. To bridge the performance gap due to lack of domain specialization, we present a time-multiplexed dynamic photonic tensor accelerator, dubbed TeMPO, with cross-layer device/circuit/architecture customization. At the device level, we present foundry-compatible, customized photonic devices, including a slow-light electro-optic modulator with experimental demonstration, optical splitters, and phase shifters that significantly reduce the footprint and power in input encoding and dot-product calculation. At the circuit level, partial products are hierarchically accumulated via parallel photocurrent aggregation, lightweight capacitive temporal integration, and sequential digital summation, considerably relieving the analog-to-digital conversion bottleneck. We also employ a multi-tile, multi-core architecture to maximize hardware sharing for higher efficiency. Across diverse edge AI workloads, TeMPO delivers digital-comparable task accuracy with superior quantization/noise tolerance. We achieve a 368.6 TOPS peak performance, 22.3 TOPS/W energy efficiency, and 1.2 TOPS/mm² compute density, pushing the Pareto frontier in edge AI hardware. This work signifies the power of cross-layer co-design and domain-specific customization, paving the way for future electronic–photonic accelerators with even greater performance and efficiency.
