<p>Convolutional neural networks
(CNNs), inspired by the biological visual cortex, are a powerful class
of artificial neural networks that extract hierarchical features from raw
data, greatly reducing the network's parametric complexity and improving
prediction accuracy. They are of significant interest for machine learning
tasks such as computer vision, speech recognition, playing board games and
medical diagnosis [1-7]. Optical neural networks offer the promise of
dramatically accelerating computing speed to overcome the inherent bandwidth
bottleneck of electronics. Here, we demonstrate a universal optical vector
convolutional accelerator operating beyond 10 TOPS (tera-operations per
second), generating convolutions of 250,000-pixel images with 8-bit
resolution for 10 kernels simultaneously, enough for facial image recognition.
We then use the same hardware to sequentially form a deep optical CNN with ten
output neurons, successfully recognizing all ten digits in 900-pixel
handwritten digit images with 88% accuracy. Our results are based on
simultaneously interleaving temporal, wavelength and spatial dimensions enabled
by an integrated microcomb source. This approach is scalable and trainable to
much more complex networks for demanding applications such as unmanned vehicles
and real-time video recognition.</p>