Machine learning and artificial intelligence research has experienced rapid growth over the last two decades. [1] One of the core engines that has driven this growth is deep learning, [2] which permits efficient and rapid training of deep artificial neural network models. The ability to train deep neural networks has revolutionized artificial intelligence, and electronics has been the undisputed platform of choice for implementing artificial neural networks. Specialized processing hardware such as graphics processing units (GPUs) is widely used today for deep learning. However, these electronic processors are power-hungry and bulky, making researchers wary of the environmental impact of machine learning. [3,4] Therefore, there is strong interest in low-power and fast computing platforms for machine learning applications. Optical computing has been identified as a promising alternative for such purposes because of the large bandwidth, high speed, and massive parallelism of optics. [5] Diffractive deep neural networks (D²NNs), also known as diffractive optical networks or diffractive networks, form a passive all-optical computing platform that exploits the diffraction of light waves to perform computation. [6] These diffractive networks are composed of several spatially engineered surfaces (layers) separated by free space. The diffractive features/elements of a layer, also termed "diffractive neurons", locally modulate the amplitude and/or the phase of the light incident upon the layer. Successive modulation by, and diffraction through, the layers gives rise to an all-optical transformation between the input and the output fields-of-view at the speed of light propagation, without any external power. The amplitude and/or phase values of the diffractive neurons corresponding to a desired optical transformation or computational task are trained/learned on a digital computer using deep learning. 
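The forward model described above, in which layers alternately modulate the field and let it diffract through free space, can be sketched numerically as follows. This is a minimal illustration under stated assumptions rather than the implementation used in the cited works: it assumes phase-only layers, scalar diffraction modeled with the angular spectrum method, uniform spacing between planes, and random (untrained) phase values. The function names and all parameter values are hypothetical.

```python
import numpy as np

def angular_spectrum_propagate(field, wavelength, pixel_pitch, distance):
    """Propagate a square complex field through free space over `distance`."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=pixel_pitch)  # spatial frequencies (cycles per meter)
    fxx, fyy = np.meshgrid(fx, fx, indexing="ij")
    # Squared longitudinal frequency; non-positive values are evanescent waves.
    arg = 1.0 / wavelength**2 - fxx**2 - fyy**2
    kz = 2 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    # Free-space transfer function; evanescent components are discarded.
    transfer = np.where(arg > 0, np.exp(1j * kz * distance), 0.0)
    return np.fft.ifft2(np.fft.fft2(field) * transfer)

def diffractive_forward(input_field, phase_layers, wavelength, pixel_pitch, spacing):
    """Return the complex output field of a phase-only diffractive network."""
    field = input_field
    for phase in phase_layers:
        field = angular_spectrum_propagate(field, wavelength, pixel_pitch, spacing)
        field = field * np.exp(1j * phase)  # modulation by the diffractive neurons
    # Final free-space hop from the last layer to the output plane.
    return angular_spectrum_propagate(field, wavelength, pixel_pitch, spacing)

# Hypothetical parameters, chosen only for illustration.
n = 64                 # neurons per side of each layer
wavelength = 0.75e-3   # meters
pixel_pitch = 0.4e-3   # meters
spacing = 30e-3        # meters between successive planes

rng = np.random.default_rng(0)
# Untrained placeholder phases; in practice these are learned via deep learning.
phase_layers = [rng.uniform(0, 2 * np.pi, size=(n, n)) for _ in range(3)]

# Simple input: a uniformly illuminated square aperture.
input_field = np.zeros((n, n), dtype=complex)
input_field[24:40, 24:40] = 1.0

output_field = diffractive_forward(input_field, phase_layers, wavelength, pixel_pitch, spacing)
output_intensity = np.abs(output_field) ** 2  # what a detector would record
```

In this sketch every step (the Fourier-domain transfer function and the element-wise phase multiplication) is linear in the complex field, so the network as a whole implements a linear transformation of the input field; the only nonlinearity is the intensity measurement at the output plane.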
Once the training is complete, the layers can be fabricated and assembled to form a "physical" network that performs the desired computation passively and at the speed of light propagation. Diffractive networks can achieve universal linear transformations, [7][8][9] and various applications of diffractive processors have been demonstrated, such as object classification, pulse processing, imaging through random diffusers, hologram reconstruction, quantitative phase imaging, class-specific imaging, super-resolution image display, all-optical logic