mostly relies on physics-inspired methods that draw on human knowledge: physical insight from simplified analytical models, experience transferred from previous practice, and intuition gained through scientific reasoning. For example, many meta-atoms inherit traditional antenna designs, with geometries such as the rectangle, [4] cross, [5] bowtie, [6] V-shape, [7] and H-shape, [8] whose first-order response is approximated by an electric dipole resonance with the relevant scaling effect. [9] Other designs guided by physical intuition include ring-like structures that exhibit strong magnetic resonances induced by the incident magnetic field, [10][11][12] dielectric building blocks that support both electric and magnetic resonances and thereby give better control over the phase of the scattered light, [13][14][15] and spectral line-shape tailoring by introducing coupling among different resonant modes. [16][17][18] Despite the exciting results obtained with these physics-inspired designs, the methodology is essentially a trial-and-error process, usually involving numerical methods such as the finite-difference time-domain (FDTD) method or the finite element method (FEM) to solve Maxwell's equations iteratively. This low efficiency limits exploration of the design space, so the optimal solution is easily missed. Inverse design approaches start from the opposite end and optimize an objective function that describes the desired performance. [19,20] Common approaches to inverse problems include genetic algorithms, [21] level set methods, [22] and topology optimization, [23] which, however, are still stochastic search algorithms that are time-consuming and whose performance deteriorates rapidly as the design space grows.
Unlike numerical calculations, data-driven methods based on machine learning (ML) approach the optimization problem from a statistical perspective, so that a solution for a given target can be approximately generalized from numerous design examples. With the rapid accumulation of data and the resulting boom in deep learning (DL), the state of the art in many research domains, such as speech recognition, [24] computer vision, [25,26] natural language processing, [27] and decision making, [28] has been pushed far beyond conventional methods. Deep neural networks mimic biological signal processing, allowing computational models to learn multiple
Research on metamaterials has achieved enormous success in manipulating light in a prescribed manner using delicately designed subwavelength structures, so-called meta-atoms. Even though modern numerical methods allow accurate calculation of the optical response of complex structures, the inverse design of metamaterials, which aims to retrieve the optimal structure for given requirements, is still a challenging task owing to the nonintuitive and nonunique relationship between physical structures and optical responses. To better unveil this implicit relationship and thus facilitate metamaterial designs, it is proposed ...