controllability of light due to the limited flexibility afforded by periodic metastructures of simple unit cells. To overcome these deficiencies, metasurfaces comprising multiple meta-atoms, such as gradient and multilayered metasurfaces, have been proposed and developed. [7][8][9] Relying on the collective effects of multiple meta-atoms, these metasurfaces present intriguing properties such as anomalous deflection, [7,10] arbitrary phase control, asymmetric polarization conversion, [8,11] and wave-front shaping, [12][13][14] which enable a wide range of applications in imaging, optical signal processing, emission control, and more. In the following discussion, we refer to unit cells composed of various meta-atoms as metamolecules, by analogy with the hierarchical relationship between atoms and molecules in nature. In our definition of a metamolecule, we assume that no two adjacent meta-atoms are strongly coupled, in which case the overall properties of the metamolecule can be analytically predicted from the properties of its constituent meta-atoms. This assumption is valid for most metasurfaces that consist of discrete, spatially variant building blocks.

Despite the extraordinary properties of metasurfaces made up of metamolecules, designing multiple meta-atoms that collectively function as a device is a time-consuming task that requires labor-intensive trial-and-error simulations. The difficulty of the inverse design of such metamolecules arises from the intricate mechanisms of multistructured systems, the vast number of possible combinations of distinct meta-atoms, and the expensive 3D full-wave simulations required. Traditionally, a practical solution to such a design problem follows three steps: 1) specifying a class of geometry with a few parameters as candidate meta-atoms, 2) carrying out parametric sweeps on these parameters, and 3) enumerating possible combinations of meta-atoms to meet the design objective.
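As an illustration only, the three-step procedure can be sketched in a few lines of Python. This is a toy under stated assumptions, not the workflow used in this work: `toy_meta_atom_response` is a hypothetical Lorentzian stand-in for a 3D full-wave simulation, the single geometry parameter (a length in nm), the resonance scaling, and the linewidth are all invented for the example, and the design objective (a constant phase step between neighboring meta-atoms) is chosen only because it is simple to score. The sketch also relies on the weak-coupling assumption stated above, so each meta-atom's response is taken from the library without regard to its neighbors.

```python
import cmath
import itertools
import math


def toy_meta_atom_response(length_nm, wavelength_nm=800.0):
    """Hypothetical stand-in for a full-wave simulation.

    Returns a complex transmission coefficient from a single Lorentzian
    resonance. The scaling (resonance = 4 * length) and the 100 nm
    linewidth are assumptions made purely for illustration.
    """
    resonance_nm = 4.0 * length_nm
    detuning = (wavelength_nm - resonance_nm) / 100.0
    return 1.0 / (1.0 + 1j * detuning)


# Step 1: one class of geometry described by a single parameter.
candidate_lengths_nm = range(100, 301, 20)

# Step 2: parametric sweep, one "simulation" per candidate meta-atom.
library = {L: toy_meta_atom_response(L) for L in candidate_lengths_nm}


def phase_gradient_error(lengths, target_step):
    """Sum of squared deviations from a constant phase step (radians).

    Under the weak-coupling assumption, each meta-atom contributes the
    phase stored in the library, independent of its neighbors.
    """
    phases = [cmath.phase(library[L]) for L in lengths]
    error = 0.0
    for a, b in zip(phases, phases[1:]):
        diff = b - a - target_step
        # Wrap the difference into [-pi, pi) before squaring.
        wrapped = (diff + math.pi) % (2.0 * math.pi) - math.pi
        error += wrapped ** 2
    return error


def brute_force_design(n_atoms, target_step):
    """Step 3: enumerate every ordered combination of n_atoms meta-atoms."""
    return min(
        itertools.product(library, repeat=n_atoms),
        key=lambda combo: phase_gradient_error(combo, target_step),
    )


if __name__ == "__main__":
    best = brute_force_design(n_atoms=3, target_step=0.5)
    print("best lengths (nm):", best)
    print("combinations searched:", len(library) ** 3)
```

Even in this toy setting the search space grows as the number of library entries raised to the number of meta-atoms per metamolecule, and every library entry would cost one full-wave simulation in practice.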
However, restricting the geometry to a predefined class largely limits the variety of meta-atom shapes, so this strategy usually does not reach an optimal solution, even after extensive and expensive simulations.

Alongside the evolution of nanophotonics, various methods for expediting the design of photonic structures have been developed. Gradient-based adjoint methods, such as topology optimization, are a class of widely applied approaches for

Molecules composed of atoms exhibit properties not inherent to their constituent atoms. Similarly, metamolecules consisting of multiple meta-atoms possess emergent features that the meta-atoms themselves do not possess. Metasurfaces composed of metamolecules with spatially variant building blocks, such as gradient metasurfaces, are drawing substantial attention due to their unconventional controllability of the amplitude, phase, and frequency of light. However, the intricate mechanisms and the many degrees of freedom of multielement systems impede an effective strategy for the design and optimization of metamolecules. Here, a hybrid artificial-i...