metasurface research has spawned new types of flat optical components including beam deflectors, [4] high-quality-factor diffractors, [5] wave plates, [6] lenses, [7,8] and holograms. [9,10] Compared with their bulky counterparts that often rely on the phase accumulation of light propagating through conventional media, metasurface-based components can efficiently manipulate light by a deep subwavelength interface, making them compact, easy for integration, and relatively low loss, especially when dielectric materials are used. [11] In addition, the large flexibility in metasurface design offers unique ability to control light for multipurposed tasks that are usually inaccessible based on natural materials, yielding prescribed optical responses to different combinations of light characteristics such as the frequency, phase, angle of incidence, and polarization. [12][13][14][15][16][17][18][19][20] The exotic optical properties of metasurfaces originate from the engineered light-structure interactions. The conventional design approach heavily relies on template-based parameter sweep via numerical simulations to find eligible meta-atom structures, which could be facilitated by physical principles such as plasmonic resonance, [21] Mie theory, [22] or Pancharatnam-Berry (PB) phase for circular polarized light. [23] In the case of multifunctional metasurfaces, the design routines often utilize an empirical decoupling of the meta-atom response using multimode resonance in dielectric pillars, [24] As 2D metamaterials, metasurfaces provide an unprecedented means to manipulate light with the ability to multiplex different functionalities in a single planar device. Currently, most pursuits of multifunctional metasurfaces resort to empirically accommodating more functionalities at the cost of increasing structural complexity, with little effort to investigate the intrinsic restrictions of given meta-atoms and thus the ultimate limits in the design. In this work, it is proposed to embed machine-learning models in both gradientbased and nongradient optimization loops for the automatic implementation of multifunctional metasurfaces. Fundamentally different from the traditional two-step approach that separates phase retrieval and meta-atom structural design, the proposed end-to-end framework facilitates full exploitation of the prescribed design space and pushes the multifunctional design capacity to its physical limit. With a single-layer structure that can be readily fabricated, metasurface focusing lenses and holograms are experimentally demonstrated in the near-infrared region. They show up to eight controllable responses subjected to different combinations of working frequencies and linear polarization states, which are unachievable by the conventional physics-guided approaches. These results manifest the superior capability of the data-driven scheme for photonic design, and will accelerate the development of complex devices and systems for optical display, communication, and computing.