This study proposes an inverse design framework for metasurfaces based on a neural network capable of generating infinite and continuous latent representations that fully span the property space of electromagnetic metasurfaces (EMMS). The inverse design of EMMS inherently poses a one-to-many mapping problem, since a given set of electromagnetic properties can be produced by many different scatterer shapes. Previous studies have addressed this issue by introducing machine learning-based generative models and regularization strategies. However, most of these approaches require highly complex operating configurations or external modules for preprocessing datasets. In contrast, this study aims to construct a more streamlined, end-to-end solver by building a network that processes multimodal datasets and incorporating a classification scheme into that network. The idea was validated by comparing the results predicted by the proposed approach against outcomes simulated using PSSFSS.