Tactile rendering is a promising technology that needs to be integrated into virtual reality, augmented reality, mixed reality, and metaverse environments. A key technology for realizing tactile rendering is the reproduction, or display, of tactile sensation. This study developed models that use a conditional generative adversarial network to generate appropriate input signals for an ultrasonic tactile display. Sensory evaluation scores and vibration data acquired with a tactile sensor served as training data for the models, and the input information for the models was created under different cluster analysis conditions. Each model generated input signals for the ultrasonic tactile display, and the accuracy of the models was evaluated through sensory evaluation experiments. The results showed that model accuracy improved with moderate cluster classification, and that tactile sensations created with the models developed in this study were reproduced more faithfully than tactile sensations created without them.