Base isolation can effectively control structural response, and rubber bearings are key components of base-isolated structures. To ensure that rubber bearings function properly, their state must be detected accurately. Previous studies proposed a smart rubber bearing to monitor the axial pressure, shear deformation, and rupture damage of rubber bearings. However, owing to the small dataset and the limited capability of the prediction model, two problems remain: (1) the axial pressure, shear deformation, and rupture damage cannot be detected simultaneously, and (2) the detection accuracy is not sufficiently high for practical applications. This study addresses these problems in three ways. First, three data augmentation methods are investigated and compared to mitigate the limited data. Second, a multitask framework with a self-attention mechanism is established to detect axial pressure, shear deformation, and rupture damage simultaneously. Finally, a two-stage optimization method based on Bayesian optimization and grid search is introduced to tune the network structure and the optimization settings. On the test set, the root-mean-square errors for predicting the axial pressure and shear deformation are 0.19 MPa and 2.04%, respectively, corresponding to error reductions of up to 56.8% and 73.0% relative to conventional deep neural network models. In addition, the accuracy of rupture damage detection reached 100%.
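
The reported metrics can be illustrated with a minimal Python sketch of how a root-mean-square error and a relative error reduction against a baseline model are typically computed. The function names `rmse` and `relative_reduction` and all numerical values below are hypothetical and chosen only for illustration; they are not data or code from this study.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(baseline_error, new_error):
    """Percentage by which new_error is lower than baseline_error."""
    return 100.0 * (baseline_error - new_error) / baseline_error


# Hypothetical axial pressures in MPa (illustration only, not study data).
pressure_true = [2.0, 4.0, 6.0, 8.0]
pressure_baseline = [2.5, 3.5, 6.5, 7.5]  # assumed baseline-model predictions
pressure_proposed = [2.2, 3.8, 6.2, 7.8]  # assumed improved-model predictions

baseline_rmse = rmse(pressure_true, pressure_baseline)
proposed_rmse = rmse(pressure_true, pressure_proposed)
reduction = relative_reduction(baseline_rmse, proposed_rmse)

print(f"baseline RMSE: {baseline_rmse:.2f} MPa")
print(f"proposed RMSE: {proposed_rmse:.2f} MPa")
print(f"error reduction: {reduction:.1f}%")
```

Under this definition, an "up to" figure would be the largest reduction obtained when the same calculation is repeated against each baseline model.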