Gearboxes are a crucial component of the transmission systems in many devices. Under prolonged operation and high loads, their condition inevitably degrades over time. Dividing the health state into stages intelligently and assessing it in a timely and effective manner are therefore essential for the safe and reliable operation of gearboxes. This paper proposes a health state division and assessment method based on the acoustic signals generated during gearbox operation. First, the fast Fourier transform (FFT) converts the measured sound signal into a spectral signal that is less disturbed by noise. Next, a temporal convolutional autoencoder (TCAE) is constructed to encode and decode the spectral signals at different moments, so that the trained encoder can adaptively extract deep features from the signals. The K-Means clustering method is then applied to the extracted deep features to divide the health state of the gearbox into stages automatically. Finally, a one-dimensional convolutional neural network (1DCNN) model is constructed and trained; the deep features extracted by the TCAE are fed into it to identify the health state stage of each test sample, thereby realizing the health state assessment of the gearbox. Experimental results on a gearbox data set covering three working conditions show that the stages divided by the proposed method closely match the manually calibrated health stages, which supports the rationality of the proposed intelligent method. The accuracy of the proposed health assessment method reaches 95%, 90%, and 90% under the three working conditions, respectively, clearly outperforming the models commonly used at present and achieving effective health state assessment of the gearbox under non-destructive testing conditions.
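The first and third steps of the pipeline (FFT, then K-Means stage division) can be illustrated with a minimal sketch. It is not the authors' implementation: the TCAE and 1DCNN steps are omitted, so the clustering runs directly on magnitude spectra rather than on learned deep features. The synthetic frames (a fixed tone at bin 4 plus a "fault" tone at bin 12 whose amplitude grows in three steps), the frame length, the noise level, and the evenly spaced initial centers (chosen instead of the usual random initialization, on the assumption that frames are ordered in time) are all illustrative assumptions.

```python
import cmath
import math
import random


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def magnitude_spectrum(frame):
    """One-sided magnitude spectrum, scaled by the frame length."""
    n = len(frame)
    return [abs(c) / n for c in fft(frame)[: n // 2]]


def dist2(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=50):
    """Lloyd's algorithm with evenly spaced initial centers (an assumption)."""
    n = len(points)
    centers = [list(points[(2 * j + 1) * n // (2 * k)]) for j in range(k)]
    labels = [0] * n
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        # Update step: each center becomes the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(centers[j])  # keep an empty cluster's center
        if updated == centers:
            break
        centers = updated
    return labels


# Hypothetical data: 30 time-ordered frames, 10 per simulated health stage.
rng = random.Random(0)
N = 64
frames = []
for fault_amp in (0.0, 0.5, 1.0):
    for _ in range(10):
        frames.append([
            math.sin(2 * math.pi * 4 * t / N)
            + fault_amp * math.sin(2 * math.pi * 12 * t / N)
            + rng.gauss(0.0, 0.05)
            for t in range(N)
        ])

spectra = [magnitude_spectrum(f) for f in frames]
labels = kmeans(spectra, k=3)
print(labels)
```

In the method described above, the input to K-Means would be the TCAE encoder output rather than the raw spectrum, and the resulting stage labels would serve as training targets for the 1DCNN classifier.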