Bearings are critical to the operation of rotating machinery across many engineering fields, so early and accurate detection of bearing faults is essential to prevent mechanical failures and maintain machine performance. This study proposes a fault classification framework that combines multi-domain feature extraction, the least absolute shrinkage and selection operator (LASSO) method, long short-term memory (LSTM), and the self-attention mechanism. Fifteen time-domain, five frequency-domain, and four chaotic-domain features (24 in total) are extracted from the raw data. To validate the model's accuracy and stability, open-source bearing datasets from the Hanoi University of Science and Technology (HUST), a newly published dataset, and from Case Western Reserve University (CWRU) were used. On the HUST test set, which covers all five bearing categories, the model achieves an F1-score of 99.903% using nine features selected from the 24. On the CWRU dataset it yields comparable results, reaching an F1-score of 98.742% using eight features selected from the 24. These results indicate that the proposed framework can be deployed effectively and highlight its potential for intelligent manufacturing. The objective is to achieve accurate predictions with a reduced number of parameters and to emphasize the importance of incorporating chaotic features for datasets characterized by chaotic processes.
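As a rough illustration of the time-domain part of the feature extraction step, the following minimal sketch computes a handful of statistics that are commonly used in vibration-based condition monitoring. The abstract does not list the fifteen time-domain features, so the particular statistics chosen here, the function name `time_domain_features`, and the dictionary keys are assumptions made for illustration only, not the feature set used in the study.

```python
import math
from statistics import fmean


def time_domain_features(signal):
    """Compute a few common time-domain statistics for one vibration segment.

    The feature choice is an assumption for illustration; the study's own
    fifteen time-domain features are not enumerated in the abstract.
    """
    mean = fmean(signal)
    centered = [x - mean for x in signal]

    # Population variance and standard deviation of the segment.
    variance = fmean(c * c for c in centered)
    std = math.sqrt(variance)

    # Root mean square of the raw (uncentered) signal.
    rms = math.sqrt(fmean(x * x for x in signal))

    # Largest absolute amplitude in the segment.
    peak = max(abs(x) for x in signal)

    # Non-excess kurtosis: fourth central moment over squared variance.
    kurtosis = fmean(c ** 4 for c in centered) / variance ** 2

    return {
        "mean": mean,
        "std": std,
        "rms": rms,
        "peak": peak,
        "crest_factor": peak / rms,
        "kurtosis": kurtosis,
    }


# Hypothetical usage on a synthetic signal: ten full periods of a unit sine.
sine = [math.sin(2 * math.pi * k / 64) for k in range(640)]
features = time_domain_features(sine)
```

For a unit sine sampled over whole periods, as in the usage lines above, this gives an RMS of about 0.707, a crest factor of about 1.414, and a kurtosis of 1.5. In a pipeline like the one described, a vector of such features per segment would then be passed to the LASSO selection stage.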