A method is proposed for fault classification in milling machines that combines image processing with machine learning. First, raw data are collected from real-world industrial settings, covering three fault types (tool, bearing, and gear faults) as well as normal operating conditions. These data are converted into two-dimensional continuous wavelet transform (CWT) images, which provide good time-frequency localization. The images are then augmented by rotation, scaling, and flipping to increase dataset diversity. A contrast enhancement filter is applied to highlight key features, thereby improving the model’s learning and fault detection capability. The enhanced images are fed into a modified AlexNet model with three residual blocks, which extracts both spatial and temporal features from the CWT images and is well suited to identifying the complex patterns associated with different fault types. The dimensionality of the resulting deep features is then reduced using ant colony optimization, which preserves the relevant information and yields a compact feature representation. The optimized features are classified using a support vector machine, which distinguishes between the fault types and normal conditions with high accuracy. The proposed method improves fault classification performance and outperforms state-of-the-art methods. It is thus a promising solution for industrial fault diagnosis, with potential for broader application in predictive maintenance.
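The front end of the pipeline described above (signal to CWT image, flipping, contrast enhancement) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: a complex Morlet mother wavelet with `w0 = 5`, a simple min-max stretch standing in for the unspecified contrast enhancement filter, and a synthetic sine wave in place of real machine data. All function names are hypothetical, and the AlexNet, ant colony optimization, and support vector machine stages are omitted.

```python
import cmath
import math


def morlet(t, w0=5.0):
    """Complex Morlet wavelet (assumed mother wavelet, unnormalized)."""
    return cmath.exp(1j * w0 * t) * math.exp(-0.5 * t * t)


def cwt_scalogram(signal, scales, w0=5.0):
    """Return |CWT| as a 2-D list: one row per scale, one column per sample."""
    n = len(signal)
    image = []
    for s in scales:
        half = int(math.ceil(4 * s))  # truncate the wavelet at +/- 4 scale widths
        kernel = [morlet(k / s, w0).conjugate() / math.sqrt(s)
                  for k in range(-half, half + 1)]
        row = []
        for b in range(n):
            lo = max(0, b - half)
            hi = min(n, b + half + 1)
            # Correlate the signal with the scaled wavelet centred at sample b.
            acc = sum(signal[i] * kernel[i - b + half] for i in range(lo, hi))
            row.append(abs(acc))
        image.append(row)
    return image


def contrast_stretch(image):
    """Min-max stretch to [0, 1]; a stand-in for the contrast enhancement filter."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    span = (hi - lo) or 1.0  # avoid division by zero on a constant image
    return [[(v - lo) / span for v in row] for row in image]


def flip_horizontal(image):
    """One of the augmentations named in the text: mirror along the time axis."""
    return [row[::-1] for row in image]


# Synthetic stand-in for a measured signal: a tone with a 20-sample period.
signal = [math.sin(2 * math.pi * 0.05 * i) for i in range(256)]
scales = [4, 8, 16, 32]

enhanced = contrast_stretch(cwt_scalogram(signal, scales))
augmented = flip_horizontal(enhanced)

# Report which scale responds most strongly at the centre of the record.
centre = [row[128] for row in enhanced]
print("dominant scale:", scales[centre.index(max(centre))])
```

In a complete system the enhanced and augmented images would be resized to the network's input resolution before feature extraction. A library implementation of the CWT would also be far faster than this direct summation, which is written for readability only.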