This paper proposes a fast and accurate bauxite recognition method that combines an attention module with a clustering algorithm. The K-means clustering algorithm is introduced into the YOLOv4 network to calculate anchor box values matched to the dataset, and an SE (squeeze-and-excitation) attention module is embedded so that the network automatically learns the importance of different channel features. Together, these changes strengthen the network's ability to learn bauxite features and improve the accuracy of bauxite target detection. In the experiment, 2189 bauxite photos were taken and screened to form the target detection dataset, and the targets were divided into four categories: No. 55, No. 65, No. 70, and Nos. 72–73. Using a dataset balanced across categories, the optimal YOLOv4 network model was obtained after 7000 training iterations; it reached an average bauxite sorting accuracy of 99% with an inference time under 0.05 s. Realizing high-speed, high-precision sorting of bauxite greatly improves the efficiency and accuracy of mining in the bauxite industry. The model also provides key technical support for practical engineering applications of the same type.
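As an illustration of the anchor box step, the following is a minimal sketch of K-means clustering over labelled box sizes. It assumes the distance 1 - IoU between (width, height) pairs, a common choice for YOLO-style anchor clustering; the exact distance, initialisation, and number of anchors used in this work are not stated here. The function names, the evenly spaced initialisation, and the sample box sizes are all hypothetical.

```python
def iou_wh(box, anchor):
    """IoU of two boxes given as (width, height), aligned at a shared corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, max_iter=100):
    """Cluster (width, height) pairs into k anchors using 1 - IoU as distance."""
    # Deterministic initialisation (an assumption): k boxes evenly spaced by area.
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    step = len(ordered) / k
    anchors = [ordered[int((i + 0.5) * step)] for i in range(k)]

    assignment = None
    for _ in range(max_iter):
        # Assign each box to the anchor with the highest IoU (smallest 1 - IoU).
        new_assignment = [
            max(range(k), key=lambda j: iou_wh(box, anchors[j])) for box in boxes
        ]
        if new_assignment == assignment:
            break  # converged: no box changed cluster
        assignment = new_assignment
        # Move each anchor to the mean width and height of its members.
        for j in range(k):
            members = [b for b, a in zip(boxes, assignment) if a == j]
            if members:  # keep the old anchor if a cluster is empty
                anchors[j] = (
                    sum(w for w, _ in members) / len(members),
                    sum(h for _, h in members) / len(members),
                )
    # Return anchors ordered from smallest to largest area.
    return sorted(anchors, key=lambda a: a[0] * a[1])


# Hypothetical box sizes in pixels, for illustration only (not from the dataset).
sample_boxes = [(30, 40), (32, 38), (28, 42), (100, 120), (110, 115), (95, 125)]
anchors = kmeans_anchors(sample_boxes, k=2)
```

In practice the input would be the widths and heights of all labelled bauxite boxes, scaled to the network input size, and the resulting anchors would replace the default anchor values in the YOLOv4 configuration.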