Machine learning approaches have developed rapidly and have contributed to many academic findings and discoveries. They are also widely applied in numerous industries, including the cement industry. Cement companies in developing countries face many challenges despite advantages such as valuable mines. Optimization, a key part of machine learning, has attracted growing attention. The main purpose of this paper is to combine a novel Data Envelopment Analysis (DEA) approach, used in a first optimization step to find the efficient Decision-Making Unit (DMU), with innovative machine learning clustering algorithms, used in a second step to identify the model and algorithm with the highest accuracy. In the optimization section, a two-stage model is converted to a simple standard single-stage model, and 24 cement companies from five developing countries are compared over 2014–2019. Window DEA analysis is used because it supports stronger judgments about the results, particularly for small samples, and allows year-by-year comparisons. Window analysis can therefore help managers broaden their comparison and evaluation. To find the most accurate model, the CCR (Charnes, Cooper and Rhodes), BCC (Banker, Charnes and Cooper) and Free Disposal Hull (FDH) DEA models are used to measure the efficiency of decision processes. The FDH model relies on free disposability to construct the production possibility set. In the machine learning section, a novel three-layer data mining filtering pre-process, based on expert judgment, is proposed for the clustering algorithms to increase accuracy and to eliminate unrelated attributes and data. Finally, the most efficient company, the best-performing model and the most accurate algorithm are identified. The results indicate that the 22nd company is the most efficient, with an efficiency score of 1 in every year.
The FDH model yields the highest efficiency scores in all periods compared with the other suggested models, followed by the BCC and CCR models in second and third place, respectively. Among the clustering algorithms, K-means achieves the highest accuracy in all three suggested filtering layers, followed by the hierarchical and density-based clustering algorithms in second and third place, respectively.
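To illustrate how an FDH efficiency score of 1 can arise, the following is a minimal sketch of the standard input-oriented FDH measure. It is not the paper's single-stage model: the function name `fdh_input_efficiency` and the toy input and output values are assumptions made purely for illustration, and all inputs are assumed to be strictly positive.

```python
def fdh_input_efficiency(inputs, outputs, o):
    """Input-oriented FDH efficiency of unit `o` (illustrative sketch).

    inputs[j] and outputs[j] are the input and output vectors of unit j.
    Assumes every input value is strictly positive.
    """
    x_o, y_o = inputs[o], outputs[o]
    best = float("inf")
    for x_j, y_j in zip(inputs, outputs):
        # Unit j is a valid reference only if it produces at least as much
        # of every output as unit o (free disposability of outputs).
        if all(yj >= yo for yj, yo in zip(y_j, y_o)):
            # Smallest uniform scaling of unit o's inputs that still covers
            # the inputs actually used by unit j.
            theta = max(xj / xo for xj, xo in zip(x_j, x_o))
            best = min(best, theta)
    # Unit o always references itself with theta = 1, so best <= 1.
    return best


# Hypothetical data: three units, two inputs, one output.
toy_inputs = [[2.0, 4.0], [4.0, 8.0], [3.0, 3.0]]
toy_outputs = [[10.0], [10.0], [12.0]]
scores = [
    fdh_input_efficiency(toy_inputs, toy_outputs, o)
    for o in range(len(toy_inputs))
]
```

In this toy data the second unit uses twice the inputs of the first for the same output, so its score is 0.5, while the other two units are not dominated by any observed unit and score 1. Because FDH compares a unit only against observed units rather than convex combinations of them, its scores are never lower than the corresponding BCC or CCR scores, which is consistent with FDH ranking first here.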