In a semiconductor fab, wafer lots are processed in complex re-entrant sequences on parallel machines. Ensuring a smooth wafer lot flow therefore requires detecting potential disturbances in real time so that demand for wafer lots can be met. This study aims to identify the production factors that significantly affect the system’s throughput and to find the best prediction model. The contributions of this study are as follows: (1) it is the first to apply machine learning techniques to identify important real-time factors that influence throughput in a semiconductor fab; (2) it develops a test bed in the AnyLogic software environment, based on the Intel minifab layout; and (3) it proposes a data collection scheme for the production control mechanism. As a result, the four models with the best accuracies (adaptive boosting, gradient boosting, random forest, and decision tree) are selected, and a scheme to reduce the number of input data types considered by the models is also proposed. After the reduction, each selected model retained an accuracy above 97.82%. Data related to the machines’ total idle times, the processing steps, and machine E were found to have a notable influence on throughput prediction.
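The workflow the abstract summarizes (comparing the four tree-based models, then reducing the input data types by feature importance) can be sketched with scikit-learn. This is only an illustrative sketch under assumptions not stated in the abstract: the synthetic dataset, the number of retained features, and the use of random-forest importances for the reduction are all placeholders for the study's actual data and method.

```python
# Illustrative sketch only: the dataset, feature count, and importance-based
# reduction below are assumptions, not the study's actual pipeline.
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real-time production factors.
X, y = make_classification(n_samples=1000, n_features=20, n_informative=8,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# The four model families named in the abstract.
models = {
    "adaptive boosting": AdaBoostClassifier(random_state=0),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
    "random forest": RandomForestClassifier(random_state=0),
    "decision tree": DecisionTreeClassifier(random_state=0),
}
scores = {name: m.fit(X_tr, y_tr).score(X_te, y_te)
          for name, m in models.items()}

# Rank features by random-forest importance and keep only the top ones --
# the kind of input-data reduction the abstract describes.
rf = models["random forest"]
top = sorted(range(X.shape[1]),
             key=lambda i: rf.feature_importances_[i], reverse=True)[:8]
reduced = RandomForestClassifier(random_state=0).fit(X_tr[:, top], y_tr)

print(scores)
print("reduced-model accuracy:", reduced.score(X_te[:, top], y_te))
```

A real application would replace the synthetic data with the logged simulation factors (e.g., machine idle times and processing-step counts) and validate the reduced feature set across all four models, as the study reports doing.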