Molding injects a molding compound into a mold to form a protective shell around the wafer. During injection, the compound may overflow and cause mold flash, which reduces yield and leads to significant manufacturing cost losses. To address this issue, this paper proposes a deep-learning-based method for predicting the probability of mold flash. First, random forest importance analysis and correlation analysis are used to identify the key parameters that significantly affect mold flash, and these parameters serve as the input signals of the prediction model. The paper then introduces an HLGA Transformer to construct an ensemble meta-learning model that predicts the probability of molding defects, achieving a prediction accuracy of 98.16%. The proposed ensemble meta-learning approach outperforms other methods. The model predictions can be communicated to the system in real time, allowing it to promptly adjust critical machine operating parameters, thereby significantly improving molding yield and reducing manufacturing cost losses.
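As a rough illustration of the parameter-screening step, the following minimal sketch ranks candidate process parameters by their absolute Pearson correlation with a mold flash label and drops near-duplicate signals. It is not the procedure used in the paper: the parameter names (`transfer_pressure`, `pressure_mirror`, `clamp_force`, `ambient_noise`), the synthetic data, and both thresholds are hypothetical, and the random forest importance ranking is omitted here (in practice it could come from a library such as scikit-learn).

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def select_key_parameters(columns, target, min_target_corr=0.3, max_mutual_corr=0.9):
    """Keep parameters that correlate with the target and are not redundant.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    # Rank parameters from strongest to weakest |correlation| with the label.
    ranked = sorted(
        columns,
        key=lambda name: abs(pearson(columns[name], target)),
        reverse=True,
    )
    selected = []
    for name in ranked:
        if abs(pearson(columns[name], target)) < min_target_corr:
            break  # every remaining parameter is weaker still
        # Skip a parameter that nearly duplicates one already kept.
        if all(
            abs(pearson(columns[name], columns[kept])) <= max_mutual_corr
            for kept in selected
        ):
            selected.append(name)
    return selected


# Synthetic stand-in for logged molding data (hypothetical parameters).
rng = random.Random(0)
n = 500
transfer_pressure = [rng.gauss(0, 1) for _ in range(n)]
clamp_force = [rng.gauss(0, 1) for _ in range(n)]
ambient_noise = [rng.gauss(0, 1) for _ in range(n)]
# A second sensor that almost exactly mirrors the transfer pressure.
pressure_mirror = [p + rng.gauss(0, 0.05) for p in transfer_pressure]
# Flash label: more likely at high pressure and low clamp force.
flash = [
    1.0 if p - 0.8 * c + rng.gauss(0, 0.5) > 0 else 0.0
    for p, c in zip(transfer_pressure, clamp_force)
]

columns = {
    "transfer_pressure": transfer_pressure,
    "pressure_mirror": pressure_mirror,
    "clamp_force": clamp_force,
    "ambient_noise": ambient_noise,
}
selected = select_key_parameters(columns, flash)
print(selected)
```

On this synthetic data the screen keeps one of the two pressure signals together with the clamp force and discards the unrelated noise channel. The surviving parameters would then be passed to the prediction model as inputs.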