Small-scale preliminary studies are needed to establish whether a machine learning (ML) algorithm, combined with time-evolution kinetics, can meet the design specification of a treatment unit. The training and test datasets were obtained from jar-test experiments on a petroleum industry effluent (PIE) sample, with aluminum sulfate (AS) as the coagulant. ML algorithms from scikit-learn were used to determine the optimum operating conditions for removing the colloidal particles that cause turbidity in the PIE. The predictive capacity of four ML models was compared on the basis of their statistical metrics for clean discharge. The predicted optimum conditions were pH 10, a dosage of 0.1 g/L, and a settling time of 30 min, which correspond to a residual turbidity ≤ 10 NTU and a removal efficiency of 95%. Second-order AS sweep-flocculation kinetics showed that, at the predicted optimum conditions, a modeled rate constant of 1.33 × 10⁻³ L/(g·min) and a flocculation period of 1.2 min reduced the combined monomer, dimer, and trimer classes of colloids from an initial concentration of 570 mg/L to a residual concentration of 24 mg/L. This residual corresponds to a turbidity ≤ 10 NTU under the mixing regime 14 s⁻¹ ≤ G ≤ 164 s⁻¹ and satisfies the EPA standard for clean effluent discharge. The study combined the selected ML output with time-evolution and aggregation kinetics to define the sedimentation tank geometry for cleaner discharge. The design-driven optimization recommended a flow rate of 1000 m³ s⁻¹, a coefficient of kinematic viscosity of 0.841 mm²/s, and a detention time of 30–60 min as the basis for the sedimentation tank geometry.
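
As a rough check on the figures quoted above, the removal efficiency implied by the initial and residual concentrations can be computed directly, and the generic integrated second-order rate law, 1/C(t) = 1/C0 + k·t, can be written as a small helper. The sketch below is illustrative only: the function names are hypothetical, the demonstration values for the rate law are made up rather than taken from the study, and the study's own kinetic model may include terms not shown here. The rate constant and the concentration must share a consistent mass unit.

```python
def removal_efficiency(c0, c_res):
    """Percent removal, given initial and residual concentrations in the same units."""
    return 100.0 * (c0 - c_res) / c0


def second_order_residual(c0, k, t):
    """Generic integrated second-order rate law: 1/C(t) = 1/C0 + k*t.

    c0 and k must use a consistent concentration unit, and k and t
    must use the same time unit.
    """
    return 1.0 / (1.0 / c0 + k * t)


# Concentrations reported in the abstract (mg/L).
c0_reported = 570.0
c_residual_reported = 24.0
print(f"{removal_efficiency(c0_reported, c_residual_reported):.1f} % removal")  # 95.8 % removal

# Hypothetical demonstration values (NOT from the study): for a second-order
# process the concentration halves at t = 1 / (k * C0).
c0_demo, k_demo = 100.0, 0.01
t_half = 1.0 / (k_demo * c0_demo)
print(second_order_residual(c0_demo, k_demo, t_half))  # about half of c0_demo
```

The computed value of about 95.8% agrees with the removal efficiency of 95% reported for the optimum conditions.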