Delays in legal proceedings significantly impact both corporate finances and individual livelihoods. Traditional methods for managing these delays typically rely on subjective assessments of what constitutes a reasonable duration for a proceeding. This study explores a more precise approach that integrates machine learning and process mining techniques to improve prediction of the overall duration of legal proceedings. Unlike previous works, which used either machine learning or process mining in isolation, this research combines the two. We applied process mining clustering techniques to over 60,000 cases from Brazilian labor courts to group cases according to their procedural movements. The resulting clusters, along with other procedural characteristics such as case subject, class, and digital status, were then incorporated into a feature set for regression modelling. We employed linear regression, support vector regression, and gradient boosting methods to develop models that predict case duration. The gradient boosting model performed best, with an $R^2$ score of 0.87. Furthermore, our analysis identifies the time bands in which the model performs better and employs explainable AI techniques to elucidate the key features influencing case duration. The clustering features emerged among the most significant for the task. The proposed combined approach offers a comprehensive method for analyzing and forecasting legal case timelines and shows the potential of process mining clustering techniques to improve such analysis.
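As a rough illustration of the two-stage idea described above (cluster cases by their procedural movements, then use the cluster as a regression feature), the following minimal sketch runs on purely synthetic data. It is not the study's pipeline: k-means over per-case movement counts stands in for the process mining clustering technique, and every variable name, movement type, and numeric value is an assumption made for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_cases = 600

# Synthetic stand-in for court data: two latent procedural profiles (assumption).
profile = rng.integers(0, 2, size=n_cases)

# Counts of four hypothetical movement types per case (a bag-of-movements view).
rates = np.where(profile[:, None] == 1,
                 [2.0, 6.0, 5.0, 4.0],
                 [2.0, 1.0, 0.5, 0.2])
movement_counts = rng.poisson(rates).astype(float)

subject = rng.integers(0, 5, size=n_cases)     # hypothetical case-subject code
is_digital = rng.integers(0, 2, size=n_cases)  # 1 = digital case file

# Synthetic duration in days, driven mostly by the procedural profile.
duration = (120 + 300 * profile - 40 * is_digital + 10 * subject
            + rng.normal(0, 20, size=n_cases))

idx_train, idx_test = train_test_split(
    np.arange(n_cases), test_size=0.25, random_state=0)

# Step 1: cluster cases by their movements, fitting on training cases only
# so that no information from held-out cases leaks into the features.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
kmeans.fit(movement_counts[idx_train])
cluster = kmeans.predict(movement_counts)

# Step 2: the cluster label joins the other procedural features.
# Integer codes are used directly here; one-hot encoding is a common alternative.
X = np.column_stack([cluster, subject, is_digital])

# Step 3: fit a gradient boosting regressor and score it on held-out cases.
model = GradientBoostingRegressor(random_state=0)
model.fit(X[idx_train], duration[idx_train])
r2 = r2_score(duration[idx_test], model.predict(X[idx_test]))
print(f"R^2 on held-out cases: {r2:.2f}")
```

The score printed here reflects only how the synthetic data were generated and says nothing about the results reported in the study.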