One of the most important issues in Transhipment Container Terminal (TCT) management is to have fairly reliable and affordable predictions of vessel arrivals. Terminal operators need to estimate the actual time of arrival in port in order to determine the daily demand for each work shift with greater accuracy. In this way, the required resources (human resources, equipment and spatial resources) can be allocated more efficiently. Despite contractual obligations to notify the Estimated Time of Arrival (ETA) 24 hours before arrival, ship operators often have to revise it due to unexpected events such as weather conditions or delays at a previous port. For planners, the related decision-making processes can be very complex without the support of suitable methodological tools. Specific models should be adopted, in a daily planning scenario, to provide a useful support tool in TCTs. In this study, we discuss an exploratory analysis of the data affecting delays registered at a Mediterranean TCT. We present some preliminary results obtained using data mining techniques and propose a Classification and Regression Trees (CART) model to reduce the range of uncertainty of ship arrivals in port. Such an approach is necessary to manage the vast amounts of unstructured data involved in estimating vessel arrivals.

Reference to this paper should be made as follows: Pani, C.; Fadda, P.; Fancello, G.; Frigau, L.; Mola, F. 2014. A data mining approach to forecast late arrivals in a transhipment container terminal, Transport 29(2): 175-184. http://dx.