This paper introduces a comprehensive, data-driven method for predicting properties of composite materials, such as thermo-mechanical properties, moisture saturation level, and durability. The approach applies data mining techniques to the collective knowledge of the materials field. First, a comprehensive database is compiled from published research articles. Second, the Random Forests algorithm is used to build a predictive model that explains the investigated material response in terms of a wide variety of material and process variables of different data types. This statistical learning approach has the potential to substantially improve the design of composite materials by guiding the selection of constituents and process parameters that optimize the response for a specific application. The method is demonstrated by predicting the moisture saturation level of vinylester-based composite laminates. Using 90% of the available published data as the training dataset, the Random Forests algorithm is used to develop a regression model for the moisture saturation level. Variables considered by the model include the manufacturing process, the fiber type and architecture, the fiber and void contents, the matrix filler type and content, and the conditioning environment and temperature. On the training data, the model proved to be a good fit, with a prediction accuracy of R² = 94.96%. When used to predict the moisture saturation level for the remaining, unseen 10% of the compiled data, the model exhibited a prediction accuracy of R² = 85.28%. Furthermore, the Random Forests model allows the impact of the individual variables on the moisture saturation level to be assessed. The fiber type is found to be the most important determinant of the moisture saturation level in vinylester composite laminates.
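The workflow summarized above (a 90/10 split, a Random Forests regression, R² on both subsets, and a ranking of the variables) can be illustrated with a minimal sketch. It assumes scikit-learn and NumPy, which the paper does not name. The data are synthetic stand-ins rather than the compiled database, and the variable names, the coefficients of the synthetic response, and the hyperparameters are all hypothetical. The importance measure shown is scikit-learn's impurity-based one, which may differ from the measure used in the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 300  # hypothetical number of records, not the size of the paper's database

# Synthetic stand-in predictors (assumed names and ranges, for illustration only).
fiber_type = rng.integers(0, 3, size=n)          # categorical: three fiber types
fiber_content = rng.uniform(30.0, 60.0, size=n)  # numeric, percent
void_content = rng.uniform(0.0, 5.0, size=n)     # numeric, percent
temperature = rng.uniform(20.0, 80.0, size=n)    # numeric, degrees Celsius

# Synthetic response: an arbitrary formula chosen so that fiber type dominates.
saturation = (
    0.4 * fiber_type
    - 0.005 * fiber_content
    + 0.05 * void_content
    + 0.002 * temperature
    + rng.normal(0.0, 0.05, size=n)
)

# One-hot encode the categorical variable so mixed data types share one matrix.
X = np.column_stack([np.eye(3)[fiber_type], fiber_content, void_content, temperature])
column_owner = ["fiber_type"] * 3 + ["fiber_content", "void_content", "temperature"]

# Hold out 10% of the records as unseen test data.
X_train, X_test, y_train, y_test = train_test_split(
    X, saturation, test_size=0.10, random_state=0
)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

r2_train = r2_score(y_train, model.predict(X_train))
r2_test = r2_score(y_test, model.predict(X_test))

# Sum the impurity-based importances of one-hot columns back to their variable.
importance = {}
for owner, value in zip(column_owner, model.feature_importances_):
    importance[owner] = importance.get(owner, 0.0) + float(value)

print(f"R2 (training) = {r2_train:.4f}")
print(f"R2 (test)     = {r2_test:.4f}")
for name, value in sorted(importance.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:>14s}: {value:.3f}")
```

A gap between the training and test R², as reported above, is expected for this kind of model, since the held-out score is the more conservative estimate of predictive accuracy. In this sketch the fiber type ranks first only because the synthetic response was constructed that way.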