a Research Institute, Center for Petroleum and Minerals, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia; b Department of Systems Engineering, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia; c Department of Petroleum Engineering, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia

ABSTRACT

Cross-validation of soft computing techniques needs to be done efficiently to avoid overfitting and underfitting. This is more important in petroleum reservoir characterisation applications, where the often-limited training and testing data subsets represent wells with known and unknown target properties, respectively. Existing data stratification strategies have been haphazardly chosen without any experimental basis. In this study, the optimal training-testing stratification proportions have been rigorously investigated using the prediction of porosity and permeability of petroleum reservoirs as an experimental case. The comparative performances of seven traditional and advanced machine learning techniques were considered. The overall results suggested a recommendable optimum training stratification that could serve as a good reference for researchers in similar applications.

1 Now with the Saudi Arabian Oil Company, Exploration and Petroleum Engineering Center, Advanced Research Center, Dhahran, Saudi Arabia.
2 Now with the Mechanical Engineering Department, New York City College of Technology, New York, NY, USA.
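
The kind of experiment the abstract describes, comparing test error across several training-testing proportions, can be illustrated with a minimal sketch. It takes "stratification" to mean the fraction of samples assigned to training, which is an interpretation of the abstract. Everything in it is an assumption made for illustration: the data are synthetic rather than well logs, an ordinary least-squares model stands in for the seven techniques (which are not named here), and the function name `rmse_by_training_fraction` and the candidate fractions are hypothetical.

```python
import numpy as np


def rmse_by_training_fraction(X, y, fractions, n_repeats=20, seed=0):
    """Mean test RMSE of a least-squares fit for each training fraction.

    Hypothetical helper: each fraction is evaluated over `n_repeats`
    random train/test splits to smooth out split-to-split variation.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    A = np.column_stack([X, np.ones(n)])  # append an intercept column
    results = {}
    for frac in fractions:
        n_train = int(round(frac * n))
        errors = []
        for _ in range(n_repeats):
            idx = rng.permutation(n)
            train, test = idx[:n_train], idx[n_train:]
            # Fit on the training rows only
            coef, *_ = np.linalg.lstsq(A[train], y[train], rcond=None)
            # Score on the held-out rows
            pred = A[test] @ coef
            errors.append(np.sqrt(np.mean((pred - y[test]) ** 2)))
        results[frac] = float(np.mean(errors))
    return results


# Synthetic stand-in data (NOT reservoir data): 200 samples, 5 predictors
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(200, 5))
true_coef = np.array([0.5, -1.0, 2.0, 0.0, 0.3])  # arbitrary illustrative weights
y = X @ true_coef + data_rng.normal(scale=0.1, size=200)

fractions = [0.5, 0.6, 0.7, 0.8, 0.9]  # assumed candidate training proportions
scores = rmse_by_training_fraction(X, y, fractions)
best = min(scores, key=scores.get)

for frac in fractions:
    print(f"train fraction {frac:.1f}: mean test RMSE {scores[frac]:.4f}")
print("lowest mean test RMSE at training fraction", best)
```

On real, limited well data the ranking of proportions would depend on the model and the dataset, so the fraction this sketch reports carries no significance beyond the synthetic example.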