In the context of Iraq's evolving transportation landscape and the strategic implications of the Belt and Road Initiative, this study proposes a decision-making framework for optimizing multimodal transportation systems. The framework combines data envelopment analysis (DEA) efficiency scores with a Markov decision process (MDP): the DEA scores capture the performance of each decision-making unit (DMU) across several criteria, while the MDP rewards guide strategic mode selection toward efficiency, cost-effectiveness, and lower environmental impact. We compare the proposed scheme with MRMQoS and MCSTM. The total cost of our method is approximately 29% higher than that of MRMQoS, but its delay is nearly 26% lower than that of MCSTM. MRMQoS yields an 8.3% higher profit than our approach, whereas our approach yields an 11.7% higher profit than MCSTM. The average CPU time of our method falls between those of the two other schemes: MCSTM is about 1.6% faster than our approach, and our approach is 9.5% faster than MRMQoS. In terms of CO2 emissions, the proposed model consistently outperforms the other models across network sizes; for a network size of 25, it reduces CO2 emissions by 7.26% relative to MRMQoS and by 31.25% relative to MCSTM.
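To make the coupling between DEA scores and MDP rewards concrete, the following is a minimal sketch rather than the study's implementation. It assumes the DEA efficiency scores have already been computed and simply uses them as rewards in a tiny, hypothetical three-node MDP solved by value iteration. The node names, mode names, scores, advance probabilities, delay penalty, and discount factor are all illustrative assumptions and are not taken from the study.

```python
# Hypothetical DEA efficiency scores in [0, 1] for each (node, mode) pair.
# Illustrative values only; they are NOT results from the study.
DEA_SCORE = {
    ("origin", "road"): 0.70,
    ("origin", "rail"): 0.90,
    ("hub", "road"): 0.85,
    ("hub", "rail"): 0.60,
}

# Assumed probability that a shipment advances to the next node in one step;
# otherwise it is delayed and stays at the current node.
P_ADVANCE = {"road": 0.9, "rail": 0.8}

NEXT_NODE = {"origin": "hub", "hub": "destination"}  # "destination" is terminal
DELAY_PENALTY = 0.2  # assumed penalty charged for each delayed step
GAMMA = 0.95         # assumed discount factor


def q_value(node, mode, values):
    """Expected return of choosing `mode` at `node` under the current values."""
    p = P_ADVANCE[mode]
    advance = DEA_SCORE[(node, mode)] + GAMMA * values[NEXT_NODE[node]]
    delayed = -DELAY_PENALTY + GAMMA * values[node]
    return p * advance + (1.0 - p) * delayed


def value_iteration(tol=1e-9, max_iter=10_000):
    """Return (state values, greedy mode per node) for the toy MDP."""
    values = {"origin": 0.0, "hub": 0.0, "destination": 0.0}
    for _ in range(max_iter):
        delta = 0.0
        for node in NEXT_NODE:
            best = max(q_value(node, mode, values) for mode in P_ADVANCE)
            delta = max(delta, abs(best - values[node]))
            values[node] = best
        if delta < tol:
            break
    policy = {
        node: max(P_ADVANCE, key=lambda mode: q_value(node, mode, values))
        for node in NEXT_NODE
    }
    return values, policy


if __name__ == "__main__":
    values, policy = value_iteration()
    print(policy)
```

With these assumed numbers, the greedy policy selects rail at the origin and road at the hub. In a fuller model, the reward would presumably also carry cost and emission terms, and the DEA scores would come from solving a linear program per DMU instead of being fixed by hand.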