This paper proposes a new approach, virtual pooling, for optimising returnable transport item (RTI) flows in a two-level closed-loop supply chain. The supply chain comprises a set of suppliers that deliver products loaded on RTIs of various types to a set of customers. The problem is modelled as a deterministic, multi-supplier, multi-customer inventory routing problem with pickup and delivery of multiple RTI types. The model includes inventory-level constraints and the availability of empty RTIs to suppliers, and it minimises the total cost, comprising inventory holding, screening, maintenance, transportation, sharing, and purchasing costs for new RTIs. Furthermore, suppliers with common customers coordinate to virtually pool the empty RTIs held by those customers so that, when a supplier delivers loaded RTIs to a customer, it can use the same visit to pick up empty RTIs regardless of their ownership. To handle the combinatorial complexity of the model, a new artificial-immune-system-based algorithm coupled with deep reinforcement learning is proposed. The algorithm combines the strong global search ability and self-adaptability of artificial immune systems with goal-driven guidance from deep reinforcement learning, and it is tailored to the proposed mathematical model. Computational experiments on randomly generated instances highlight the performance of the proposed approach. From a managerial point of view, the results indicate that this new approach allows for economies of scale and a cost reduction of up to about 40% for all involved parties. In addition, a sensitivity analysis on the unit cost of transportation and of procuring new RTIs is conducted, highlighting the benefits and limits of the proposed model compared to dedicated and physical pooling modes.
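
The pickup rule behind virtual pooling can be illustrated with a minimal sketch. It is a toy illustration only, not the optimisation model or the algorithm of this paper: the function name `collect_empties`, the per-visit `capacity` limit, the own-RTIs-first priority, and the example data are all hypothetical assumptions introduced here. The sketch contrasts the dedicated mode, where a supplier collects only the empty RTIs it owns, with the virtual pooling mode, where a supplier may also collect empty RTIs owned by other suppliers during the same visit.

```python
from collections import defaultdict


def collect_empties(visits, empties, capacity, pooled):
    """Toy pickup rule (hypothetical, not the paper's model).

    visits   : list of (supplier, customer) delivery visits, in order
    empties  : dict mapping (customer, owner) -> number of empty RTIs
    capacity : assumed maximum number of empties collected per visit
    pooled   : True for virtual pooling, False for dedicated pickup
    Returns (empties collected per visiting supplier, empties left behind).
    """
    stock = dict(empties)  # work on a copy so the input is not modified
    collected = defaultdict(int)
    for supplier, customer in visits:
        room = capacity
        # Assumption: a supplier always collects its own empties first.
        owners = [supplier]
        if pooled:
            # Under virtual pooling, ownership no longer restricts pickup.
            owners += sorted(
                o for (c, o) in stock if c == customer and o != supplier
            )
        for owner in owners:
            take = min(room, stock.get((customer, owner), 0))
            if take > 0:
                stock[(customer, owner)] -= take
                collected[supplier] += take
                room -= take
    return dict(collected), sum(stock.values())


# Hypothetical data: two suppliers, two shared customers.
visits = [("S1", "C1"), ("S2", "C2")]
empties = {
    ("C1", "S1"): 3, ("C1", "S2"): 4,
    ("C2", "S2"): 2, ("C2", "S1"): 5,
}

dedicated = collect_empties(visits, empties, capacity=10, pooled=False)
virtual = collect_empties(visits, empties, capacity=10, pooled=True)
print("dedicated:", dedicated)
print("virtual pooling:", virtual)
```

In this made-up instance, the dedicated mode leaves nine empty RTIs at the customers after the two visits, whereas virtual pooling collects all of them with the same visits. The full model additionally decides routes, inventory levels, and purchases of new RTIs, none of which this sketch attempts to capture.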