Physical Internet based supply chains create open, global logistics systems that enable new types of collaboration among participants. The open system allows the logistical examination of vehicle technology innovations such as the platooning concept. This article explores the multiple platoon collaboration. For the reconfiguration of two platoons a heuristic and a reinforcement learning (RL) based models have been developed. To our knowledge, this work is the first attempt to apply an RL-based decision model to solve the problem of controlling platoon cooperation. Vehicle exchange between platoons is provided by a virtual hub. Depending on the various input parameters, the efficiency of the model was examined through numerical examples in terms of the target function based on the transportation cost. Models using platoon reconfiguration are also compared to the cases where no vehicle exchange is implemented. We have found that a reinforcement learning based model provides a more efficient solution for high incoming vehicle numbers and low dispatch interval, although for low vehicle numbers heuristics model performs better.