Traffic volume data from loop detectors have been the common data source for link flow estimation, but detectors cover only a subset of links. Other data sources, such as vehicle trajectory data collected from vehicle tracking sensors, are now also incorporated. However, trajectory data are often sparse: the observed trajectories represent only a small subset of the whole vehicle population, and the exact route sampling rate is unknown and may vary over space and time. In this paper, we develop a method that leverages these two limited data sources to enhance link flow estimation. We propose a novel generative modelling framework in which a vehicle's link-to-link movements are formulated as a sequential decision-making problem using the Markov Decision Process framework. Within this framework, we propose an Inverse Reinforcement Learning-based method that generates synthetic population vehicle trajectories, from which link flows across the whole network are estimated. The method ensures that the generated population vehicle trajectories are consistent with the observed traffic volume and trajectory data. The proposed framework is compared with two existing methods on a synthetic road network and validated on a real road network.
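To make the Markov Decision Process formulation concrete, the following minimal sketch treats each link as a state and the choice of the next link as an action, then aggregates generated trajectories into link flows. It is an illustration under stated assumptions, not the method proposed here: the four-link network, the hand-set rewards, the softmax policy, and all function names are hypothetical, and the reward-learning step of Inverse Reinforcement Learning is omitted entirely.

```python
import math
import random
from collections import Counter

# Hypothetical toy network: each key is a link (state); each value lists the
# links reachable next (actions). An empty list marks a destination link.
SUCCESSORS = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": [],
}

# Hypothetical per-link rewards. In an IRL setting these would be learned from
# data; here they are fixed by hand purely for illustration.
REWARD = {"a": 0.0, "b": 1.0, "c": 0.0, "d": 0.0}


def next_link_probs(link):
    """Softmax policy over successor links, weighted by their rewards."""
    options = SUCCESSORS[link]
    weights = [math.exp(REWARD[nxt]) for nxt in options]
    total = sum(weights)
    return {nxt: w / total for nxt, w in zip(options, weights)}


def sample_trajectory(start, rng):
    """Roll out one link-to-link trajectory until a destination link."""
    trajectory = [start]
    while SUCCESSORS[trajectory[-1]]:
        probs = next_link_probs(trajectory[-1])
        links = list(probs)
        nxt = rng.choices(links, weights=[probs[l] for l in links])[0]
        trajectory.append(nxt)
    return trajectory


def estimate_link_flows(start, n_vehicles, seed=0):
    """Count how many generated trajectories traverse each link."""
    rng = random.Random(seed)
    flows = Counter()
    for _ in range(n_vehicles):
        flows.update(sample_trajectory(start, rng))
    return flows


flows = estimate_link_flows("a", n_vehicles=1000)
```

In this sketch every generated vehicle enters on link `a` and exits on link `d`, so flow is conserved across the two parallel links `b` and `c`, and the higher reward on `b` draws the larger share of the generated trajectories.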