Public transport has become one of the major transport options, particularly as a means of reducing motorized individual transport and improving sustainability by lowering emissions and noise. The use of public transport data has evolved rapidly over the past decades. The availability of data from different sources, coupled with advances in analytical and predictive approaches, has drawn increasing attention to exploiting these data to improve public transport service. In this paper, we review the current state of the art of public transport data sources. More precisely, we summarize and analyze the potential and challenges of the main data sources. In addition, we show how these data sources complement one another and how they can be merged to broaden their contributions and address their challenges. We complement this review with an information management framework to enhance the use of data sources. Specifically, we seek to bridge the gap between traditional and recent data sources, present a unified overview of them, and show how they can all leverage recent advances in data-driven methods and help achieve a balance between transit service and passenger behavior.