Abstract: Population density has grown rapidly in recent years as urbanization accelerates. As a result, overcrowding is more likely to occur in populous urban areas, which increases the risk of accidents. This paper proposes a synthetic approach to detect and identify large pedestrian flows. In particular, a hybrid pedestrian flow detection model was constructed by analyzing real data from major mobile phone operators in China, including information from smartphones and base stations (BS). Within the hybrid model, the Log Distance Path Loss (LDPL) model was used to estimate pedestrian density from raw network data, and a Gaussian Process (GP) was used to retrieve information through supervised learning. Temporal-spatial prediction of the pedestrian data was carried out with Machine Learning (ML) approaches. Finally, a case study of a real Central Business District (CBD) scenario in Shanghai, China, was conducted using records of millions of cell phone users. The results showed that the new approach significantly increases the utility and capacity of the mobile network. On this basis, a more reasonable overcrowding detection and alert system can be developed to improve safety in subway lines and other hotspot landmark areas, such as the Bund, People's Square, or Disneyland, where large passenger flows are common.
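
As a rough illustration of the LDPL step named in the abstract, the sketch below uses the standard textbook form of the log-distance model, PL(d) = PL(d0) + 10 n log10(d / d0), with the random shadowing term omitted. The function names, the reference loss of 40 dB at 1 m, and the path loss exponent of 3 are illustrative assumptions, not values or code from this study.

```python
import math


def ldpl_path_loss_db(d_m, pl0_db=40.0, n=3.0, d0_m=1.0):
    """Path loss in dB at distance d_m (metres) under the log-distance model.

    pl0_db, n and d0_m are assumed example parameters; the shadowing
    term of the full model is deliberately left out.
    """
    if d_m <= 0 or d0_m <= 0:
        raise ValueError("distances must be positive")
    return pl0_db + 10.0 * n * math.log10(d_m / d0_m)


def ldpl_distance_m(path_loss_db, pl0_db=40.0, n=3.0, d0_m=1.0):
    """Invert the model: estimate distance in metres from a path loss in dB."""
    if d0_m <= 0:
        raise ValueError("reference distance must be positive")
    return d0_m * 10.0 ** ((path_loss_db - pl0_db) / (10.0 * n))
```

Under these assumed parameters, a measured path loss can be mapped to an approximate handset-to-base-station distance, which is one plausible ingredient for a density estimate; how the study itself aggregates such estimates is not specified here.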