Pedestrian Dead Reckoning (PDR) systems are becoming increasingly attractive in the indoor positioning market, mainly due to the availability of cheap, lightweight Micro-Electro-Mechanical Systems (MEMS) sensors on smartphones and the reduced need for additional indoor infrastructure. However, PDR still suffers from drift accumulation and requires support from external positioning systems. Vision-aided inertial navigation, one possible solution to this problem, has become popular in indoor localization, offering better performance than a standalone PDR system. Previous studies, however, rely on a fixed platform and on feature-extraction-based methods for visual tracking. This paper instead contributes a distributed implementation of the positioning system and applies deep learning to visual tracking. Moreover, since both inertial navigation and the optical system provide only relative positioning information, this paper also contributes a method that integrates a digital map with real geographical coordinates to supply absolute locations. To test the robustness of the method, the hybrid system has been evaluated on the two common smartphone operating systems, iOS and Android, using corresponding data collection apps. Two calibration approaches are applied: time synchronization of positions and time-step-based heading calibration. The results show that localization accuracy on both operating systems improves significantly after integration with the visual tracking data.