Modern smartphones and location-based services and apps are poised to transform daily life. However, current smartphone-based localization solutions are mostly limited to outdoor environments; practical, robust, and accurate indoor solutions are largely missing. Despite significant efforts on indoor localization in both academia and industry over the past two decades, highly accurate and practical smartphone-based indoor localization remains an open problem. Enabling indoor location-based services (ILBS), e.g., step-by-step navigation for blind and visually impaired users, imposes several stringent requirements: high (foot-level) accuracy; no additional hardware components or extensions on users' smartphones; and scalability to a massive number of concurrent users. Current solutions based on GPS, radio received signal strength (RSS; e.g., WiFi, Bluetooth, ZigBee), or fingerprinting achieve only meter-level or room-level accuracy. In this paper, we propose a practical and accurate solution that fills this long-standing gap in smartphone-based fine-grained indoor localization. Specifically, we design and implement an indoor localization ecosystem, Guoguo. Guoguo consists of an anchor network that uses a coordination protocol to transmit modulated localization beacons over high-band acoustic signals, a real-time processing app on the smartphone, and a backend server for indoor contexts and location-based services. We further propose approaches to improve its coverage, accuracy, and location update rate while keeping power consumption low. Our prototype achieves centimeter-level localization accuracy in several typical indoor environments. Such precise indoor localization is expected to have a high impact on future ILBS and on daily activities.