This paper proposes a commercial, high-accuracy, low-cost, real-time, building-level indoor localization system for locating Minimization of Drive Tests (MDT) data in Long-Term Evolution (LTE) cellular networks. To assist indoor localization, the system uses MDT data containing Global Navigation Satellite System (GNSS) information, which is easy and inexpensive to collect, instead of indoor drive test (DT) data, which is costly to collect and maintain manually. To compensate for the resulting loss of location accuracy, the online process is divided into two phases: an indoor/outdoor (IO) identification phase and an indoor localization phase. In the IO identification phase, a real-time, unsupervised algorithm based on a Gaussian mixture model (GMM) determines whether non-GNSS MDT data originate from an indoor environment. In the indoor localization phase, a multi-class classification algorithm based on a Bayesian classifier assigns the indoor MDT data to a specific building. Experiments conducted in an in-service LTE network with more than 100 LTE base stations show that the proposed technique achieves an IO identification accuracy of 90% and an indoor location accuracy of 49.3 m (@67%), an improvement of at least 30.2% in location accuracy over traditional techniques that do not use a DT fingerprint database. The proposed technique is applicable to network optimization and operation and maintenance (O&M), where it helps communication service providers reduce their operating expense (OPEX) by locating MDT data that lack GNSS information.

INDEX TERMS Long-Term Evolution, indoor localization, indoor and outdoor identification, Bayesian classifier, Gaussian mixture model, penetration loss.
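The two-phase online process summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: IO identification uses a single hypothetical signal-strength feature in dBm; the weaker-signal GMM component is taken to be the indoor one (motivated by penetration loss); and a Gaussian naive Bayes classifier stands in for the Bayesian classifier, whose details the abstract does not give. All function names, feature choices, and data values are made up for illustration.

```python
import math
import random


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def fit_gmm_1d(xs, iters=100):
    """Fit a two-component 1-D GMM by EM; components are returned sorted by mean."""
    lo, hi = min(xs), max(xs)
    var0 = max(((hi - lo) / 4) ** 2, 1e-6)
    comps = [(0.5, lo, var0), (0.5, hi, var0)]  # (weight, mean, variance)
    for _ in range(iters):
        # E-step: responsibility of each component for each sample
        resp = []
        for x in xs:
            p = [w * math.exp(log_gauss(x, m, v)) for w, m, v in comps]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean, and variance from the responsibilities
        comps = []
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            mean = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mean) ** 2 for r, x in zip(resp, xs)) / nk
            comps.append((nk / len(xs), mean, max(var, 1e-6)))
    return sorted(comps, key=lambda c: c[1])


def is_indoor(x, comps):
    """Assumption: the lower-mean (weaker-signal) component models indoor samples."""
    scores = [math.log(max(w, 1e-300)) + log_gauss(x, m, v) for w, m, v in comps]
    return scores[0] > scores[1]


def fit_naive_bayes(samples):
    """samples: list of (feature_vector, building_id) pairs."""
    by_class = {}
    for feats, label in samples:
        by_class.setdefault(label, []).append(feats)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        cols = list(zip(*rows))
        means = [sum(col) / n for col in cols]
        variances = [max(sum((v - m) ** 2 for v in col) / n, 1e-3)
                     for col, m in zip(cols, means)]
        model[label] = (n / len(samples), means, variances)
    return model


def predict_building(feats, model):
    """Return the building with the highest posterior (log prior + log likelihood)."""
    def log_posterior(label):
        prior, means, variances = model[label]
        return math.log(prior) + sum(
            log_gauss(x, m, v) for x, m, v in zip(feats, means, variances))
    return max(model, key=log_posterior)


def locate(io_feature, cell_feats, comps, model):
    """Phase 1: IO identification. Phase 2: building-level classification."""
    if not is_indoor(io_feature, comps):
        return None  # judged outdoor, so no building is assigned
    return predict_building(cell_feats, model)


# Synthetic data (hypothetical values): weak-signal and strong-signal clusters.
random.seed(0)
signal = ([random.gauss(-105, 4) for _ in range(200)]
          + [random.gauss(-80, 4) for _ in range(200)])
comps = fit_gmm_1d(signal)

# Labelled samples with signal strength from two hypothetical cells per sample.
train = ([([random.gauss(-90, 3), random.gauss(-110, 3)], "building_A") for _ in range(50)]
         + [([random.gauss(-110, 3), random.gauss(-90, 3)], "building_B") for _ in range(50)])
model = fit_naive_bayes(train)

print(locate(-108.0, [-91.0, -109.0], comps, model))
print(locate(-78.0, [-91.0, -109.0], comps, model))
```

In this sketch the GMM needs no indoor/outdoor labels, which mirrors the unsupervised nature of the IO identification phase, while the building classifier is trained only on labelled samples. How the actual system derives its features and training labels is not stated in the abstract.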