Preprint submitted to Elsevier, November 23, 2015

Recent advances in Indoor Positioning Systems (IPSs) have led to business interest in applications and services where precise localization is crucial. Wi-Fi fingerprinting systems based on Machine Learning and Expert Systems are commonly used in the literature. They compare the current fingerprint to a database of fingerprints and then return the most similar one or ones according to: 1) a distance function, 2) a data representation method for Received Signal Strength (RSS) values, and 3) a thresholding strategy. However, most previous works simply use the Euclidean distance with the raw, unprocessed data. No previous work studies which distance function is best, which data representation is best, and what effect thresholding has. In this paper, we present a comprehensive study using 51 distance metrics, 4 alternatives for representing the raw data (2 of them proposed by us), a thresholding strategy based on the RSS values, and the public UJIIndoorLoc database. The results shown in this paper demonstrate that researchers and developers should take the conclusions drawn in this work into account in order to improve the accuracy of their systems. IPSs based on k-NN are improved just by selecting the appropriate configuration (mainly the distance function and the data representation). In the best case, 13-NN with the Sørensen distance and the powed data representation, the error in determining the place (building and floor) has been reduced by more than 50% and the positioning accuracy has improved by 1.7 meters with respect to the 1-NN with Euclidean distance and raw data commonly used in the literature. Moreover, our experiments also demonstrate that thresholding should not be applied in multi-building and multi-floor environments.
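To make the best-case configuration concrete, the following is a minimal Python sketch of k-NN place estimation with the Sørensen distance over a powed-style representation. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the sentinel `NOT_DETECTED`, the offset `MIN_RSS`, the exponent `BETA`, and the helper names `positive`, `powed`, `sorensen`, and `knn_place` are all hypothetical, and the exact definition of the powed representation is not given in this abstract.

```python
import math
from collections import Counter

# All three constants are assumptions made for this sketch.
NOT_DETECTED = 100   # sentinel marking an access point that was not detected
MIN_RSS = -105       # a value just below the weakest RSS expected in the data
BETA = math.e        # exponent used by the powed-style transform


def positive(rss):
    """Shift an RSS value (dBm) so that it is non-negative; undetected maps to 0."""
    return 0.0 if rss == NOT_DETECTED else float(rss - MIN_RSS)


def powed(rss):
    """Powed-style value: the shifted RSS raised to BETA, normalized to [0, 1)."""
    return positive(rss) ** BETA / (-MIN_RSS) ** BETA


def sorensen(a, b):
    """Sorensen distance between two non-negative vectors of equal length."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return num / den if den else 0.0


def knn_place(query, database, k=13):
    """Majority vote over the place labels of the k nearest fingerprints.

    `database` is a list of (raw_rss_vector, place_label) pairs.
    """
    q = [powed(v) for v in query]
    ranked = sorted(
        database,
        key=lambda entry: sorensen(q, [powed(v) for v in entry[0]]),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

With a toy database of four fingerprints over three access points, a query close to the first two fingerprints is assigned to their place:

```python
db = [
    ([-50, -60, 100], ("B0", 1)),
    ([-52, -61, 100], ("B0", 1)),
    ([100, -90, -40], ("B1", 2)),
    ([100, -88, -45], ("B1", 2)),
]
print(knn_place([-51, -59, 100], db, k=3))  # ('B0', 1)
```

Swapping `sorensen` for a Euclidean distance and `powed` for the identity, with `k=1`, gives the baseline configuration that the abstract compares against.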