2020
DOI: 10.3390/s20071992

An Integrated Fuzzy C-Means Method for Missing Data Imputation Using Taxi GPS Data

Abstract: Various traffic-sensing technologies have been employed to facilitate traffic control. Due to certain factors, e.g., malfunctioning devices and artificial mistakes, missing values typically occur in the Intelligent Transportation System (ITS) sensing datasets, resulting in a decrease in the data quality. In this study, an integrated imputation algorithm based on fuzzy C-means (FCM) and the genetic algorithm (GA) is proposed to improve the accuracy of the estimated values. The GA is applied to optimize the para…

citations
Cited by 22 publications
(11 citation statements)
references
References 46 publications
“…A variety of methods have been proposed in literature for GPS data imputation. [19] presented a Gaussian process based approach, [20] presented a time-series approach named missForest, while [21] put forth a fuzzy c-means method that is tested with GPS data from taxis. When limited data in terms of density and granularity is available, simpler methods can be considered such as linear interpolation where two consecutive points are joint by a straight line, or splines which are piecewise functions of polynomials allowing to connect two consecutive points by a smooth curve [22].…”
Section: A. GPS Signal Losses and Solutions
Citation type: mentioning
confidence: 99%
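The linear interpolation mentioned in the statement above can be sketched as follows. This is a minimal illustration, not code from the cited paper: it assumes a uniformly sampled track of (latitude, longitude) pairs with missing fixes stored as `None`, and the name `interpolate_gaps` is hypothetical.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (latitude, longitude); assumed representation


def interpolate_gaps(track: List[Optional[Point]]) -> List[Optional[Point]]:
    """Fill None entries by joining the nearest known points with a straight line.

    Assumes uniform sampling, so the position of a missing fix depends only on
    its index between the two surrounding known fixes. Gaps at the start or end
    of the track have only one neighbour and are left as None.
    """
    filled = list(track)
    known = [i for i, p in enumerate(track) if p is not None]
    for left, right in zip(known, known[1:]):
        lat0, lon0 = track[left]
        lat1, lon1 = track[right]
        span = right - left
        for i in range(left + 1, right):
            w = (i - left) / span  # fraction of the way from left to right
            filled[i] = (lat0 + w * (lat1 - lat0), lon0 + w * (lon1 - lon0))
    return filled


if __name__ == "__main__":
    print(interpolate_gaps([(0.0, 0.0), None, None, None, (4.0, 8.0)]))
```

Straight-line filling is reasonable for short gaps; for longer gaps on curved roads, the splines mentioned in the same statement would follow the path more smoothly.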
“…In recent years, researchers have tried to solve this problem using machine learning methods. [17] have used population genetic algorithms for the shortest path calculation to implement a taxi dispatching model and also to recommend the best area for taxi drivers to carry passengers. [18] develop a taxi path optimization model and solve the taxi path optimization model by using an improved genetic algorithm.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…In spatial graph, we define the regional grids as an undirected graph G_s = (V_s, E_s, A_s), where V_s is the centre point of each grid region, E_s is the distance between the centre points of each grid region, and the centre point of each grid is regarded as the geographic location centre of the grid. [14] Time series methods Taxi-RS [15] ARIMA [16] Machine learning methods Genetic algorithm PGA [17] HGA [18] LSTM ConvLSTM [19] LSTM and GRU [20] Deep learning methods GCN GCN [21,22] ST-ED-RMGC [23] Attention mechanism MRA [24] 4…”
Section: Definitions and Preliminaries
Citation type: mentioning
confidence: 99%
“…In addition to the abovementioned imputation method based on the KNN technology, the mean imputation (MI) and fuzzy c-means imputation (FCMI) methods have obtained significant research progress [31][32][33]. In MI [34], the missing data are estimated by the mean value or mode of the corresponding attribute, and it is used for the data sets with a similar attribute distribution in each category.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
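As a rough illustration of the mean imputation (MI) described in the statement above, the sketch below replaces each missing entry with the mean of the observed values in the same attribute. It is not taken from the cited work: the row-of-lists layout, the use of `None` for missing values, and the name `mean_impute` are assumptions made for the example.

```python
from typing import List, Optional


def mean_impute(rows: List[List[Optional[float]]]) -> List[List[float]]:
    """Replace each None with the mean of the observed values in its column.

    Assumes every column has at least one observed value.
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


if __name__ == "__main__":
    data = [[1.0, None], [3.0, 4.0], [None, 8.0]]
    print(mean_impute(data))
```

Every missing entry in a given column receives the same value here, regardless of what else is known about the row.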
“…However, the estimations of the same attribute in different incomplete patterns are equal. In FCMI [33], the estimations are calculated by the clustering centers and the distance between the centers and the patterns. However, the performance of this imputation strategy depends on initial conditions.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
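One common way to compute an FCM-based estimate from cluster centers and distances, as the statement above describes, is sketched below. It is a simplified illustration under stated assumptions, not the integrated FCM and GA algorithm of the paper: the cluster centers are taken as already fitted, distances use only the observed attributes, the estimate is the membership-weighted average of the centers, and the function names and the fuzzifier default `m=2.0` are hypothetical choices.

```python
import math
from typing import List, Optional, Sequence


def fcm_memberships(pattern: Sequence[Optional[float]],
                    centers: Sequence[Sequence[float]],
                    m: float = 2.0) -> List[float]:
    """Fuzzy memberships of one pattern, using observed attributes only (m > 1)."""
    obs = [j for j, v in enumerate(pattern) if v is not None]
    dists = [math.sqrt(sum((pattern[j] - c[j]) ** 2 for j in obs)) for c in centers]
    if any(d == 0.0 for d in dists):
        # The pattern coincides with one or more centers: share membership among them.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    inv = [d ** -power for d in dists]  # closer centers get larger weight
    total = sum(inv)
    return [x / total for x in inv]


def fcm_impute(pattern: Sequence[Optional[float]],
               centers: Sequence[Sequence[float]],
               m: float = 2.0) -> List[float]:
    """Fill missing attributes with the membership-weighted average of the centers."""
    u = fcm_memberships(pattern, centers, m)
    return [
        sum(u_k * c[j] for u_k, c in zip(u, centers)) if v is None else v
        for j, v in enumerate(pattern)
    ]


if __name__ == "__main__":
    centers = [[0.0, 0.0], [10.0, 10.0]]  # hypothetical, pre-fitted centers
    print(fcm_impute([5.0, None], centers))
```

Unlike mean imputation, the estimate varies with the pattern's observed attributes. Because the centers themselves come from an iterative fit, the result still depends on how that fit was initialized, which is the sensitivity the statement points out.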