The airport industry is a highly competitive market that has expanded quickly over the last two decades. Airport management usually measures passenger satisfaction with traditional methods, such as user surveys and expert opinions, which require time and effort to analyse. Recently, considerable attention has been given to using machine learning techniques and sentiment analysis to measure passenger satisfaction. Sentiment analysis can be implemented using a range of different methods. However, it is still uncertain which techniques are better suited to recognising sentiment for a particular subject domain or dataset. In this paper, we analyse the sentiment of air travellers using five different algorithms, namely Logistic Regression, XGBoost, Support Vector Machine, Random Forest and Naïve Bayes. We obtain our dataset from the SKYTRAX website, which hosts a collection of reviews of around 600 airports. We apply several pre-processing steps, such as converting the textual reviews into numerical form using term frequency-inverse document frequency (TF-IDF). We also remove stopwords from the text using the NLTK list of stopwords. We evaluate our results using the accuracy, precision, recall and F1-score performance metrics. Our analysis shows that XGBoost provides the most accurate results when compared with the other algorithms.
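The pre-processing and evaluation steps named in this abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' pipeline: the tiny `STOPWORDS` set is a hypothetical stand-in for the NLTK list, the three reviews and the label vectors are invented for illustration, and the weighting is the plain textbook TF-IDF (term frequency times `log(N / df)`), which differs from the smoothed variants some libraries use.

```python
import math
from collections import Counter

# Hypothetical stand-in for the NLTK English stopword list (assumption).
STOPWORDS = {"the", "a", "an", "is", "was", "and", "at", "of", "to", "in"}


def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stopwords."""
    tokens = (tok.strip(".,!?;:()\"'").lower() for tok in text.split())
    return [tok for tok in tokens if tok and tok not in STOPWORDS]


def tfidf(documents):
    """Return (vocabulary, one TF-IDF vector per document)."""
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(tok for doc in tokenized for tok in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty documents
        vectors.append([
            (counts[term] / total) * math.log(n_docs / doc_freq[term])
            for term in vocabulary
        ])
    return vocabulary, vectors


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Invented example reviews, not SKYTRAX data.
reviews = [
    "The security queue was slow and the staff was rude.",
    "Clean terminal and friendly staff.",
    "Slow baggage claim at the terminal.",
]
vocabulary, vectors = tfidf(reviews)

# Invented labels and predictions, only to exercise the metric functions.
metrics = binary_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
```

In a real comparison, the TF-IDF vectors would be fed to each of the five classifiers and `binary_metrics` (or a library equivalent) would be computed on a held-out test split for each one.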
Flight delays have negatively impacted the socio-economic state of passengers, airlines and airports, resulting in huge economic losses. Hence, it has become necessary to predict their occurrence correctly during decision-making, because this is important for the effective management of the aviation industry. Developing accurate flight delay classification models depends mostly on the complexity of the air transportation system and the infrastructure available in airports, which may be a region-specific issue. However, no single prediction or classification model can handle the individual characteristics of all airlines and airports at the same time. Hence, the need to further develop and compare predictive models for future aviation decision systems cannot be over-emphasized. In this research, flight on-time data records from the United States Bureau of Transportation Statistics were employed to evaluate the performance of Deep Feedforward Neural Network, Neural Network, and Support Vector Machine models on a binary classification problem. The research revealed that the models achieved different flight delay classification accuracies. In the initial experiment, the Support Vector Machine had a lower average accuracy than the Neural Network and the Deep Feedforward Neural Network. The Deep Feedforward Neural Network outperformed the Support Vector Machine and the Neural Network, with the best average percentage accuracy. Further investigation of the Deep Feedforward Neural Network under different parameter settings suggests that, regardless of the training data size, its classification accuracy reaches a peak. We examine which number of epochs works best in our flight delay classification setting for the Deep Feedforward Neural Network. Our experimental results demonstrate that the number of epochs affects the convergence rate of the model, whereas increasing the number of hidden layers does not ensure higher accuracy in the binary classification of flight delays. Finally, we recommend further studies on the applicability of the Deep Feedforward Neural Network to flight delay prediction, with specific case studies of either airlines or airports, to check the impact on the performance of the model.
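The epoch experiment described in this abstract can be pictured with a small, standard-library-only sketch that trains the same feedforward network for different numbers of epochs and records test accuracy. Everything here is an assumption made for illustration: the abstract does not give the architecture, learning rate, or features, so `TinyNet` (one hidden tanh layer, sigmoid output, plain SGD), the synthetic two-feature data from `make_data`, and the epoch counts are all hypothetical and unrelated to the Bureau of Transportation Statistics records.

```python
import math
import random


def make_data(n, rng):
    """Synthetic, linearly separable stand-in for on-time records (assumption)."""
    data = []
    for _ in range(n):
        x = [rng.random(), rng.random()]
        data.append((x, 1 if x[0] + x[1] > 1.0 else 0))  # 1 = "delayed"
    return data


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyNet:
    """One hidden tanh layer and a sigmoid output, trained with plain SGD."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        out = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, out

    def train_epoch(self, data, lr):
        """One pass over the data, updating weights after every sample."""
        for x, y in data:
            hidden, out = self.forward(x)
            # Gradient of binary cross-entropy w.r.t. the output pre-activation.
            delta_out = out - y
            for j, h in enumerate(hidden):
                # Backpropagate through tanh using the pre-update weight.
                delta_h = delta_out * self.w2[j] * (1.0 - h * h)
                self.w2[j] -= lr * delta_out * h
                self.b1[j] -= lr * delta_h
                for i, xi in enumerate(x):
                    self.w1[j][i] -= lr * delta_h * xi
            self.b2 -= lr * delta_out

    def accuracy(self, data):
        hits = sum((self.forward(x)[1] >= 0.5) == bool(y) for x, y in data)
        return hits / len(data)


def sweep_epochs(epoch_counts, seed=0):
    """Train a fresh, identically initialised network for each epoch count."""
    rng = random.Random(seed)
    train, test = make_data(200, rng), make_data(100, rng)
    results = {}
    for n_epochs in epoch_counts:
        net = TinyNet(2, 4, random.Random(seed))
        for _ in range(n_epochs):
            net.train_epoch(train, lr=0.1)
        results[n_epochs] = net.accuracy(test)
    return results


results = sweep_epochs([1, 10, 50])
for n_epochs, acc in sorted(results.items()):
    print(f"epochs={n_epochs:3d}  test accuracy={acc:.2f}")
```

Re-initialising the network with the same seed for every epoch count keeps the comparison about training length alone. The same loop could be repeated over the hidden-layer size to mirror the other parameter the abstract mentions.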