Hotel booking cancellation prediction is crucial for hotels' revenue and resource management. This paper presents three alternatives to neural networks for this task: logistic regression, k-Nearest Neighbor (k-NN), and CatBoost, of which CatBoost is the most suitable for hotels. The advantages of these models are effectiveness, high accuracy, and lower cost. The dataset used in this paper was adapted from Kaggle and comprises booking data from two types of hotels in Portugal (a resort hotel and a city hotel), together with the corresponding customer information. We select key variables as predictors to train and test prediction models based on the three machine learning algorithms. After preprocessing the raw data (standardizing, handling missing data, recoding some variables, and scaling), we run the predictions and compare the models using three metrics: the confusion matrix, the accuracy score, and the F1-score. The results indicate that CatBoost performs best in predicting hotel booking cancellations, because it yields the greatest number of correctly predicted samples and the highest accuracy score. We focus on the efficiency and economy of cancellation prediction in the hospitality industry, to form a basis for future revenue and resource management for hotels.
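To make the three comparison metrics concrete, the following is a minimal sketch of how the confusion matrix counts, the accuracy score, and the F1-score can be computed for a binary cancellation label (1 = cancelled, 0 = not cancelled). The function names and the toy labels are hypothetical and chosen only for illustration; they are not taken from the paper's dataset or code.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 means 'cancelled'."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1  # cancellation correctly predicted
        elif truth == 0 and pred == 1:
            fp += 1  # predicted cancellation, booking was kept
        elif truth == 1 and pred == 0:
            fn += 1  # missed cancellation
        else:
            tn += 1  # kept booking correctly predicted
    return tp, fp, fn, tn


def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Share of all samples that were predicted correctly."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy labels for illustration only (assumed, not from the paper's data).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print("confusion (tp, fp, fn, tn):", (tp, fp, fn, tn))
print("accuracy:", accuracy(tp, fp, fn, tn))
print("F1:", f1_score(tp, fp, fn))
```

In a comparison like the one described, each model's predictions on the same test set would be passed through the same functions, so that the counts and scores are directly comparable across models.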