Online booking is now a common way for travelers to reserve hotel rooms, but a large number of cancellations caused by itinerary changes and other factors can hurt hotels considerably: a room held for a booking that is later cancelled is unavailable to customers who genuinely need that room type, and those customers may be lost to competing hotels. To reduce these losses, this paper uses booking data from two hotels published on Kaggle, identifies the factors with the greatest impact on cancellations through exploratory data analysis (EDA) and visualization, and proposes improvement measures. Machine learning algorithms are then used to predict whether a customer will cancel a booking. Because each algorithm has its own strengths, the paper compares the performance of decision trees, logistic regression, and random forests. Random forests achieve the highest accuracy, and hotel managers can use the model to predict cancellations and adjust their business strategies to increase profits.
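The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the paper's actual pipeline: it assumes scikit-learn is available, uses synthetic data from `make_classification` as a stand-in for the Kaggle booking data, and the function name `compare_models`, the feature counts, and the hyperparameters are all hypothetical choices made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def compare_models(seed=0):
    """Fit three classifiers on the same split and return test accuracy per model."""
    # Synthetic binary-classification data standing in for booking features
    # and a cancelled / not-cancelled label. This is NOT the Kaggle hotel data.
    X, y = make_classification(
        n_samples=2000, n_features=12, n_informative=6, random_state=seed
    )
    # Hold out 25% of the rows for evaluation, preserving the class balance.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    # Hyperparameters are illustrative defaults, not values from the paper.
    models = {
        "decision_tree": DecisionTreeClassifier(random_state=seed),
        "logistic_regression": LogisticRegression(max_iter=1000),
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


if __name__ == "__main__":
    for name, acc in compare_models().items():
        print(f"{name}: {acc:.3f}")
```

Evaluating every model on the same held-out split keeps the accuracy figures comparable. Which model ranks highest on synthetic data says nothing about the ranking on the real booking data.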