The film industry is one of the most popular entertainment industries and one of the largest commercial markets. A movie's success, measured by its popularity and its box office performance, is among the main contributors to this. We therefore present a comprehensive comparison of several machine learning models for predicting the success of a movie. The effectiveness of these models, along with their statistical significance, is studied to determine which is the best predictor. Some insights into the factors that affect the success of movies are also reported. The models studied include regression models, machine learning models, a time series model, and a neural network; the neural network performs best, with an accuracy of about 86%. Additionally, movies released in 2020 are analysed as part of the testing data.
Movies are among the most prominent contributors to the global entertainment industry today, and the film business is among the biggest revenue-generating industries from a commercial standpoint. It is therefore valuable to classify films into two categories: successful and unsuccessful. To classify the movies in this research, a variety of models were used, including regression models (Simple Linear, Multiple Linear, and Logistic Regression), machine learning techniques such as SVM and K-Means clustering, Time Series Analysis, and an Artificial Neural Network. These models were compared on a variety of factors, including their accuracy on the training, validation, and testing datasets, the availability of new movie characteristics, and several other statistical metrics. The study found that certain characteristics have a greater impact on the likelihood of a film's success than others. For example, the presence of the action genre may have a significant impact on the predictions, whereas another genre, such as sport, may not. The testing dataset for the models and classifiers was taken from the IMDb website for the year 2020. The Artificial Neural Network, with an accuracy of 86 percent, is the best-performing model of those discussed.
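The comparison described above, scoring each model on training, validation, and testing accuracy, can be illustrated with a minimal sketch. This is not the authors' pipeline or data. It assumes a synthetic dataset with two hypothetical genre flags (`is_action`, `is_sport`), an assumed rule that success follows the action flag 90% of the time, and only two stand-in models: a majority-class baseline and a small logistic regression fitted by gradient descent. All function names are illustrative.

```python
import math
import random


def make_synthetic_movies(n, rng):
    """Return (rows, labels); each row is [is_action, is_sport] (hypothetical flags)."""
    rows, labels = [], []
    for _ in range(n):
        is_action = rng.randint(0, 1)
        is_sport = rng.randint(0, 1)
        # Assumption: success follows the action flag 90% of the time,
        # and the sport flag carries no signal at all.
        label = is_action if rng.random() < 0.9 else 1 - is_action
        rows.append([float(is_action), float(is_sport)])
        labels.append(label)
    return rows, labels


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, epochs=300, lr=0.5):
    """Fit weights and bias by full-batch gradient descent on the mean log-loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            grad_b += err
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


def predict_logistic(model, x):
    weights, bias = model
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if sigmoid(z) >= 0.5 else 0


def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted class matches the label."""
    hits = sum(1 for x, y in zip(rows, labels) if predict(x) == y)
    return hits / len(labels)


def run_comparison(seed=0):
    rng = random.Random(seed)
    rows, labels = make_synthetic_movies(1000, rng)
    # 60/20/20 split into training, validation, and testing sets.
    splits = {
        "train": (rows[:600], labels[:600]),
        "validation": (rows[600:800], labels[600:800]),
        "test": (rows[800:], labels[800:]),
    }
    train_rows, train_labels = splits["train"]
    majority = 1 if sum(train_labels) * 2 >= len(train_labels) else 0
    model = fit_logistic(train_rows, train_labels)
    predictors = {
        "baseline": lambda x: majority,
        "logistic": lambda x: predict_logistic(model, x),
    }
    scores = {
        name: {split: accuracy(fn, *data) for split, data in splits.items()}
        for name, fn in predictors.items()
    }
    return scores, model


results, model = run_comparison()
for name, scores in results.items():
    print(name, {split: round(acc, 3) for split, acc in scores.items()})
print("weights [is_action, is_sport]:", [round(w, 2) for w in model[0]])
```

In this toy setting the fitted weight on `is_action` ends up much larger than the weight on `is_sport`, mirroring the kind of observation the abstract makes about genres. The actual magnitudes here come entirely from the assumed data-generating rule, not from the study.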