Data mining is a new approach for education. The main objective of higher education institutions is to provide quality education to their students and so improve their placement opportunities. Decision tree algorithms can be used to predict whether a student will be selected in placement. Such predictions help identify students who are at risk of dropping out and need special attention, and allow teachers to provide appropriate placement training. This paper describes how different decision tree algorithms are used to predict student performance in placement. In the first step, we gathered the details of students who graduated in the last two years from the placement cell of Dr.N.G.P Arts and Science College. In the second step, the data were preprocessed and attributes were selected for prediction. In the third step, the decision tree algorithms ID3, CHAID, and C4.5 were implemented using the RapidMiner tool. The three algorithms were validated and the accuracy of each was measured. The best algorithm on the collected placement data is ID3, with an accuracy of 95.33%.
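The attribute selection at the heart of ID3 can be illustrated with a minimal sketch. ID3 picks, at each node, the attribute with the highest information gain. The records and the attribute names (`backlogs`, `communication`, `placed`) below are invented purely for illustration; they are not the attributes or data used in the paper, and the paper's own experiments were run in RapidMiner rather than with this code.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, attribute, target):
    """Reduction in entropy of `target` after splitting `rows` on `attribute`."""
    base = entropy([r[target] for r in rows])
    total = len(rows)
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / total * entropy(subset)
    return base - remainder


def best_attribute(rows, attributes, target):
    """ID3 split criterion: the attribute with the largest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical toy records, invented for this example only.
TOY_ROWS = [
    {"backlogs": "no", "communication": "good", "placed": "yes"},
    {"backlogs": "no", "communication": "poor", "placed": "yes"},
    {"backlogs": "yes", "communication": "good", "placed": "no"},
    {"backlogs": "yes", "communication": "poor", "placed": "no"},
]

if __name__ == "__main__":
    for name in ("backlogs", "communication"):
        print(name, information_gain(TOY_ROWS, name, "placed"))
    print("split on:", best_attribute(TOY_ROWS, ["backlogs", "communication"], "placed"))
```

In this toy data `backlogs` separates the two classes perfectly while `communication` carries no information, so ID3 would split on `backlogs` first. C4.5 differs mainly in using gain ratio and handling continuous attributes, and CHAID uses chi-squared tests; neither is shown here.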
In data mining applications, very large training data sets with several million records are common. Decision trees are a powerful technique for both classification and prediction problems. Many decision tree construction algorithms have been proposed to handle large or small training data; some are best suited to large data sets and some to small ones, and each works best under its own conditions. Conventional decision tree algorithms classify categorical and continuous attributes well, but they handle only smaller data sets efficiently and take more time on large ones. Supervised Learning In Quest (SLIQ) and Scalable Parallelizable Induction of Decision Tree (SPRINT) handle very large data sets. However, SLIQ requires the class labels to be available in main memory beforehand, whereas SPRINT is best suited to large data sets and removes these memory restrictions. This research work deals with the automatic selection of a decision tree algorithm based on the size of the training data set. The proposed system first estimates the training data set size using a mathematical measure. The resulting size is then checked against the available memory space. If memory is sufficient, tree construction continues. After the data are classified, the accuracy of the classifier is estimated. The main advantages of the proposed method are that the system takes less time and avoids memory problems.
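The size check described above might look roughly like the following sketch. The abstract does not state the actual mathematical measure or the selection rule, so everything here is an assumption: the size estimate (records × attributes × bytes per value), the byte sizes, the function names, and the three-way rule that falls back to SLIQ when only the class list fits in memory and to SPRINT otherwise.

```python
def estimate_bytes(n_records, n_attributes, bytes_per_value=8):
    """Assumed size measure: one fixed-width value per record per attribute."""
    return n_records * n_attributes * bytes_per_value


def choose_algorithm(n_records, n_attributes, available_bytes,
                     bytes_per_value=8, bytes_per_label=8):
    """Pick a tree-construction strategy from the estimated training set size.

    Hypothetical rule, not taken from the paper:
      * whole training set fits in memory -> conventional in-memory algorithm
      * only the class labels fit         -> SLIQ
      * not even the class labels fit     -> SPRINT
    """
    full_size = estimate_bytes(n_records, n_attributes, bytes_per_value)
    if full_size <= available_bytes:
        return "C4.5"

    class_list_size = n_records * bytes_per_label
    if class_list_size <= available_bytes:
        return "SLIQ"

    return "SPRINT"


if __name__ == "__main__":
    # Illustrative numbers only.
    print(choose_algorithm(1_000, 10, available_bytes=1_000_000))
    print(choose_algorithm(5_000_000, 20, available_bytes=100_000_000))
    print(choose_algorithm(5_000_000, 20, available_bytes=10_000_000))
```

The rule mirrors the constraint stated in the abstract: SLIQ needs the class labels resident in main memory, while SPRINT does not. A real system would measure available memory at run time rather than take it as a parameter.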