Background: The growth of internet data has made accurate data extraction a higher priority. Accuracy here means how closely the retrieved data matches what the user requested. The large data sets that must be analyzed also make retrieving the required information a challenging task. Objective: To propose a new algorithm that improves on traditional methods for classifying the category or group to which each training sentence belongs. Method: The category to which an input sentence belongs is identified by analyzing the noun and verb of each training sentence. NLP is applied to each training sentence, and the group or category classification is performed with the proposed GENI algorithm, so that the classifier is trained efficiently to extract the user-requested information. Results: The input sentences are transformed into a data table by applying the GENI algorithm for group categorization. When the results are plotted in R, the accuracy of the groups extracted by the classifier using the GENI approach is higher than that of Naive Bayes and Decision Trees. Conclusion: Extracting user-requested data remains a challenging task when the user query is complex. Existing techniques rely more on fixed attributes, and working from fixed attributes makes it too complex or impossible to determine the common group from the base sentence. Existing techniques are better suited to smaller data sets, whereas the proposed GENI algorithm places no restrictions on the group categorization of larger data sets.
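The abstract does not spell out the steps of GENI, so the following is not the GENI algorithm. It is only a minimal sketch of the general idea stated in the Method section: pull a noun and a verb out of each training sentence, lay them out as table rows, and group sentences that share the same pair. The lexicon `TOY_LEXICON` and the functions `noun_verb_key`, `build_table`, and `group_sentences` are hypothetical names introduced for illustration, and the hand-written lexicon stands in for a real part-of-speech tagger.

```python
from collections import defaultdict

# Hypothetical toy lexicon standing in for a real part-of-speech tagger.
TOY_LEXICON = {
    "doctor": "NOUN", "patient": "NOUN", "teacher": "NOUN", "student": "NOUN",
    "treats": "VERB", "examines": "VERB", "teaches": "VERB", "grades": "VERB",
}


def noun_verb_key(sentence):
    """Return (first noun, first verb) in the sentence; a part is None if not found."""
    tokens = [t.strip(".,!?").lower() for t in sentence.split()]
    noun = next((t for t in tokens if TOY_LEXICON.get(t) == "NOUN"), None)
    verb = next((t for t in tokens if TOY_LEXICON.get(t) == "VERB"), None)
    return noun, verb


def build_table(sentences):
    """Turn sentences into rows of (sentence, noun, verb), a simple data table."""
    return [(s, *noun_verb_key(s)) for s in sentences]


def group_sentences(sentences):
    """Group sentences that share the same (noun, verb) pair."""
    groups = defaultdict(list)
    for sentence, noun, verb in build_table(sentences):
        groups[(noun, verb)].append(sentence)
    return dict(groups)


if __name__ == "__main__":
    training = [
        "The doctor treats the patient.",
        "A doctor treats many people.",
        "The teacher grades the student.",
    ]
    for key, members in group_sentences(training).items():
        print(key, members)
```

In practice the lexicon lookup would be replaced by a trained tagger, and the grouping rule would follow whatever the full paper defines; the sketch only shows the shape of the noun-and-verb table.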