Lot of work has already been done for automatic text summarization. In this paper we have given a novel statistical approach to summarize the given Odia text. In our approach extraction of relevant sentences is done which can give the actual concept of the input document in a concise form. We rank each sentence in the document by assigning a weight value to each word of the sentence. The sentences are extracted as per their rank, which will lead to a good summary of the given text.
The task of retrieving the theme of a document and presenting a shorter form compared to the original text to the user is a challenging assignment. In this article, a hybrid approach to extract knowledge from a text document is presented, in which three key sentence level relationships in association with the Markov clustering algorithm is used to cluster sentences in the document. After clustering, sentences are ranked in each cluster and the highest ranked sentences in each cluster are merged. In the end, to get the final theme of the document, the Gradient boosting technique XGboost is used to compress the newly generated sentence. The DUC-2002 data set is used to evaluate the proposed system and it has been observed that the performance of the proposed system is better than other existing systems.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.