Abstract: Automatic detection of linguistic negation in free text is a pressing need for many text processing applications, including sentiment analysis. Our system uses online news archives from two sources, NDTV and The Hindu. In dealing with news articles, we performed three subtasks: identifying the target; separating good and bad news content from the good and bad sentiment expressed about the target; and analyzing clearly marked opinion that is expressed explicitly, without needing interpretation or world knowledge. In this paper, our main focus is on evaluating and comparing three sentiment analysis methods (two machine learning based and one lexical based) and on identifying the scope of negation in news articles about two political parties, BJP and UPA, using three existing methodologies: Rest of the Sentence (RoS), Fixed Window Length (FWL) and Dependency Analysis (DA). Among the sentiment methods, SVM achieved the best F-measure, with values of 0.688 and 0.657 for BJP and UPA respectively. For negation scope, the F-measures for RoS, FWL and DA were 0.58, 0.69 and 0.75 respectively, so DA performed better than the other two. Of the 1,675 sentences in the corpus, annotator I labeled 1,137 as positive and 538 as negative, whereas annotator II labeled 1,130 as positive and 545 as negative. We also assigned a score to each sentence and calculated accuracy on the basis of the average score of both annotators.
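As a rough illustration of the two simpler scope heuristics named in this abstract, the following Python sketch marks which tokens fall inside a negation scope under Rest of the Sentence (RoS) and Fixed Window Length (FWL). The cue list, the tokenizer, the default window size and all function names are assumptions made for illustration; they are not taken from the paper. Dependency Analysis is left out because it requires a syntactic parser.

```python
import re

# Hypothetical cue list; the paper's actual negation cues are not given here.
NEGATION_CUES = {"not", "no", "never", "cannot", "without", "neither", "nor"}
SENTENCE_END = {".", "!", "?"}


def tokenize(text):
    """Lowercase the text and keep words plus sentence-final punctuation."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?|[.!?]", text.lower())


def is_cue(token):
    """A token is a cue if it is in the list or is a contraction like "didn't"."""
    return token in NEGATION_CUES or token.endswith("n't")


def rest_of_sentence_scope(tokens):
    """RoS: every token after a cue, up to the end of the sentence, is in scope."""
    scope = [False] * len(tokens)
    active = False
    for i, tok in enumerate(tokens):
        if tok in SENTENCE_END:
            active = False
        elif active:
            scope[i] = True
        if is_cue(tok):
            active = True
    return scope


def fixed_window_scope(tokens, window=3):
    """FWL: only the next `window` tokens after a cue are in scope."""
    scope = [False] * len(tokens)
    remaining = 0
    for i, tok in enumerate(tokens):
        if tok in SENTENCE_END:
            remaining = 0
        elif remaining > 0:
            scope[i] = True
            remaining -= 1
        if is_cue(tok):
            remaining = window
    return scope


def scoped_words(tokens, scope):
    """Return the tokens flagged as lying inside a negation scope."""
    return [tok for tok, flag in zip(tokens, scope) if flag]


if __name__ == "__main__":
    toks = tokenize("The party did not deliver good governance this year.")
    print("RoS:", scoped_words(toks, rest_of_sentence_scope(toks)))
    print("FWL:", scoped_words(toks, fixed_window_scope(toks, window=2)))
```

In this sketch RoS tends to over-extend the scope to the end of the sentence, while FWL cuts it off after a fixed number of tokens; that trade-off is one plausible reading of why a syntax-aware method such as DA scored higher.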
The traditional Web stores a huge amount of data in the form of Relational Databases (RDB) as it is good at
Automated essay grading or scoring systems are no longer a myth; they are a reality. Today, human-written (not handwritten) essays are corrected not only by examiners and teachers but also by machines. The TOEFL exam is one of the best examples of this application: students' essays are evaluated both by a human and by a web-based automated essay grading system, and then the average is taken. Many researchers consider essays the most useful tool for assessing learning outcomes, since they imply the ability to recall, organize and integrate ideas, and the ability to supply, rather than merely identify, interpretation and application of data. Automated Writing Evaluation Systems, also known as Automated Essay Assessors, might provide precisely the platform we need to explicate many of the features that characterize good and bad writing and many of the linguistic, cognitive and other skills that underlie the human capability for both reading and writing. They can also provide feedback from time to time, which writers and students can use to improve their writing skills. Careful research over the last couple of years has helped us understand the existing systems, which are based on AI and machine learning techniques and NLP (Natural Language Processing) techniques, find their loopholes, and finally propose a system that will work in the Indian context, at present for English influenced by local languages. Currently most essay grading systems are used for grading pure English essays or essays written in pure European languages. In India we have almost 21 recognized languages, and the influence of these local languages on English is very much present. Newspapers in Hyderabad sometimes print sentences like "Now the time has come to say 'albida' (goodbye) to monsoon". Due to the influence of local languages on English written by non-native English speakers (i.e. Indians), TOEFL results have shown lower scores for Indian students (and also Asian students).
This paper focuses on the existing automated essay grading systems and the basic technologies behind them, and proposes a new framework to overcome the problems caused by the influence of local Indian languages in English essays during correction, while providing proper feedback to the writers.
Text summarization is the process of distilling the most important information from a source to produce an abridged version for a particular user and task. When this is done by a computer, i.e. automatically, it is called Automatic Text Summarization. Summarization can be classified into two approaches: extraction and abstraction. Extraction-based summaries are produced by concatenating several sentences taken exactly as they appear in the texts being summarized. Abstraction-based summaries are written to convey the main information in the input and may reuse phrases or clauses from it. This paper focuses on the extraction approach, whose goal is sentence selection. One method of obtaining the sentences is to rank them by assigning feature terms to each sentence and then select the best ones. The first step in summarization by extraction is the identification of important features. In our approach, 1000 computer science research papers are used as test documents. Each document is prepared by preprocessing: sentence segmentation, tokenization, stop word removal, case folding, lemmatization, and stemming. Then important features, sentence filtering features and data compression features are applied, and finally a score is calculated for each sentence. The proposed text summarization uses an HMM tagger to improve the quality of the summary. We compare our results with existing summarizers, including Copernicus Summarizer, Great Summarizer and the Microsoft Word 2007 summarizer. The proposed system is also tested with four similarity measures: Cosine, Jaccard, Jaro-Winkler and Sørensen. The results show that the best summary quality was obtained by the feature terms method.
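To make the similarity-based evaluation more concrete, here is a minimal Python sketch of three of the four measures mentioned in this abstract (Cosine, Jaccard and Sørensen), applied to a generated summary and a reference summary. The whitespace tokenizer, the function names and the example strings are assumptions for illustration only; the abstract does not describe the paper's actual evaluation code. Jaro-Winkler is omitted because it is a character-level edit measure that needs a longer implementation.

```python
import math
from collections import Counter


def tokens(text):
    """Very simple tokenizer (an assumption): lowercase and split on whitespace."""
    return text.lower().split()


def cosine_similarity(a, b):
    """Cosine of the angle between the term-frequency vectors of a and b."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard_similarity(a, b):
    """Size of the shared vocabulary divided by the size of the combined vocabulary."""
    sa, sb = set(tokens(a)), set(tokens(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def sorensen_similarity(a, b):
    """Sørensen (Dice) coefficient: twice the overlap over the sum of set sizes."""
    sa, sb = set(tokens(a)), set(tokens(b))
    total = len(sa) + len(sb)
    return 2 * len(sa & sb) / total if total else 0.0


if __name__ == "__main__":
    reference = "text summarization selects important sentences"
    candidate = "summarization selects sentences"
    for measure in (cosine_similarity, jaccard_similarity, sorensen_similarity):
        print(measure.__name__, round(measure(reference, candidate), 3))
```

All three functions return a value between 0 and 1, where higher values mean the candidate summary shares more vocabulary with the reference.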