Automatic identification of light stop words for Persian information retrieval systems

Sadeghi, Maryam; Vegas, Jesús

doi:10.1177/0165551514530655

Cited by 16 publications

(12 citation statements)

References 10 publications

“…By assessing the results for EElectronics and PElectronics, it can be found that the results for EElectronics are slightly better. Using a complex script is the main challenge in Persian writing [21,51,52]. For example, one of the issues in Persian text mining is the wide variety of declensional suffixes.…”

Section: Quantitative Analysis (mentioning)

confidence: 99%

Integrating word status for joint detection of sentiment and aspect in reviews

Bagheri

2018

Journal of Information Science

A crucial task in sentiment analysis is aspect detection: the step of selecting the aspects on which opinions are expressed. This step anticipates the step of determining whether the opinions on aspects are positive or negative. This article proposes a novel probabilistic generative topic model for aspect-based sentiment analysis which is able to discover the latent structure of a large collection of review documents. The proposed joint sentiment-aspect detection model (SAM) is a generative topic model that incorporates the structure of review sentences for detecting aspects and sentiments simultaneously. The intuitions behind the SAM are that from generating documents by latent single- and multi-word topics, modelling the word distribution for each topic and learning of the prior distribution over topics in sentences of documents. SAM introduces word status so that the model can decide when to sample from a bigram distribution or a unigram distribution and integrates all these components into one combined model for aspect-based sentiment analysis. We evaluate SAM both qualitatively and quantitatively to show that the model is indeed able to perform the task effectively and improves significantly over standard joint sentiment-aspect models. The proposed model can easily be transformed between domains or languages and can detect the polarity of text data at various levels. However, for the quantitative analysis, we mainly focus on presenting the results for the document-level sentiment classification.

“…Stop lists have been generated for other languages, such as Chinese (Zou et al, 2006), Thai (Daowadung and Chen, 2012) and Farsi (Sadeghi and Vegas, 2014), using similar frequency threshold approaches, and are susceptible to the same issues discussed here.…”

Section: Introduction (mentioning)

confidence: 99%

Proceedings of Workshop for NLP Open Source Software (NLP-OSS)

Park, Hagiwara, Milajevs et al.

2018

Open-source software makes sophisticated technologies available to a wide audience. Arguably, most people applying language processing and machine learning techniques rely on popular open-source tools targeted at these applications. Users may themselves be incapable of implementing the underlying algorithms. Users may or may not have extensive training to critically conduct experiments with these tools. As maintainers of popular scientific software, we should be aware of our user base, and consider the ways in which our software design and documentation can lead or mislead users with respect to scientific best practices. In this talk, I will present some examples of these risks, primarily drawn from my experience developing Scikit-learn. For example: How can we help users avoid data leakage in cross-validation? How can we help users report precisely which algorithm or metric was used in an experiment? Volunteer OSS maintainers have limited ability to see and manage these risks, and need the scientific community's assistance to get things right in design, implementation and documentation. Biography: Joel Nothman began contributing to the Scientific Python ecosystem of open-source software as a research student at the University of Sydney in 2008. He has since made substantial contributions to the NLTK, Scipy, Pandas and IPython packages among others, but presently puts most

“…In order to better improve and use the system, we add some auxiliary equipment and materials to the online management and training environment, where efficient intelligence supporting is one of the main characteristics. There are already many systems supporting various types of intelligence, including attendance management [16], automatic identification ID card information [17], and automatic mark examination papers [18]. However, to separate the various functions will affect the management efficiency, and increasing the costs of learning.…”

Section: Introduction (mentioning)

confidence: 99%

Construction and evaluation of PHP‐based management and training system for electrical power laboratory

et al. 2016

Comp Applic In Engineering

ABSTRACT: The online management and training of information for laboratories has rarely been discussed in the traditional education system design. However, it plays a vital role in informatization and automation of electrical power laboratory. Using PHP: Hypertext Preprocessor (PHP), we have designed and developed an online management and training environment targeted for electrical power laboratory. In this article, the Model-View-Controller (MVC) architecture cooperative designed for the management and training system modules was provided, and key steps of developing such modules using PHP were described. A questionnaire survey was conducted for students to use the system, and their attitudes toward the system were calculated and analyzed. Results showed that management and training system for electrical power laboratory can lead to preferable learning interest and outcome. © 2016 Wiley Periodicals, Inc. Comput Appl Eng Educ 24:371-381, 2016; View this article online at wileyonlinelibrary.com/journal/cae.
