Testing a Genre-Enabled Application: A Preliminary Assessment

Santini, Marina; Rosso, Mark

doi:10.14236/ewic/fdia2008.7

Cited by 3 publications

(1 citation statement)

References 17 publications

“…Document classification, as a challenging task in the field of Natural Language Processing (NLP), typically assigns one or more class labels to a document according to its subject or other attributes, e.g., author and topic. Generally, it has a broad application in the area of sentiment classification [1,2], document ranking [3], genre classification [4] and topic labeling [5], etc.…”

Section: Introduction (mentioning)

confidence: 99%

Self-Interaction Attention Mechanism-Based Text Representation for Document Classification

et al. 2018

Document classification has a broad application in the field of sentiment classification, document ranking and topic labeling, etc. Previous neural network-based work has mainly focused on investigating a so-called forward implication, i.e., the preceding text segments are taken as the context of the following text segments when generating the text representation. Such a scenario typically ignores the fact that the semantics of a document are a product of the mutual implication of all text segments in a document. Thus, in this paper, we introduce a concept of interaction and propose a text representation model with Self-interaction Attention Mechanism (TextSAM) for document classification. In particular, we design three aggregated strategies to integrate the interaction into a hierarchical architecture for document classification, i.e., averaging the interaction, maximizing the interaction and adding one more attention layer on the interaction, which leads to three models, i.e., TextSAM_AVE, TextSAM_MAX and TextSAM_ATT, respectively. Our comprehensive experimental results on two public datasets, i.e., Yelp 2016 and Amazon Reviews (Electronics), show that our proposals can significantly outperform the state-of-the-art neural-based baselines for document classification, presenting a general improvement in terms of accuracy ranging from 5.97% to 14.05% against the best baseline. Furthermore, we find that our proposals with a self-interaction attention mechanism can obviously alleviate the impact brought by the increase of sentence number as the relative improvement of our proposals against the baselines are enlarged when the sentence number increases.

Cross-Testing a Genre Classification Model for the Web

Santini

2010

Text, Speech and Language Technology

Classification Accuracy by Deviation-based Classification Method with the Number of Training Documents

Lee

2014

Journal of Digital Convergence

This paper presents a novel approach at the intersection of machine learning and number theory, focusing on the classification of prime and non-prime numbers. At the core of our research is the development of a highly sparse encoding method, integrated with conventional neural network architectures. This combination has shown promising results, achieving a recall of over 99% in identifying prime numbers and 79% for non-prime numbers from an inherently imbalanced sequential series of integers, while exhibiting rapid model convergence before the completion of a single training epoch. An interesting aspect of our findings includes the analysis of false positives, where we found a consistent misclassification of certain non-prime numbers, particularly semi-primes. We performed training using 10^6 integers starting from a specified integer and tested on a different range of 2 × 10^6 integers extending from 10^6 to 3 × 10^6, offset by the same starting integer. While constrained by the memory capacity of our resources, which limited our analysis to a span of 3 × 10^6, we believe that our study contributes to the application of machine learning in prime number analysis. This work aims to demonstrate the potential of such applications and hopes to inspire further exploration and possibilities in diverse fields.
