2018 IEEE Applied Imagery Pattern Recognition Workshop (AIPR) 2018
DOI: 10.1109/aipr.2018.8707407

End-to-End Text Classification via Image-based Embedding using Character-level Networks

Abstract: For analysing and/or understanding languages having no word boundaries based on morphological analysis such as Japanese, Chinese, and Thai, it is desirable to perform appropriate word segmentation before word embeddings. But it is inherently difficult in these languages. In recent years, various language models based on deep learning have made remarkable progress, and some of these methodologies utilizing character-level features have successfully avoided such a difficult problem. However, when a model is fed c…

Cited by 2 publications (5 citation statements)
References 16 publications

“…We choose two classifiers for our framework. A character CNN (CLCNN) similar to Kitada et al (2018), but tuned to Arabic language, and a bidirectional gated recurrent unit (BiGRU) (Chung et al, 2014) based classifier. The outline of our framework is shown in Figure 2.…”
Section: Methods (mentioning)
confidence: 99%
“…To solve these problems, character-based approaches utilizing deep learning methods mainly used in image processing have been proposed (Zhang et al, 2015; Shimada et al, 2016; Kitada et al, 2018). Zhang et al (2015) introduced a character-level CNN (CLCNN) that treats text as a raw signal at character level.…”
Section: Introduction (mentioning)
confidence: 99%
“…Embedding methods based on character images have been proposed with some excellent success (Chen et al, 2015; Sun et al, 2016; Yu et al, 2017; Sun et al, 2019; Dai and Cai, 2017; Shimada et al, 2016; Kitada et al, 2018; Ke and Hagiwara, 2017; Aldón Mínguez et al, 2016). These methods are also called glyph-aware embedding as they generate embeddings that take into account the shape of the characters or subcharacters.…”
Section: Glyph-aware Natural Language Processing (mentioning)
confidence: 99%
“…For example, the following Japanese characters have a common form of "辶," which is a sub-character meaning of the related word road: "迫" (approach: come near the destination by road) and "追" (follow: track the road). In consideration of these characteristics of the language, several glyph-aware natural language processing (NLP) models have been proposed (Shimada et al, 2016;Kitada et al, 2018;Sun et al, 2019). These deep-learning-based models train input text as a sequence of character images and learn character embeddings from the images.…”
Section: Introduction (mentioning)
confidence: 99%