2019 International Conference on Document Analysis and Recognition (ICDAR) 2019
DOI: 10.1109/icdar.2019.00191

Multi-modal Attention Network for Handwritten Mathematical Expression Recognition

Abstract: In this paper, we propose a novel stroke constrained attention network (SCAN) which treats stroke as the basic unit for encoder-decoder based online handwritten mathematical expression recognition (HMER). Unlike previous methods which use trace points or image pixels as basic units, SCAN makes full use of stroke-level information for better alignment and representation. The proposed SCAN can be adopted in both single-modal (online or offline) and multi-modal HMER. For single-modal HMER, SCAN first employs a CN…
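
The abstract is cut off before it explains how stroke-level information is obtained, so the following is only an illustrative sketch of what "treating the stroke as the basic unit" could look like, not the authors' implementation. It assumes that each trace point already has a feature vector and a stroke index, and that stroke-level features are formed by mean pooling over the points of each stroke. The function name `pool_points_to_strokes` and the toy data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def pool_points_to_strokes(
    point_features: Sequence[Sequence[float]],
    stroke_ids: Sequence[int],
) -> Dict[int, List[float]]:
    """Average per-point feature vectors over the points of each stroke.

    Assumption (not from the abstract): mean pooling is used to turn
    point-level features into one feature vector per stroke.
    """
    if len(point_features) != len(stroke_ids):
        raise ValueError("need exactly one stroke id per trace point")

    # Collect the feature vectors that belong to each stroke.
    grouped = defaultdict(list)
    for feat, sid in zip(point_features, stroke_ids):
        grouped[sid].append(feat)

    # Average each feature dimension within a stroke.
    return {
        sid: [sum(col) / len(feats) for col in zip(*feats)]
        for sid, feats in grouped.items()
    }


# Toy input: five trace points with 2-D features, drawn in two strokes.
features = [[1.0, 0.0], [3.0, 2.0], [0.0, 4.0], [2.0, 4.0], [4.0, 4.0]]
ids = [0, 0, 1, 1, 1]
stroke_features = pool_points_to_strokes(features, ids)
print(stroke_features)  # {0: [2.0, 1.0], 1: [2.0, 4.0]}
```

With this representation, an attention decoder would attend over one vector per stroke (two here) instead of one per trace point (five here), which is the kind of coarser alignment unit the abstract contrasts with point- or pixel-level methods.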

Cited by 38 publications (14 citation statements)
References 51 publications
“…Systems I to VII were participating systems except for system III because of using private training data. We do not present results from [21], [22] and [23] because their models were trained on handwriting trajectory data. A trajectory provides handwriting order information, which is useful for distinguishing visually similar symbols (e.g., "α" and "a").…”
Section: E Comparison With the Proposed Methods (mentioning)
confidence: 99%
“…Hong (Hong et al 2019) employed residual connection in BiRNN to improve feature extraction. Besides, multi-modal learning was also introduced into HMER with both online and offline information for better encoding (Wang et al 2019) and decoding (Wang et al 2021). There are also approaches which apply data augmentation (Le and Nakagawa 2017; Le, Indurkhya, and Nakagawa 2019; Li et al 2020).…”
Section: Latex Modeling (mentioning)
confidence: 99%
“…Most HMER methods extensively adopt the sequence-to-sequence approach. The authors in [8,15,16,23,27,29,39,40,40,43,46] proposed an attention-based sequence-to-sequence model to convert the handwritten mathematical expression images into representational markup language LaTeX. Recently, Wu et al [31] designed a graph-to-graph (G2G) model that explores the HMEs structural relationship of the input formula and output markup, which significantly improve the performance.…”
Section: Related Work (mentioning)
confidence: 99%