Multi-modal Attention Network for Handwritten Mathematical Expression Recognition

Wang, Jiaming; Du, Jun; Zhang, Jianshu; Wang, Zi-Rui

doi:10.1109/icdar.2019.00191

Cited by 38 publications

(14 citation statements)

References 51 publications


“…Systems I to VII were participating systems except for system III because of using private training data. We do not present results from [21], [22] and [23] because their models were trained on handwriting trajectory data. A trajectory provides handwriting order information, which is useful for distinguishing visually similar symbols (e.g., "α" and "a").…”

Section: E Comparison With the Proposed Methods (mentioning)

confidence: 99%

Improving Attention-Based Handwritten Mathematical Expression Recognition with Scale Augmentation and Drop Attention

Jin, Lai, et al. 2020

Preprint


Handwritten mathematical expression recognition (HMER) is an important research direction in handwriting recognition. The performance of HMER suffers from the two-dimensional structure of mathematical expressions (MEs). To address this issue, in this paper, we propose a high-performance HMER model with scale augmentation and drop attention. Specifically, tackling ME with unstable scale in both horizontal and vertical directions, scale augmentation improves the performance of the model on MEs of various scales. An attention-based encoder-decoder network is used for extracting features and generating predictions. In addition, drop attention is proposed to further improve performance when the attention distribution of the decoder is not precise. Compared with previous methods, our method achieves state-of-the-art performance on two public datasets of CROHME 2014 and CROHME 2016.


“…Hong (Hong et al 2019) employed residual connection in BiRNN to improve feature extraction. Besides, multi-modal learning was also introduced into HMER with both online and offline information for better encoding (Wang et al 2019) and decoding (Wang et al 2021). There are also approaches which apply data augmentation (Le and Nakagawa 2017; Le, Indurkhya, and Nakagawa 2019; Li et al 2020).…”

Section: Latex Modeling (mentioning)

confidence: 99%

TDv2: A Novel Tree-Structured Decoder for Offline Mathematical Expression Recognition

et al. 2022

AAAI

Self Cite


In recent years, tree decoders have become more popular than LaTeX string decoders in the field of handwritten mathematical expression recognition (HMER) as they can capture the hierarchical tree structure of mathematical expressions. However, previous tree decoders converted the tree structure labels into a fixed and ordered sequence, which could not make full use of the diversified expression of tree labels. In this study, we propose a novel tree decoder (TDv2) to fully utilize the tree structure labels. Compared with previous tree decoders, this new model does not require a fixed priority for different branches of a node during training and inference, which can effectively improve the model generalization capability. The input and output of the model make full use of the tree structure label, so that there is no need to find the parent node in the decoding process, which simplifies the decoding process and adds prior information to help predict the node. We verified the effectiveness of each part of the model through comprehensive ablation experiments and attention visualization analysis. On the authoritative CROHME 14/16/19 datasets, our method achieves state-of-the-art results.


“…Most HMER methods extensively adopt the sequence-to-sequence approach. The authors in [8,15,16,23,27,29,39,40,40,43,46] proposed an attention-based sequence-to-sequence model to convert the handwritten mathematical expression images into representational markup language LaTeX. Recently, Wu et al [31] designed a graph-to-graph (G2G) model that explores the HMEs structural relationship of the input formula and output markup, which significantly improve the performance.…”

Section: Related Work (mentioning)

confidence: 99%

Syntax-Aware Network for Handwritten Mathematical Expression Recognition

Yuan, Liu, Dikubab, et al. 2022

Preprint


Handwritten mathematical expression recognition (HMER) is a challenging task that has many potential applications. Recent methods for HMER have achieved outstanding performance with an encoder-decoder architecture. However, these methods adhere to the paradigm that the prediction is made "from one character to another", which inevitably yields prediction errors due to the complicated structures of mathematical expressions or crabbed handwritings. In this paper, we propose a simple and efficient method for HMER, which is the first to incorporate syntax information into an encoder-decoder network. Specifically, we present a set of grammar rules for converting the LaTeX markup sequence of each expression into a parsing tree; then, we model the markup sequence prediction as a tree traverse process with a deep neural network. In this way, the proposed method can effectively describe the syntax context of expressions, alleviating the structure prediction errors of HMER. Experiments on three benchmark datasets demonstrate that our method achieves better recognition performance than prior arts. To further validate the effectiveness of our method, we create a large-scale dataset consisting of 100k handwritten mathematical expression images acquired from ten thousand writers. The source code, new dataset, and pre-trained models of this work will be publicly available.
