Mingjie Zhan scite author profile

Mingjie Zhan

2 Publications

30 Citation Statements Received

28 Citation Statements Given

Publications

DocStruct: A Multimodal Method to Extract Hierarchy Structure in Document for General Form Understanding

Wang, Zhan, Liu et al. 2020

Form understanding depends on both textual contents and organizational structure. Although modern OCR performs well, it is still challenging to realize general form understanding because forms are commonly used and of various formats. The table detection and handcrafted features in previous works cannot apply to all forms because of their requirements on formats. Therefore, we concentrate on the most elementary components, the key-value pairs, and adopt multimodal methods to extract features. We consider the form structure as a tree-like or graph-like hierarchy of text fragments. The parent-child relation corresponds to the key-value pairs in forms. We utilize the state-of-the-art models and design targeted extraction modules to extract multimodal features from semantic contents, layout information, and visual images. A hybrid fusion method of concatenation and feature shifting is designed to fuse the heterogeneous features and provide an informative joint representation. We adopt an asymmetric algorithm and negative sampling in our model as well. We validate our method on two benchmarks, MedForm and FUNSD, and extensive experiments demonstrate the effectiveness of our method.

DocStruct: A Multimodal Method to Extract Hierarchy Structure in Document for General Form Understanding

Wang, Zhan, Liu et al. 2020

Preprint


