Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: System Demonstrations 2023
DOI: 10.18653/v1/2023.emnlp-demo.45

PaperMage: A Unified Toolkit for Processing, Representing, and Manipulating Visually-Rich Scientific Documents

Kyle Lo, Zejiang Shen, Benjamin Newman, et al.

Abstract: Despite growing interest in applying natural language processing (NLP) and computer vision (CV) models to the scholarly domain, scientific documents remain challenging to work with. They're often in difficult-to-use PDF formats, and the ecosystem of models to process them is fragmented and incomplete. We introduce papermage, an open-source Python toolkit for analyzing and processing visually-rich, structured scientific documents. papermage offers clean and intuitive abstractions for seamlessly representing and …

Cited by 4 publications
References 27 publications