2022
DOI: 10.32604/cmc.2022.025711

LAME: Layout-Aware Metadata Extraction Approach for Research Articles

Abstract: The volume of academic literature, such as academic conference papers and journals, has increased rapidly worldwide, and research on metadata extraction is ongoing. However, high-performing metadata extraction is still challenging due to diverse layout formats according to journal publishers. To accommodate the diversity of the layouts of academic journals, we propose a novel LAyout-aware Metadata Extraction (LAME) framework equipped with the three characteristics (e.g., design of automatic layout analysis, co…

Citations: cited by 3 publications (4 citation statements)
References: 20 publications
“…Recently, a research using the BERT model for automatic metadata extraction from Korean papers have been conducted. The authors of [7] proposed a metadata extraction method using Layout-MetaBERT, which was developed by pre-training the BERT model with metadata layout information of papers. However, in this research, training data from 70 types of Korean academic journals were used, which is a limited amount to cover overall journals.…”
Section: Related Work (mentioning)
confidence: 99%
“…However, in this research, training data from 70 types of Korean academic journals were used, which is a limited amount to cover overall journals. Unlike the work of [7], we constructed the training data from approximately 500 types of Korean academic journals, so that a new model using our data has higher coverage that can extract metadata from Korean papers in more various formats of journal. In this paper, we developed new models, KorSciBERT-ME-J and KorSciBERT-ME-J+C, by applying the concept of learning auxiliary sentences referring to [27].…”
Section: Related Work (mentioning)
confidence: 99%