2020
DOI: 10.23974/ijol.2020.vol5.1.158

Using NLP to Generate MARC Summary Fields for Notre Dame's Catholic Pamphlets

Abstract: Three NLP (Natural Language Processing) automated summarization techniques were tested on a special collection of Catholic Pamphlets acquired by Hesburgh Libraries. The automated summaries were generated after feeding the pamphlets as .pdf files into an OCR pipeline. Extensive data cleaning and text preprocessing were necessary before the computer summarization algorithms could be launched. Using the standard ROUGE F1 scoring technique, the Bert Extractive Summarizer technique had the best summarization score.…
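
For readers unfamiliar with the ROUGE F1 score mentioned in the abstract, the following is a minimal sketch of a ROUGE-1 (unigram overlap) F1 computation. It is an illustration only: the abstract does not say which ROUGE variant or implementation the authors used, and the function name `rouge1_f1`, the lowercase whitespace tokenization, and the absence of stemming are all assumptions made here for simplicity.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap between two texts.

    Assumption: tokens are lowercase whitespace-separated words, with no
    stemming or stopword handling, unlike full ROUGE toolkits.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each word counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)
```

Under this sketch, a machine summary would be passed as `candidate` and a human-written summary as `reference`; a higher score means more shared vocabulary, not necessarily a better summary.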

Cited by 2 publications (2 citation statements)
References 10 publications
“…This project tried to automatically generate a summary for each digitized pamphlet by using NLP's BERT Extractive technique and Gensim python package. 10 Other applications of ML techniques in the academic library include analyzing library operations such as acquisition. In 2019, Kevin W. Walker and Zhehan Jiang from the University of Alabama used a machine learning method called adaptive boosting (AdaBoost) to predict demand-driven acquisition (DDA).…”
Section: Literature Review (mentioning)
confidence: 99%
“…Additionally, Zeng et al (2014) have experimented with using OpenCalais, a semantic analysis tool, to automatically extract and create access points for archival records as a response to the inability of traditional resource description methods to catch up with the growing body of information resources. Other experiments with AI-generated metadata have gone beyond either extraction or the assignment of keywords, and have used AI to craft entire descriptive summaries for digitized items (Flannery, 2020).…”
Section: AI for Resource Description and Discovery (mentioning)
confidence: 99%