2023
DOI: 10.14429/djlit.43.01.18619

Automated Knowledge Organization AI ML based Subject Indexing System for Libraries

Abstract: This study explores the possibilities of an AI/ML-based semi-automated indexing system for handling large volumes of documents in a library setting. It uses a Python virtual environment to install and configure an open-source AI environment, Annif, and feeds it the Linked Open Data (LOD) dataset of the Library of Congress Subject Headings (LCSH) as a standard Knowledge Organization System (KOS). The framework deployed the Turtle format of LCSH after cleaning the file with Sko…
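The abstract is truncated before it names a backend, project, or vocabulary identifier, so the following is only a minimal sketch of what configuring such an Annif project might look like. It writes an INI-style project definition of the kind Annif reads from a `projects.cfg` file. The project ID `lcsh-tfidf-en`, the vocabulary ID `lcsh`, and the choice of the `tfidf` backend are all assumptions made for illustration, not details taken from the paper.

```python
import configparser
import io

# Hypothetical identifiers: the abstract does not name the project, the
# vocabulary ID, or the backend, so every value below is an assumption.
PROJECT_ID = "lcsh-tfidf-en"
VOCAB_ID = "lcsh"


def build_projects_cfg() -> str:
    """Return the text of a minimal Annif-style projects.cfg section."""
    config = configparser.ConfigParser()
    config[PROJECT_ID] = {
        "name": "LCSH TF-IDF English",
        "language": "en",
        "backend": "tfidf",
        "analyzer": "snowball(english)",
        "vocab": VOCAB_ID,
        "limit": "100",
    }
    buffer = io.StringIO()
    # Write key=value pairs without spaces around the delimiter.
    config.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


cfg_text = build_projects_cfg()
print(cfg_text)
```

With a configuration like this in place, the cleaned LCSH Turtle file would typically be loaded as the vocabulary and the project trained through Annif's command-line tool. The exact commands depend on the Annif version and are not given in the visible part of the abstract.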

Cited by 5 publications
(1 citation statement)
“…This tool was chosen mainly for the following reasons: (1) it's proof-tested for quality and robustness by previous works (see, e.g., Suominen et al., 2022; Ahmed et al., 2023; Inkinen et al., 2023), (2) it's used by various public libraries in Finland, Germany and Sweden, (3) it's open source and trainable with own data, (4) it comes pretrained for Finnish, English and Swedish, and (5) it uses generic YSO ontology suitable for all types of texts. Annif's framework comprises a lexical subject indexing algorithm for finding correlations between subjects in the vocabulary and words in documents, a text classification algorithm, and a general-purpose machine learning algorithm.…”
[Figure caption: The overview diagram of our data processing and computational framework starting from data and ending with visualizations.]
Section: Computation Of Topics (mentioning)
confidence: 99%
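The excerpt above mentions a lexical subject indexing algorithm that relates vocabulary subjects to the words in a document. The toy sketch below illustrates that general idea only; it is not Annif's algorithm. The vocabulary URIs and labels, the function names, and the scoring rule (relative frequency of the rarest label word) are all hypothetical choices made for illustration.

```python
import re
from collections import Counter

# Hypothetical mini-vocabulary: subject URI -> preferred label.
# These URIs and labels are illustrative, not real YSO or LCSH entries.
VOCAB = {
    "http://example.org/subject/1": "machine learning",
    "http://example.org/subject/2": "libraries",
    "http://example.org/subject/3": "subject indexing",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def suggest_subjects(text, vocab, limit=5):
    """Rank subjects whose label words all occur in the text."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens) or 1
    scored = []
    for uri, label in vocab.items():
        label_tokens = tokenize(label)
        # A subject matches only if every word of its label is present.
        if label_tokens and all(counts[t] for t in label_tokens):
            # Score by the relative frequency of the rarest label word.
            score = min(counts[t] for t in label_tokens) / total
            scored.append((uri, label, score))
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored[:limit]


doc = (
    "Libraries apply machine learning to subject indexing. "
    "Machine learning helps libraries."
)
suggestions = suggest_subjects(doc, VOCAB)
for uri, label, score in suggestions:
    print(f"{score:.3f}  {label}  <{uri}>")
```

A production system would add stemming or lemmatization, alternative labels, and a trained ranking step. This sketch only shows the matching of label words against document words.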