“…Applying machine learning (ML) to one of the standard DL circulation activities, namely text categorization [48], is part of the cognitive toolbox deployed [18]. In this context, ML is being tested extensively in different development areas and scenarios, to name but a few: extracting image content from figures in scientific documents for categorization [33,34], automatically assessing and characterizing resource quality for educational DL [54,5], assessing the quality of scientific conferences [37], web-based collection development [42], automated document metadata extraction by support vector machines (SVM, [24]), automatic extraction of titles from general documents [27], information architecture [17], removing duplicate documents [9], collaborative filtering [59], automatic expansion of domain-specific lexicons by term categorization [3], generating visual thesauri [45], and semantic markup of documents [13]. As part of this direction of research, ML is being tested for its ability to reproduce parts of collections indexed by widespread classification schemes in a supervised learning setting, such as automatic text categorization using the Dewey Decimal Classification (DDC, [52]), or the Library of Congress Classification (LCC) from Library of Congress Subject Headings (LCSH, [20,43]).…”