2012
DOI: 10.1145/2399180.2399184

Computing similarity between items in a digital library of cultural heritage

Abstract: Large amounts of cultural heritage content have now been digitized and are available in digital libraries. However, these are often unstructured and difficult to navigate. Automatic techniques for identifying similar items in these collections could be used to improve navigation since it would allow items that are implicitly connected to be linked together and allow sets of similar items to be clustered. Europeana is a large digital library containing more than 20 million digital objects from a set of cultural…

Cited by 14 publications
(13 citation statements)
References 28 publications
“…Approaches to similarity computing of cultural heritage can be divided into knowledge-based and corpus-based. The former calculates similarities between content by using external resources, while the latter analyzes the frequency distribution of words within the content (Aletras, Stevenson, & Clough, 2012). Since libraries and archives have most of their collections in the form of documents, they are able to adopt a corpus-based approach which compares and analyzes texts.…”
Section: Linking and Clustering Cultural Heritage With Social Tags (mentioning)
confidence: 99%
“…To do this, we constructed a term-document matrix and calculated the similarities by the cosine similarity method. Cosine similarity is a widely used approach in the information retrieval vector model; it treats each document (a set of words) as a vector and measures the similarity between the documents by calculating the cosine of the angle between them (Aletras et al., 2012; Kim et al., 2010; Shang, Zhang, Zhou, & Zhang, 2010). We treated artworks as documents and social tags as terms, and built a tags-artwork matrix containing artworks in columns and tags in rows.…”
Section: Linking Similar Artwork To Construct a Network (mentioning)
confidence: 99%
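
The cosine similarity calculation described in the statement above can be sketched as follows. This is a minimal illustration, not the cited authors' implementation: the artwork names, tags, and the helper `cosine_similarity` are hypothetical, and each artwork is stored as a sparse tag-count vector (one column of a tags-artwork matrix) rather than as a dense matrix.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse tag-count vectors (dicts)."""
    dot = sum(count * b.get(tag, 0) for tag, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an artwork with no tags is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical data: artworks play the role of documents, social tags of terms.
artwork_tags = {
    "artwork_a": ["portrait", "woman", "oil"],
    "artwork_b": ["portrait", "man", "oil"],
    "artwork_c": ["landscape", "river"],
}

# Each Counter is one column of the tags-artwork matrix, stored sparsely.
vectors = {name: Counter(tags) for name, tags in artwork_tags.items()}

sim_ab = cosine_similarity(vectors["artwork_a"], vectors["artwork_b"])
sim_ac = cosine_similarity(vectors["artwork_a"], vectors["artwork_c"])
print(f"a vs b: {sim_ab:.3f}")
print(f"a vs c: {sim_ac:.3f}")
```

Artworks that share tags score above zero, while artworks with no tags in common score exactly zero, so thresholding these scores is one plausible way to decide which pairs to link.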
“…Related or similar items: a component was developed to compute the similarity between pairs of items [3]. This basic functionality was used within a range of features in the PATHS system, such as clustering related items, forming intra-collection links and providing non-personalized recommendations.…”
Section: Evaluating the Components (mentioning)
confidence: 99%
“…To assess annotation quality, we computed the Pearson product-moment correlation of each annotator against the average of the rest of the annotators, as previously (Aletras, Stevenson, & Clough, 2012;Grieser, Baldwin, Bohnert, & Sonenberg, 2011). We then averaged all the correlations.…”
Section: Quality Of Annotation (mentioning)
confidence: 99%
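
The annotation quality check described in the statement above (correlating each annotator with the average of the remaining annotators, then averaging the correlations) might look like the sketch below. The ratings and the function names `pearson` and `leave_one_out_agreement` are hypothetical, and the sketch assumes every annotator rated every item and that no rating vector is constant, since Pearson correlation is undefined in that case.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)  # undefined if either vector is constant


def leave_one_out_agreement(ratings):
    """Correlate each annotator with the mean of the others.

    `ratings` holds one list per annotator, aligned by item.
    Returns (mean correlation, list of per-annotator correlations).
    """
    scores = []
    for i, own in enumerate(ratings):
        others = [r for j, r in enumerate(ratings) if j != i]
        mean_of_others = [sum(col) / len(col) for col in zip(*others)]
        scores.append(pearson(own, mean_of_others))
    return sum(scores) / len(scores), scores


# Hypothetical similarity judgements: 3 annotators, 5 item pairs.
ratings = [
    [4, 1, 3, 0, 2],
    [3, 1, 4, 0, 2],
    [4, 0, 3, 1, 2],
]
mean_r, per_annotator = leave_one_out_agreement(ratings)
print(f"mean correlation: {mean_r:.3f}")
```

Leaving the annotator out of the average they are compared against avoids inflating the correlation with their own ratings.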