Maria Teresa Artese scite author profile

The web produces ever-growing volumes of text, either alone or associated with images, videos, and photographs, together with metadata that is indispensable for finding and retrieving it. Keywords and keyphrases that characterize the semantic content of documents should be extracted from them, automatically or manually, and/or associated with them. The paper presents a novel method for the automatic, unsupervised extraction of keywords and keyphrases from texts in both English and Italian. The main feature of the approach is the integration of two techniques that have given interesting results: word embedding models, such as Word2Vec or GloVe, which capture the semantics of words and their context, and clustering algorithms, which identify the essence of the terms and select the most significant one(s) to represent the content of a text. The paper presents the datasets used, the method implemented, and the results obtained. The results are discussed and compared with those obtained in previous experiments using TextRank, Rapid Automatic Keyword Extraction (RAKE), and TF-IDF.
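The general idea of combining word embeddings with clustering can be illustrated with a minimal sketch. It is not the authors' implementation: the two-dimensional vectors in `EMBEDDINGS` are made-up stand-ins for Word2Vec or GloVe vectors, the k-means routine is a simplified version with deterministic farthest-point initialization, and choosing the word nearest each cluster centroid as the keyword is an assumption about how "the more significant one(s)" might be selected.

```python
import math

# Hypothetical toy embeddings; a real system would load Word2Vec or GloVe vectors.
EMBEDDINGS = {
    "museum": (0.90, 0.10),
    "gallery": (0.80, 0.20),
    "exhibit": (0.85, 0.15),
    "keyword": (0.10, 0.90),
    "phrase": (0.20, 0.80),
    "term": (0.15, 0.85),
}


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    dims = len(vectors[0])
    return tuple(sum(v[d] for v in vectors) / len(vectors) for d in range(dims))


def farthest_point_init(vectors, k):
    """Deterministic seeding: start at the first vector, then repeatedly
    add the vector farthest from all centers chosen so far."""
    centers = [vectors[0]]
    while len(centers) < k:
        centers.append(
            max(vectors, key=lambda v: min(math.dist(v, c) for c in centers))
        )
    return centers


def kmeans(vectors, k, iterations=10):
    """Simplified k-means: returns the final list of k cluster centers."""
    centers = farthest_point_init(vectors, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda i: math.dist(v, centers[i]))
            clusters[nearest].append(v)
        # Keep the previous center if a cluster ends up empty.
        centers = [centroid(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers


def extract_keywords(words, embeddings, k):
    """Cluster the embedded words and return, for each cluster,
    the word closest to the cluster center (an assumed selection rule)."""
    known = [w for w in words if w in embeddings]
    centers = kmeans([embeddings[w] for w in known], k)
    return [min(known, key=lambda w: math.dist(embeddings[w], c)) for c in centers]


print(extract_keywords(list(EMBEDDINGS), EMBEDDINGS, k=2))
```

With real embeddings, the candidate words would come from the tokenized document, and the number of clusters `k` would control how many keywords are returned.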

Integrating, Indexing and Querying the Tangible and Intangible Cultural Heritage Available Online: The QueryLab Portal

Artese, Gagliardi (2022). Information

Cultural heritage inventories have been created to collect and preserve culture and to allow the participation of stakeholders and communities, promoting and disseminating their knowledge. There are two types of inventories: those that provide data access via web services or open data, and those that are closed to external access and can be visited only through dedicated websites, which creates data silos. Integrating data harvested from different archives makes it possible to compare the cultures and traditions of places on opposite sides of the world, showing that people have more in common than expected. The purpose of the developed portal is to provide query tools that manage the web services offered by cultural heritage databases transparently, allowing the user to submit a single query and obtain results from all the inventories considered at the same time. Moreover, with the introduction of the ICH-Light model, designed specifically for mapping intangible heritage, data from inventories in this domain can also be harvested, indexed, and integrated into the portal, creating an environment dedicated to intangible data where traditions, knowledge, rituals, and festive events can be found and searched together.


Maria Teresa Artese

Cataloging Intangible Cultural Heritage on the Web

Evaluating perceptual visual attributes in social and cultural heritage web sites

Semantic Unsupervised Automatic Keyphrases Extraction by Integrating Word Embedding with Clustering Methods

Integrating, Indexing and Querying the Tangible and Intangible Cultural Heritage Available Online: The QueryLab Portal
