The Web has moved, slowly but steadily, from a collection of documents towards a collection of structured data. Knowledge graphs have emerged as a way of representing the knowledge encoded in such data, as well as a tool to reason over it and extract new, implicit information. Knowledge graphs are currently used, e.g., to explain search results, to explore knowledge spaces, to semantically enrich textual documents, or to feed knowledge-intensive applications such as recommender systems. In this work we describe how to create and exploit a knowledge graph to supply a hybrid recommendation engine with information built on top of a collection of documents describing musical and sound items. Tags and textual descriptions are exploited to extract and link entities to external graphs such as WordNet and DBpedia, which are in turn used to semantically enrich the initial data. By means of the knowledge graph we build, recommendations are computed using a feature-combination hybrid approach. Two explicit graph feature mappings are formulated to obtain meaningful item feature representations able to capture the knowledge embedded in the graph. These content features are further combined with additional collaborative information derived from implicit user feedback. An extensive evaluation on historical data is performed over two different datasets: a dataset of sounds composed of tags, textual descriptions, and users' download information gathered from Freesound.org, and a dataset of songs that mixes song textual descriptions with tags and users' listening habits extracted from Songfacts.com and Last.fm, respectively. Results show significant improvements with respect to state-of-the-art collaborative algorithms on both datasets. In addition, we show how the semantic expansion of the initial descriptions helps achieve much better recommendation quality in terms of aggregated diversity and novelty.
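To make the feature-combination idea concrete, the sketch below (a toy illustration under assumed data and dimensions, not the paper's exact graph feature mappings) concatenates knowledge-graph-derived content features with implicit-feedback columns and recommends via item-item cosine similarity.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical toy data: rows are items.
# content_features could encode entities linked in a knowledge graph
# (e.g., indicators of DBpedia/WordNet entities reachable from an item).
content_features = np.array([
    [1, 0, 1, 0],   # item 0 linked to entities e1, e3
    [1, 1, 0, 0],   # item 1 linked to entities e1, e2
    [0, 0, 1, 1],   # item 2 linked to entities e3, e4
])

# Implicit feedback, items x users (1 = downloaded / listened to).
collaborative_features = np.array([
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 0],
])

# Feature combination: concatenate the content and collaborative views of each
# item, then recommend by item-item similarity in the combined space.
combined = np.hstack([content_features, collaborative_features])
item_similarity = cosine_similarity(combined)

# Score unseen items for a user as the similarity-weighted sum over the items
# the user already interacted with, masking items already seen.
user_profile = collaborative_features[:, 0]          # user 0's interactions
scores = item_similarity @ user_profile.astype(float)
scores[user_profile > 0] = -np.inf
print("Recommended item for user 0:", int(np.argmax(scores)))
```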
Music genre labels are useful to organize songs, albums, and artists into broader groups that share similar musical characteristics. In this work, an approach to learn and combine multimodal data representations for music genre classification is proposed. Intermediate representations of deep neural networks are learned from audio tracks, text reviews, and cover art images, and further combined for classification. Experiments on single-label and multi-label genre classification are then carried out, evaluating the effect of the different learned representations and their combinations. Results on both experiments show how the aggregation of learned representations from different modalities improves the accuracy of the classification, suggesting that different modalities embed complementary information. In addition, learning a multimodal feature space increases the performance of pure audio representations, which may be especially relevant when the other modalities are available for training but not at prediction time. Moreover, a proposed approach for dimensionality reduction of target labels yields major improvements in multi-label classification, not only in terms of accuracy but also in terms of the diversity of the predicted genres, which implies a more fine-grained categorization. Finally, a qualitative analysis of the results sheds some light on the behavior of the different modalities in the classification task.
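As a rough illustration of how such learned representations might be aggregated, the sketch below fuses pre-computed audio, text, and cover-art embeddings with a simple late-fusion classifier; the architecture, dimensions, and the MultimodalGenreClassifier name are assumptions made for illustration, not the network used in the paper.

```python
import torch
import torch.nn as nn

class MultimodalGenreClassifier(nn.Module):
    """Late fusion: concatenate per-modality embeddings, then classify."""
    def __init__(self, audio_dim=256, text_dim=300, image_dim=512, n_genres=50):
        super().__init__()
        self.fusion = nn.Sequential(
            nn.Linear(audio_dim + text_dim + image_dim, 512),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(512, n_genres),
        )

    def forward(self, audio_emb, text_emb, image_emb):
        fused = torch.cat([audio_emb, text_emb, image_emb], dim=-1)
        return self.fusion(fused)            # one logit per genre label

model = MultimodalGenreClassifier()
audio = torch.randn(8, 256)   # pre-computed audio-track embeddings (batch of 8)
text = torch.randn(8, 300)    # text-review embeddings
image = torch.randn(8, 512)   # cover-art embeddings
logits = model(audio, text, image)
# Multi-label training uses one sigmoid per genre via binary cross-entropy.
loss = nn.BCEWithLogitsLoss()(logits, torch.randint(0, 2, (8, 50)).float())
```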
An increasing amount of digital music is being published daily. Music streaming services often ingest all available music, but this poses a challenge: how to recommend new artists for which prior knowledge is scarce? In this work we aim to address this so-called cold-start problem by combining text and audio information with user feedback data using deep network architectures. Our method is divided into three steps. First, artist embeddings are learned from biographies by combining semantics, text features, and aggregated usage data. Second, track embeddings are learned from the audio signal and available feedback data. Finally, artist and track embeddings are combined in a multimodal network. Results suggest that both splitting the recommendation problem between feature levels (i.e., artist metadata and audio track), and merging feature embeddings in a multimodal approach improve the accuracy of the recommendations.
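A minimal sketch of the idea, with all sizes and the MultimodalItemTower name assumed for illustration: content-derived artist and track embeddings are merged and projected into the same latent space as user factors learned from feedback, so a brand-new track can be scored without any interaction data.

```python
import torch
import torch.nn as nn

class MultimodalItemTower(nn.Module):
    """Project concatenated artist and track embeddings into the user latent space."""
    def __init__(self, artist_dim=128, track_dim=128, latent_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(artist_dim + track_dim, 128),
            nn.ReLU(),
            nn.Linear(128, latent_dim),
        )

    def forward(self, artist_emb, track_emb):
        return self.net(torch.cat([artist_emb, track_emb], dim=-1))

item_tower = MultimodalItemTower()
artist_emb = torch.randn(1, 128)      # biography-based artist embedding
track_emb = torch.randn(1, 128)       # audio-based track embedding
item_vec = item_tower(artist_emb, track_emb)

# Latent user factors learned from implicit feedback (assumed precomputed).
user_factors = torch.randn(1000, 64)
scores = user_factors @ item_vec.squeeze(0)   # affinity of each user to the new track
top_users = torch.topk(scores, k=5).indices
```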
Music content creation, publication, and dissemination has changed dramatically in the last few decades. The rate at which information about music is being created and shared on the web is growing exponentially, which opens the challenge of making sense of all this data. In this paper, we present and evaluate a Natural Language Processing pipeline aimed at learning a Music Knowledge Base entirely from scratch. Our approach starts by collecting thousands of "song tidbits" from the songfacts.com website. Then, we combine a state-of-the-art Entity Linking tool and a linguistically motivated rule-based algorithm to extract semantic relations between pairs of entities. Relations with similar semantics are then grouped into semantic clusters by exploiting syntactic dependencies in relation patterns. Finally, a novel confidence measure over the set of extracted relations is introduced as a refinement step. Evaluation is carried out intrinsically, by assessing each component of the pipeline, as well as in an extrinsic task, namely Music Recommendation. An important contribution of our method is its ability to discover novel facts in text with high precision, facts that are missing from current generic and music-specific knowledge repositories. We release the datasets generated with our pipeline, together with the evaluation data, for the use and scrutiny of the community.
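The toy sketch below only approximates the kind of processing described: spaCy's generic dependency parser stands in for the paper's Entity Linking tool and rule set, (subject, verb, object) triples are extracted from parse trees, and each relation pattern is scored by corpus support as a crude stand-in for the confidence measure.

```python
from collections import Counter
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes the small English model is installed

def extract_triples(text):
    """Very rough rule: pair nominal subjects and direct objects of each verb."""
    doc = nlp(text)
    triples = []
    for token in doc:
        if token.pos_ == "VERB":
            subjects = [c for c in token.children if c.dep_ in ("nsubj", "nsubjpass")]
            objects = [c for c in token.children if c.dep_ in ("dobj", "attr")]
            for s in subjects:
                for o in objects:
                    subj = " ".join(t.text for t in s.subtree)
                    obj = " ".join(t.text for t in o.subtree)
                    triples.append((subj, token.lemma_, obj))
    return triples

corpus = [
    "John Lennon wrote Imagine in 1971.",
    "Freddie Mercury wrote Bohemian Rhapsody.",
]
all_triples = [t for text in corpus for t in extract_triples(text)]

# Pattern support as a naive confidence proxy: relation patterns observed more
# often across the corpus receive higher confidence.
pattern_support = Counter(rel for _, rel, _ in all_triples)
for subj, rel, obj in all_triples:
    confidence = pattern_support[rel] / len(all_triples)
    print(f"({subj}, {rel}, {obj})  confidence={confidence:.2f}")
```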
This paper describes the SemEval 2018 Shared Task on Hypernym Discovery. We put forward this task as a complementary benchmark for modeling hypernymy, a problem which has traditionally been cast as a binary classification task taking a pair of candidate words as input. Instead, our reformulated task is defined as follows: given an input term, retrieve (or discover) its suitable hypernyms from a target corpus. We propose five different subtasks covering three languages (English, Spanish, and Italian) and two specific domains of knowledge in English (Medical and Music). Participants were allowed to compete in any or all of the subtasks. Overall, 11 teams participated, submitting a total of 39 different systems across all subtasks. Data, results, and further information about the task can be found at https://competitions.codalab.org/competitions/17119.
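As an illustration of the task setup rather than any submitted system, the toy baseline below mines "X such as Y" Hearst patterns from a tiny assumed corpus and, given an input term, returns the hypernyms it co-occurred with, ranked by frequency.

```python
import re
from collections import defaultdict

corpus = [
    "Instruments such as the guitar and the violin are common in folk music.",
    "String instruments such as the violin dominate the orchestra.",
    "Genres such as jazz and blues emerged in the United States.",
]

# "<hypernym> such as <hyponym>[, <hyponym>] [and <hyponym>]" Hearst pattern.
pattern = re.compile(
    r"(\w+(?: \w+)?) such as ((?:the )?\w+(?:(?:, | and |, and )(?:the )?\w+)*)",
    re.IGNORECASE,
)

hypernyms = defaultdict(lambda: defaultdict(int))
for sentence in corpus:
    for hyper, hypo_list in pattern.findall(sentence):
        for hypo in re.split(r",| and ", hypo_list):
            hypo = hypo.strip().removeprefix("the ").strip()
            if hypo:
                hypernyms[hypo.lower()][hyper.lower()] += 1

def discover_hypernyms(term, top_k=3):
    """Return the most frequent hypernyms mined for the given input term."""
    ranked = sorted(hypernyms[term.lower()].items(), key=lambda kv: -kv[1])
    return [h for h, _ in ranked[:top_k]]

print(discover_hypernyms("violin"))   # ['instruments', 'string instruments']
```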