Temporal pitch class profiles, commonly referred to as chromagrams, are the de facto standard signal representation for content-based methods of musical harmonic analysis, despite exhibiting a set of practical difficulties. Here, we present a novel, data-driven approach to learning a robust function that projects audio data into Tonnetz-space, a geometric representation of equal-tempered pitch intervals grounded in music theory. We apply this representation to automatic chord recognition and show that our approach outperforms previous chroma representations in classification accuracy, while providing a mid-level feature space that circumvents challenges inherent to chroma.
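To make the target space concrete, the sketch below maps a 12-bin chroma vector to a 6-dimensional point using the fixed, hand-designed tonal centroid transform that is commonly used as a Tonnetz feature. This is an illustration only: it is not the learned projection the abstract proposes, and the function name `tonal_centroid` and the ring radii (1, 1, 0.5) are assumptions taken from the usual formulation, not from this abstract.

```python
import math


def tonal_centroid(chroma):
    """Map a 12-bin chroma vector to a 6-D tonal centroid (illustrative only)."""
    if len(chroma) != 12:
        raise ValueError("expected 12 pitch-class bins")
    total = sum(abs(c) for c in chroma)  # L1 norm of the chroma vector
    if total == 0:
        return [0.0] * 6  # silence: place the frame at the origin
    # (angle step per semitone, radius) for the circles of fifths,
    # minor thirds, and major thirds. The radii are assumed values.
    rings = [
        (7 * math.pi / 6, 1.0),
        (3 * math.pi / 2, 1.0),
        (2 * math.pi / 3, 0.5),
    ]
    centroid = []
    for step, radius in rings:
        x = sum(radius * math.sin(l * step) * c for l, c in enumerate(chroma))
        y = sum(radius * math.cos(l * step) * c for l, c in enumerate(chroma))
        centroid.extend([x / total, y / total])
    return centroid


# A frame containing only pitch class 0 (C) and one containing only 7 (G).
c_only = tonal_centroid([1.0] + [0.0] * 11)
g_only = tonal_centroid([0.0] * 7 + [1.0] + [0.0] * 4)
```

In this fixed mapping, pitch classes a perfect fifth apart land on neighboring positions of the first circle, which is the kind of music-theoretic geometry the abstract refers to.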
As the number of documents continues to grow, automatically extracting keyphrases from a document has become an important problem. However, most previous work extracts keyphrases only from words that occur in the document, and so overlooks latent keyphrases, which do not appear in the document. Although latent keyphrases do not appear in the text, they can play an important role in text summarization and information retrieval because they capture meaningful concepts or content of the document. They also account for more than one fourth of all keyphrases in real-world datasets, and they are useful for short texts such as social networking service (SNS) posts, which rarely contain explicit keyphrases. In this paper, we propose a new approach that selects candidate keyphrases from the keyphrases of neighbor documents similar to the given document and evaluates the importance of each candidate using the individual words it contains. Experimental results show that latent keyphrases can be extracted at a reasonable level.
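The two steps described above (borrow candidates from similar documents, then score each candidate by its individual words) can be sketched roughly as follows. Everything here is a simplifying assumption rather than the authors' method: bag-of-words cosine similarity stands in for the neighbor search, the score is the mean relative frequency of a candidate's words in the target document, and the names `rank_candidates`, `corpus`, and `k` are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased whitespace tokens with counts (a deliberately crude tokenizer)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(target_text, corpus, k=2):
    """Rank keyphrases of the k most similar documents for the target text."""
    target = bag_of_words(target_text)
    # Step 1: the k nearest neighbor documents supply the candidate keyphrases.
    neighbors = sorted(
        corpus,
        key=lambda doc: cosine(target, bag_of_words(doc["text"])),
        reverse=True,
    )[:k]
    candidates = {kp for doc in neighbors for kp in doc["keyphrases"]}
    total = sum(target.values()) or 1  # avoid division by zero on empty input

    # Step 2 (assumed scoring): mean relative frequency of the candidate's
    # individual words in the target document.
    def score(keyphrase):
        words = keyphrase.lower().split()
        return sum(target[w] for w in words) / (len(words) * total)

    return sorted(candidates, key=lambda kp: (-score(kp), kp))


corpus = [
    {"text": "chord recognition from audio chroma features",
     "keyphrases": ["chord recognition", "music information retrieval"]},
    {"text": "keyphrase extraction for text summarization",
     "keyphrases": ["keyphrase extraction"]},
    {"text": "protein folding simulation",
     "keyphrases": ["molecular dynamics"]},
]
target = "we recognize each chord in audio using chroma and retrieval of music"
ranked = rank_candidates(target, corpus, k=1)
```

Note that the top-ranked phrase in this toy example never occurs verbatim in the target text, which is what makes it a latent keyphrase in the sense used above.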
Hidden Markov models (HMMs) have been widely studied and applied for decades. The standard supervised learning method for HMMs is maximum likelihood estimation (MLE), which maximizes the joint probability of the training data. However, the most natural way of training would be to find the parameters that directly minimize the error rate on a given training set. In this article, we propose a novel learning method that minimizes the number of incorrectly decoded labels framewise. To do this, we construct a smooth function that is arbitrarily close to the exact frame error rate and minimize it directly using a gradient-based optimization algorithm. The proposed approach is intuitive and simple. We applied our method to the task of chord recognition in music, and the results show that it performs better than both MLE and minimum classification error (MCE) training.
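The idea of a smooth function that approaches the exact frame error rate can be illustrated with a sharpened softmax. This is a generic sketch, not the construction used in the article: it treats each frame as an independent vector of label scores (a real HMM would derive these from sequence-level decoding), and `beta`, `smooth_frame_error`, and `exact_frame_error` are hypothetical names.

```python
import math


def softmax(scores, beta):
    """Softmax with sharpness beta; larger beta approaches a hard argmax."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(beta * (s - peak)) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def exact_frame_error(frame_scores, labels):
    """Fraction of frames whose highest-scoring label is not the true label."""
    wrong = 0
    for scores, true_label in zip(frame_scores, labels):
        predicted = max(range(len(scores)), key=scores.__getitem__)
        wrong += predicted != true_label
    return wrong / len(labels)


def smooth_frame_error(frame_scores, labels, beta):
    """Differentiable surrogate: one minus the soft probability of the true label."""
    total = 0.0
    for scores, true_label in zip(frame_scores, labels):
        total += 1.0 - softmax(scores, beta)[true_label]
    return total / len(labels)


# Three frames, three candidate labels per frame (toy scores).
frame_scores = [[2.0, 1.0, 0.0], [0.5, 1.5, 0.2], [0.1, 0.3, 0.9]]
labels = [0, 0, 2]
hard = exact_frame_error(frame_scores, labels)
soft_loose = smooth_frame_error(frame_scores, labels, beta=1.0)
soft_sharp = smooth_frame_error(frame_scores, labels, beta=50.0)
```

As `beta` grows, the surrogate converges to the hard error count whenever no frame has tied scores, while remaining differentiable in the scores for any finite `beta`, so it can be handed to a gradient-based optimizer.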