Mia Vujović scite author profile

Speaker recognition is an important classification task that can be solved using several approaches. Building a speaker recognition model on a closed set of speakers under neutral speaking conditions is a well-researched task with solutions that provide excellent performance, but the classification accuracy of such models decreases significantly when they are applied to emotional speech or in the presence of interference. Furthermore, deep models may require a large number of parameters, so constrained solutions are desirable for implementation on edge devices in Internet of Things systems for real-time detection. The aim of this paper is to propose a simple, constrained convolutional neural network for speaker recognition and to examine its robustness in emotional speech conditions. We examine three quantization methods for developing a constrained network: 8-bit floating-point (FP8) format, ternary scalar quantization, and binary scalar quantization. The results are demonstrated on the recently recorded SEAC dataset.
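
To make the binary and ternary scalar quantization mentioned in this abstract more concrete, here is a minimal Python sketch. It is an illustration only: the abstract does not give the paper's quantizer design, so the scaling rule (mean absolute value) and the `threshold_factor` of 0.7 are assumptions borrowed from common practice in the binary and ternary weight network literature, and the function names are hypothetical.

```python
def binary_quantize(weights):
    """Map every weight to +alpha or -alpha.

    Assumption: alpha is the mean absolute weight, a common choice
    that may differ from the quantizer used in the paper.
    """
    if not weights:
        raise ValueError("weights must be non-empty")
    alpha = sum(abs(w) for w in weights) / len(weights)
    return [alpha if w >= 0 else -alpha for w in weights]


def ternary_quantize(weights, threshold_factor=0.7):
    """Map every weight to -alpha, 0, or +alpha.

    Assumption: weights whose magnitude is at most
    threshold_factor * mean(|w|) are zeroed, and alpha is the mean
    magnitude of the remaining weights.
    """
    if not weights:
        raise ValueError("weights must be non-empty")
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    delta = threshold_factor * mean_abs
    kept = [abs(w) for w in weights if abs(w) > delta]
    alpha = sum(kept) / len(kept) if kept else 0.0
    return [
        0.0 if abs(w) <= delta else (alpha if w > 0 else -alpha)
        for w in weights
    ]


layer = [0.5, -0.5, 0.1, -0.1]
print(binary_quantize(layer))   # two distinct levels
print(ternary_quantize(layer))  # three possible levels, small weights zeroed
```

Both functions keep only a sign (and, for the ternary case, a zero flag) per weight plus one shared scale, which is what makes such networks attractive for edge devices.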

Poređenje Sistema Za Sintezu Ekspresivnog Govora Sa Mogućnošću Kontrole Jačine Emocije (Comparison of Expressive Speech Synthesis Systems with Control of Emotion Intensity)

Vujović

2020

Zbornik radova FTN

In expressive speech synthesis, it is important to generate emotionally colored speech that reflects the complexity of emotional states. Many TTS systems model emotions in synthesized speech as discrete sets, but generated speech can resemble human speech only when the variations that exist within emotional states are also taken into account. This paper presents a theoretical analysis and comparison of two innovative expressive speech synthesis systems that model the complexity of emotions as continuous vectors that can be manipulated. The results show that the approach based on t-SNE embedding vectors is applicable only to specific databases, while the second approach, based on interpolating points in the embedding space of a multi-speaker, multi-style model, is more general but requires further analysis.
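
The second approach in this abstract relies on interpolating points in an embedding space. A minimal sketch of that idea, assuming plain linear interpolation between a neutral and a fully expressive style embedding, might look as follows. The function name, the `level` parameter, and the example vectors are hypothetical; the abstract does not specify the interpolation scheme the compared systems actually use.

```python
def interpolate_embeddings(neutral, expressive, level):
    """Linearly blend two style embeddings of equal length.

    level = 0.0 returns the neutral embedding, level = 1.0 the fully
    expressive one; values in between give intermediate intensity.
    """
    if len(neutral) != len(expressive):
        raise ValueError("embeddings must have the same dimension")
    return [n + level * (e - n) for n, e in zip(neutral, expressive)]


# Hypothetical 2-dimensional embeddings, for illustration only.
neutral_style = [0.0, 1.0]
expressive_style = [1.0, 3.0]
print(interpolate_embeddings(neutral_style, expressive_style, 0.5))
```

In a real system the blended vector would be fed to the synthesizer in place of a discrete emotion label.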

Labeling of Baropodometric Analysis Data Using Computer Vision Techniques in Classification of Foot Deformities

et al. 2023

Background and Objectives: Foot deformities are the basis of numerous disorders of the locomotor system. An optimized method of classifying foot deformities would enable objective identification of the type of deformity, since current assessment methods do not show an optimal level of objectivity and reliability. The acquired results would enable an individual approach to the treatment of patients with foot deformities. Thus, the goal of this research study was the development of a new, objective model for recognizing and classifying foot deformities with the application of machine learning, by labeling baropodometric analysis data using computer vision methods. Materials and Methods: In this work, data from 91 students of the Faculty of Medicine and the Faculty of Sports and Physical Education, University of Novi Sad were used. Measurements were taken using a baropodometric platform, and the labeling process was carried out in the Python programming language, using functions from the OpenCV library. Segmentation techniques, geometric transformations, contour detection and morphological image processing were performed on the images in order to calculate the arch index, a parameter that gives information about the type of foot deformity. Discussion: The foot to which the entire labeling method was applied had an arch index value of 0.27, which is in accordance with the literature and indicates the accuracy of the method. On the other hand, the method presented in our study needs further improvement and optimization, since the results of the segmentation techniques can vary when the images are not consistent. Conclusions: The labeling method presented in this work provides the basis for further optimization and development of a foot deformity classification system.
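
As an illustration of the arch index mentioned in this abstract, the sketch below uses the standard definition from the literature: the toeless footprint is split into three equal-length regions along the foot axis, and the arch index is the area of the middle region divided by the total area. The sketch makes several assumptions that are not taken from the paper: the footprint is already segmented into a 0/1 mask, the toes are already removed, the foot axis runs along the rows of the mask, and plain Python lists stand in for the OpenCV images used by the authors. The function name is hypothetical.

```python
def arch_index(mask):
    """Compute the arch index of a binary footprint mask.

    mask: list of rows containing 0 (background) or 1 (foot contact).
    Assumes toes are removed and the heel-to-forefoot axis is vertical.
    """
    foot_rows = [i for i, row in enumerate(mask) if any(row)]
    if not foot_rows:
        raise ValueError("mask contains no foot pixels")
    top, bottom = foot_rows[0], foot_rows[-1]
    length = bottom - top + 1

    # Accumulate contact area in each third of the footprint length.
    areas = [0, 0, 0]
    for i in range(top, bottom + 1):
        third = min(3 * (i - top) // length, 2)
        areas[third] += sum(mask[i])

    # Middle-third area over total area.
    return areas[1] / sum(areas)


# Synthetic footprint: wide heel and forefoot, narrower midfoot.
footprint = (
    [[1, 1, 1, 1]] * 3
    + [[0, 1, 1, 0]] * 3
    + [[1, 1, 1, 1]] * 3
)
print(arch_index(footprint))
```

A wider midfoot contact area raises the index, which is why the value carries information about the type of foot deformity.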

Explicit Control of the Level of Expressiveness in DNN-Based Speech Synthesis by Embedding Interpolation

Nosek

Suzic

Vujović

et al. 2021

Mia Vujović

Initial Analysis of the Impact of Emotional Speech on the Performance of Speaker Recognition on New Serbian Emotional Database

Speaker Recognition Using Constrained Convolutional Neural Networks in Emotional Speech