The purposes of this study were to develop the Yonsei Face Database (YFace DB), consisting of both static and dynamic face stimuli for six basic emotions (happiness, sadness, anger, surprise, fear, and disgust), and to test its validity. The database includes selected pictures (static stimuli) and film clips (dynamic stimuli) of 74 models (50% female) aged between 19 and 40. During the validation procedure, 221 undergraduate students rated the 1,480 selected pictures and film clips for accuracy, intensity, and naturalness. The overall accuracy was 76% for the pictures and higher, at 83%, for the film clips; across all conditions (static with mouth open, static with mouth closed, and dynamic), accuracy was highest for happiness and lowest for fear. Accuracy was higher for film clips than for pictures for all emotions except happiness and disgust, whereas naturalness was higher for pictures than for film clips except for sadness and anger. Intensity varied the most across conditions and emotions. Significant gender effects on perception accuracy were found for both model and rater gender. Male raters perceived surprise more accurately in static stimuli with mouth open and in dynamic stimuli, while female raters perceived fear more accurately in all conditions. Moreover, sadness and anger expressed in static stimuli with mouth open, and fear expressed in dynamic stimuli, were perceived more accurately when the models were male; disgust expressed in static stimuli with mouth open and in dynamic stimuli, and fear expressed in static stimuli with mouth closed, were perceived more accurately when the models were female. The YFace DB is by far the largest Asian face database and the first to include both static and dynamic facial expression stimuli, and the validation results provide researchers with detailed information about the validity of each stimulus.
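The accuracy figures reported above come from aggregating rater judgments over stimuli within each emotion and presentation condition. A minimal sketch of how such per-emotion, per-condition accuracy could be tabulated is shown below; the column names and toy data are illustrative assumptions, not the study's actual analysis code.

```python
# Hypothetical sketch: tabulating recognition accuracy per emotion and condition
# from individual rater responses (column names and rows are assumed for illustration).
import pandas as pd

# Each row represents one rating of one stimulus by one rater.
ratings = pd.DataFrame({
    "condition": ["static_open", "static_closed", "dynamic", "dynamic"],
    "intended_emotion": ["happiness", "fear", "fear", "sadness"],
    "chosen_emotion": ["happiness", "surprise", "fear", "sadness"],
})

# A response is correct when the chosen emotion matches the intended one.
ratings["correct"] = ratings["intended_emotion"] == ratings["chosen_emotion"]

# Percentage of correct responses per intended emotion and condition.
accuracy = (
    ratings.groupby(["intended_emotion", "condition"])["correct"]
           .mean()
           .mul(100)
           .round(1)
)
print(accuracy)
```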
We introduce the Multi-level feature Fusion-based Periodicity Analysis Model (MF-PAM), a novel deep learning-based pitch estimation model that accurately estimates pitch trajectories in noisy and reverberant acoustic environments. Our model leverages the periodic characteristics of audio signals and involves two key steps: extracting pitch periodicity using periodic non-periodic convolution (PNP-Conv) blocks, and estimating pitch by aggregating multi-level features using a modified bidirectional feature pyramid network (BiFPN). We evaluate our model on speech and music datasets and achieve superior pitch estimation performance compared to state-of-the-art baselines while using fewer model parameters. Our model achieves 99.20% accuracy in pitch estimation on a clean musical dataset. Overall, the proposed model provides a promising solution for accurate pitch estimation in challenging acoustic environments and has potential applications in audio signal processing.
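The aggregation step combines features from several encoder levels with a BiFPN-style weighted fusion. A minimal sketch of generic fast-normalized weighted fusion in that spirit is given below; the module name, tensor shapes, and layer choices are assumptions for illustration and do not reproduce the paper's modified BiFPN or its PNP-Conv blocks.

```python
# Hypothetical sketch of BiFPN-style "fast normalized" weighted feature fusion.
# Shapes and layers are illustrative assumptions, not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightedFusion(nn.Module):
    """Fuse same-shaped feature maps with learnable, normalized positive weights."""
    def __init__(self, num_inputs: int, channels: int):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(num_inputs))
        self.conv = nn.Conv1d(channels, channels, kernel_size=3, padding=1)

    def forward(self, features):
        # features: list of tensors, each of shape (batch, channels, frames)
        w = F.relu(self.weights)
        w = w / (w.sum() + 1e-4)                 # normalize weights to sum to ~1
        fused = sum(wi * f for wi, f in zip(w, features))
        return F.relu(self.conv(fused))

# Usage: fuse three same-shaped feature levels from a (hypothetical) pitch encoder.
if __name__ == "__main__":
    levels = [torch.randn(2, 64, 100) for _ in range(3)]
    fusion = WeightedFusion(num_inputs=3, channels=64)
    print(fusion(levels).shape)  # torch.Size([2, 64, 100])
```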