With the recent developments in speech synthesis via machine learning, this study explores incorporating linguistics knowledge to visualise and evaluate synthetic speech model training. If changes to the first and second formant (in turn, the vowel space) can be seen and heard in synthetic speech, this knowledge can inform speech synthesis technology developers. A speech synthesis model trained on a large General American English database was fine-tuned into a New Zealand English voice to identify if the changes in the vowel space of synthetic speech could be seen and heard. The vowel spaces at different intervals during the fine-tuning were analysed to determine if the model learned the New Zealand English vowel space. Our findings based on vowel space analysis show that we can visualise how a speech synthesis model learns the vowel space of the database it is trained on. Perception tests confirmed that humans could perceive when a speech synthesis model has learned characteristics of the speech database it is training on. Using the vowel space as an intermediary evaluation helps understand what sounds are to be added to the training database and build speech synthesis models based on linguistics knowledge.
The need for contactless vascular biometric systems has significantly increased. In recent years, deep learning has proven to be efficient for vein segmentation and matching. Palm and finger vein biometrics are well researched; however, research on wrist vein biometrics is limited. Wrist vein biometrics is promising due to it not having finger or palm patterns on the skin surface making the image acquisition process easier. This paper presents a deep learning-based novel low-cost end-to-end contactless wrist vein biometric recognition system. FYO wrist vein dataset was used to train a novel U-Net CNN structure to extract and segment wrist vein patterns effectively. The extracted images were evaluated to have a Dice Coefficient of 0.723. A CNN and Siamese Neural Network were implemented to match wrist vein images obtaining the highest F1-score of 84.7%. The average matching time is less than 3 s on a Raspberry Pi. All the subsystems were integrated with the help of a designed GUI to form a functional end-to-end deep learning-based wrist biometric recognition system.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.