This paper applies the graph-to-sequence method to neural text-to-speech (GraphTTS), mapping the graph embedding of the input sequence to spectrograms. The graphical inputs consist of node and edge representations constructed from the input text. A GNN encoder module encodes these graphical inputs and thereby incorporates syntax information. In addition, the GraphTTS encoder can be applied as a graph auxiliary encoder (GAE) to analyse prosody information from the semantic structure of the text. This removes the need to select reference audio manually and makes prosody modelling an end-to-end procedure. Experimental analysis shows that GraphTTS outperforms state-of-the-art sequence-to-sequence models by 0.24 in Mean Opinion Score (MOS). GAE can automatically adjust the pauses, ventilation and tones of the synthesised audio. These results may be useful to researchers working on improving prosody in speech synthesis.
Insulator failure is one of the major causes of railway power transmission accidents. In automatic railway insulator inspection systems, detecting and classifying insulator faults is challenging because of complex backgrounds, small insulators and subtle faults. In this article, we propose a railway insulator fault detection network based on convolutional neural networks, which can detect faulty insulators in high-resolution images with complex backgrounds. The network performs position detection and fault classification by cascading a detection network and a fault classification network. Cascading the two networks reduces the amount of computation and improves the accuracy of fault classification. The insulator detection network uses low-resolution images for position detection, which prevents it from paying too much attention to image details and thereby reduces computation. The fault classification network uses high-resolution insulator images, whose rich detail helps to improve classification accuracy. The trained insulator detection network and fault classification network are cascaded to form the insulator fault detection network. The precision, recall and mAP of the insulator fault detection network are 94.10%, 92.88% and 93.46%, respectively. Experiments show that this network cascading method can significantly improve the accuracy and robustness of insulator fault detection.
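The cascade described in this abstract can be illustrated with a minimal sketch. It assumes images are plain nested lists of pixel values and that `detect` and `classify` are hypothetical callables standing in for the two trained networks; none of these names come from the article. The sketch shows only the data flow: boxes are found on the low-resolution image, rescaled to high-resolution coordinates, and the high-resolution crop is passed to the classifier.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)
Image = Sequence[Sequence[int]]  # rows of pixel values (illustrative only)


def scale_box(box: Sequence[float], low_size: Tuple[int, int],
              high_size: Tuple[int, int]) -> Box:
    """Map a box from low-resolution to high-resolution pixel coordinates.

    Sizes are given as (width, height).
    """
    sx = high_size[0] / low_size[0]
    sy = high_size[1] / low_size[1]
    x0, y0, x1, y1 = box
    return (round(x0 * sx), round(y0 * sy), round(x1 * sx), round(y1 * sy))


def crop(image: Image, box: Box) -> List[List[int]]:
    """Cut the region covered by `box` out of a row-major image."""
    x0, y0, x1, y1 = box
    return [list(row[x0:x1]) for row in image[y0:y1]]


def cascade(high_img: Image, low_img: Image,
            detect: Callable[[Image], List[Box]],
            classify: Callable[[List[List[int]]], str]) -> List[Tuple[Box, str]]:
    """Detect on the low-resolution image, classify high-resolution crops."""
    low_size = (len(low_img[0]), len(low_img))
    high_size = (len(high_img[0]), len(high_img))
    results = []
    for box in detect(low_img):  # hypothetical detection network
        high_box = scale_box(box, low_size, high_size)
        label = classify(crop(high_img, high_box))  # hypothetical classifier
        results.append((high_box, label))
    return results
```

Because the detector never sees the full-resolution image, its cost depends only on the low-resolution size, while the classifier still receives the fine detail inside each detected region.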
This paper introduces a graphical representation of prosody boundaries (GraphPB) for Chinese speech synthesis, intended to parse the semantic and syntactic relationships of input sequences in a graphical domain and thereby improve prosody performance. The nodes of the graph embedding are formed by prosodic words, and the edges are formed by the other prosodic boundaries, namely the prosodic phrase boundary (PPH) and the intonation phrase boundary (IPH). Different Graph Neural Networks (GNNs), such as the Gated Graph Neural Network (GGNN) and Graph Long Short-term Memory (G-LSTM), are used as graph encoders to exploit the graphical prosody boundary information. A graph-to-sequence model is proposed, consisting of a graph encoder and an attentional decoder. Two techniques are proposed to embed sequential information into the graph-to-sequence text-to-speech model. The experimental results show that the proposed approach can encode the phonetic and prosodic rhythm of an utterance. The mean opinion score (MOS) of these GNN models is comparable to that of state-of-the-art sequence-to-sequence models, with better performance in terms of prosody. This provides an alternative approach to prosody modelling in end-to-end speech synthesis.
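As a rough illustration of the graph input described in this abstract, the sketch below builds nodes from prosodic words and typed edges from PPH and IPH boundaries. The abstract does not say which nodes each boundary connects, so the encoding here is an assumption: a boundary of type PPH or IPH between two adjacent prosodic words becomes an edge between them, labelled with the boundary type. The function name and the input format are hypothetical.

```python
from typing import List, Tuple

Edge = Tuple[int, int, str]  # (source node index, target node index, boundary type)

EDGE_TYPES = ("PPH", "IPH")  # boundary types that form edges, per the abstract


def build_prosody_graph(words: List[str],
                        boundaries: List[str]) -> Tuple[List[str], List[Edge]]:
    """Build a node list and a typed edge list from a segmented utterance.

    words:      prosodic words, in order; each becomes one node.
    boundaries: boundaries[i] is the boundary label between words[i] and
                words[i + 1], so it has one entry fewer than `words`.
                Labels outside EDGE_TYPES (e.g. a plain word boundary)
                produce no edge under this assumed encoding.
    """
    if len(boundaries) != max(len(words) - 1, 0):
        raise ValueError("expected one boundary label between each pair of words")
    nodes = list(words)
    edges = [(i, i + 1, label)
             for i, label in enumerate(boundaries)
             if label in EDGE_TYPES]
    return nodes, edges


# Usage with placeholder prosodic words w1..w4.
nodes, edges = build_prosody_graph(["w1", "w2", "w3", "w4"], ["PW", "PPH", "IPH"])
print(nodes)  # ['w1', 'w2', 'w3', 'w4']
print(edges)  # [(1, 2, 'PPH'), (2, 3, 'IPH')]
```

A typed edge list of this kind is the usual input shape for encoders such as GGNN, which keep separate propagation weights per edge type; the actual graph construction in the paper may differ.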