Text-to-speech (TTS) synthesizers are widely used as vital assistive tools in many fields. Traditional sequence-to-sequence (seq2seq) TTS models such as Tacotron2 rely on a single soft attention mechanism to align the encoder and decoder, and this is their biggest shortcoming: when handling long sentences, they may generate words incorrectly or repeatedly. They may also produce run-on sentences or wrong breaks regardless of punctuation marks, which makes the synthesized waveform lack emotion and sound unnatural. In this paper, we propose an end-to-end neural generative TTS model based on a deep-inherited attention (DIA) mechanism with an adjustable local-sensitive factor (LSF). The inheritance mechanism allows multiple iterations of the DIA to share the same training parameters, which tightens the token–frame correlation and accelerates the alignment process. The LSF is adopted to enhance context connections by expanding the DIA's concentration region. In addition, a multi-RNN block is used in the decoder for better acoustic feature extraction and generation; hidden-state information derived from the multi-RNN layers is used for attention alignment. The collaboration of the DIA and the multi-RNN layers yields high-quality prediction of phrase breaks in the synthesized speech. We used WaveGlow as the vocoder for real-time, human-like audio synthesis. Subjective listening tests show that DIA-TTS achieved a mean opinion score (MOS) of 4.48 in terms of naturalness. Ablation studies further demonstrate the superiority of the DIA mechanism in enhancing phrase breaks and attention robustness.
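The abstract does not spell out how the DIA or LSF is computed, but the idea of restricting soft attention to an adjustable local "concentration region" can be sketched generically. The snippet below is a minimal, hypothetical illustration (not the paper's method): scaled dot-product attention over encoder tokens, masked to a window of half-width `lsf` around the current alignment position.

```python
import numpy as np

def local_sensitive_attention(query, keys, center, lsf):
    """Soft attention over encoder tokens, restricted to a local window.

    query:  (d,) decoder hidden state
    keys:   (T, d) encoder token representations
    center: index of the current alignment position
    lsf:    half-width of the concentration region (hypothetical
            stand-in for the paper's local-sensitive factor)
    """
    scores = keys @ query / np.sqrt(keys.shape[1])   # scaled dot-product scores
    mask = np.full_like(scores, -np.inf)
    lo, hi = max(0, center - lsf), min(len(scores), center + lsf + 1)
    mask[lo:hi] = 0.0                                # keep only the local region
    scores = scores + mask
    weights = np.exp(scores - scores[lo:hi].max())   # stable softmax
    weights /= weights.sum()
    return weights                                   # (T,) alignment weights

rng = np.random.default_rng(0)
keys = rng.normal(size=(10, 4))
w = local_sensitive_attention(keys[3], keys, center=3, lsf=2)
```

Enlarging `lsf` widens the region over which attention mass can spread, which is one plausible way an "expanded concentration region" could strengthen context connections across neighboring tokens.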
The rapid classification of microparticles has a vast range of applications in biomedical sciences and technology. In this study, a prototype was developed for the rapid detection of particle size using multi-angle dynamic light scattering and a machine learning approach based on a support vector machine. The device consists of three major parts: a laser source, an assembly of twelve photosensors, and a data acquisition system. Laser light with a wavelength of 660 nm was directed towards the prepared sample, and the twelve photosensors were arranged symmetrically around the sample to capture the scattered light, with their positions chosen according to Mie scattering theory to detect maximum light scattering. Three spherical microparticle sizes of 1, 2, and 4 μm were analyzed for classification. Real-time light scattering signals were collected from each sample for 30 min. Power spectrum features were computed from the acquired waveforms, and recursive feature elimination was then used to select the features with the highest correlation. Machine learning classifiers were trained on these features under optimal conditions and the classification accuracies were evaluated. The results showed classification accuracies of 94.41%, 94.20%, and 96.12% for particle sizes of 1, 2, and 4 μm, respectively, with an overall classification accuracy of 95.38%. These results show that the developed system can detect microparticles within the range of 1–4 μm, with a detection limit of 0.025 mg/mL. The study therefore validates the performance of the device, and the technique can be further applied in clinical settings for the detection of microbial particles.
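The processing pipeline the abstract describes (waveform, power spectrum feature, classifier) can be sketched end to end. The code below is a toy illustration with synthetic signals, not the paper's data or model: each "particle size" class is simulated with a different dominant frequency, features are power spectra via an FFT, and a nearest-centroid rule stands in for the paper's support vector machine (scikit-learn's `sklearn.svm.SVC` would slot in at that step).

```python
import numpy as np

rng = np.random.default_rng(42)

def power_spectrum(signal):
    """Power spectrum of a real-valued scattering waveform."""
    spec = np.fft.rfft(signal)
    return (spec.real**2 + spec.imag**2) / len(signal)

# Synthetic stand-in for the scattering signals: each particle-size class
# modulates a different dominant frequency bin (hypothetical data, not
# the paper's measurements).
def make_signal(freq, n=256):
    t = np.arange(n)
    return np.sin(2 * np.pi * freq * t / n) + 0.3 * rng.normal(size=n)

freq_by_size = {1: 10, 2: 20, 4: 40}     # particle size (um) -> dominant bin
X, y = [], []
for size, f in freq_by_size.items():
    for _ in range(30):
        X.append(power_spectrum(make_signal(f)))
        y.append(size)
X, y = np.array(X), np.array(y)

# Nearest-centroid classifier as a simple stand-in for the SVM.
centroids = {c: X[y == c].mean(axis=0) for c in freq_by_size}
def classify(spec):
    return min(centroids, key=lambda c: np.linalg.norm(spec - centroids[c]))

pred = np.array([classify(x) for x in X])
acc = (pred == y).mean()
```

On this cleanly separable synthetic data the accuracy is near perfect; the paper's reported ~95% reflects the much harder real-world separation of scattering signatures.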
Background: Low-resolution magnetic resonance imaging (MRI) offers high imaging speed, but the resulting image detail cannot meet the needs of clinical diagnosis. More and more researchers are therefore interested in neural network-based reconstruction methods, and effective super-resolution reconstruction of low-resolution images has become highly valuable in clinical applications. Methods: We introduced the Super-Resolution Convolutional Neural Network (SRCNN) into the reconstruction of magnetic resonance images. The SRCNN consists of three layers: a feature extraction layer, a nonlinear mapping layer, and a reconstruction layer. In the feature extraction layer, a multi-scale feature extraction (MFE) method extracts features at different scales through three different levels of views, which is superior to the original feature extraction with views of a fixed size. This MFE can also be combined with residual learning to improve the performance of MRI super-resolution reconstruction. The proposed network is an end-to-end architecture, so no manual intervention or multi-stage computation is required in practical applications. The network structure is extremely simple, omitting the fully connected layers and pooling layers of a traditional convolutional neural network. Results and Conclusions: Comparative experiments show that the MFE SRCNN-based network greatly improves super-resolution reconstruction of MR images. Performance improves significantly in terms of the evaluation indexes peak signal-to-noise ratio and structural similarity index measure, and the recovery of image detail also improves.
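One of the evaluation indexes named above, peak signal-to-noise ratio (PSNR), has a standard definition that is easy to state concretely. The sketch below computes PSNR between a reference image and a reconstruction; the synthetic images are illustrative only, not the paper's MR data.

```python
import numpy as np

def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio (dB) between a reference image and
    a reconstruction: 10 * log10(peak^2 / MSE)."""
    mse = np.mean((np.asarray(reference, dtype=np.float64)
                   - np.asarray(reconstructed, dtype=np.float64)) ** 2)
    if mse == 0:
        return np.inf          # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
# Degraded version: reference plus mild Gaussian noise, clipped to range.
noisy = np.clip(ref + rng.normal(0.0, 5.0, size=ref.shape), 0, 255)
value = psnr(ref, noisy)
```

Higher PSNR means the reconstruction is closer to the reference; noise with standard deviation 5 on an 8-bit image yields roughly 34 dB, and super-resolution methods are compared by how many dB they gain over bicubic or plain SRCNN baselines.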