Previous research on accelerating remote sensing data processing is based on the traditional von Neumann architecture, which separates storage and computation. Under this architecture, data must first be read from the storage device and then transmitted to a Field Programmable Gate Array (FPGA) over the system bus. The power consumed by this data movement is huge, even exceeding the energy required for the data processing itself. To reduce the migration of remote sensing data and alleviate the storage wall and power wall problems of the von Neumann architecture, we design a remote sensing data processing platform based on a computable storage architecture, which uses a Solid-State Disk (SSD) with computing capability to process remote sensing data and thereby accelerate processing. On this platform, remote sensing applications such as compression, target detection, and image classification are deployed in the SSD to improve the rate at which information is acquired from remote sensing data. Experimental results show that, after compression is offloaded to the SSD, computing performance is improved by 2.27 times compared with the host CPU. Compared with the host GPU, target detection speed is improved by 6.25% and power consumption is reduced by 66.7%. Compared with the host, the speed of remote sensing image classification is improved by 78.8% and power consumption is reduced by 70%, while the expected classification effect is achieved. The Remote Sensing data Processing Platform based on Computable Storage (CSRSPP) distributes various computing tasks to the SSD for execution, which not only improves the processing speed of these tasks but also greatly reduces the power consumption of the platform.
Voice cloning provides a specific Text to Speech (TTS) service: it adapts a source TTS model to synthesize a personal voice from a few speech samples of the target speaker. Although the Tacotron 2-based multi-speaker TTS system can implement voice cloning by introducing a d-vector into the speaker encoder, the speaker characteristics described by the d-vector cannot account for the voice information of the entire utterance, which affects the similarity of the cloned voice. As a vocoder, WaveNet sacrifices speech generation speed. To balance model parameters, inference speed, and voice quality, this paper proposes a voice cloning method based on an improved HiFi-GAN. (1) To improve the feature representation ability of the speaker encoder, the x-vector is used as the embedding vector that characterizes the target speaker. (2) To improve the performance of the HiFi-GAN vocoder, the input Mel spectrum is processed by a competitive multiscale convolution strategy. (3) One-dimensional depth-wise separable convolution replaces all standard one-dimensional convolutions, significantly reducing the model parameters and increasing the inference speed. The improved HiFi-GAN model reduces the number of vocoder parameters by about 68.58% and boosts inference speed, which increases by 11.84% on the GPU and 30.99% on the CPU. Voice quality is also marginally improved: MOS increases by 0.13 and PESQ by 0.11. The improved HiFi-GAN model exhibits outstanding performance and remarkable compatibility in the voice cloning task. Combined with the x-vector embedding, the proposed model achieves the highest score across all the models and test sets.
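To make point (3) concrete, the following is a minimal sketch of why replacing a standard one-dimensional convolution with a depth-wise separable one shrinks the parameter count. It uses the usual textbook decomposition (a per-channel depth-wise convolution followed by a 1x1 point-wise convolution); the function names and the layer sizes (256 channels, kernel size 7) are hypothetical and are not taken from the paper. The per-layer saving it computes is not the paper's reported 68.58% figure, which is for the whole vocoder.

```python
def conv1d_params(c_in: int, c_out: int, kernel_size: int, bias: bool = True) -> int:
    """Parameter count of a standard 1-D convolution layer."""
    # Every output channel has one kernel of length kernel_size per input channel.
    return c_in * c_out * kernel_size + (c_out if bias else 0)


def depthwise_separable_conv1d_params(
    c_in: int, c_out: int, kernel_size: int, bias: bool = True
) -> int:
    """Parameter count of a depth-wise conv followed by a 1x1 point-wise conv."""
    # Depth-wise step: one kernel of length kernel_size per input channel.
    depthwise = c_in * kernel_size + (c_in if bias else 0)
    # Point-wise step: a kernel-size-1 convolution that mixes channels.
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


def parameter_reduction(c_in: int, c_out: int, kernel_size: int, bias: bool = True) -> float:
    """Fraction of parameters saved by the separable form for a single layer."""
    standard = conv1d_params(c_in, c_out, kernel_size, bias)
    separable = depthwise_separable_conv1d_params(c_in, c_out, kernel_size, bias)
    return 1.0 - separable / standard


if __name__ == "__main__":
    # Hypothetical layer shape, chosen only for illustration.
    c_in, c_out, k = 256, 256, 7
    print("standard :", conv1d_params(c_in, c_out, k))
    print("separable:", depthwise_separable_conv1d_params(c_in, c_out, k))
    print("saved    : {:.1%}".format(parameter_reduction(c_in, c_out, k)))
```

The saving grows with kernel size and channel count, since the standard layer scales with the product of input channels, output channels, and kernel length, while the separable form scales roughly with the sum of a kernel term and a channel-mixing term.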