To address the low signal-to-noise ratio encountered when extracting underwater acoustic signals against a background of strong interference, a sparse decomposition-based denoising method for underwater acoustic signals is proposed. The main work is as follows. First, the signal is decomposed by singular value decomposition into a complete dictionary that reflects the characteristics of the signal environment. Then, cyclic shifts are used to construct a signal matrix and an initial dictionary, and a new overcomplete dictionary is obtained through training and updating. An adaptive orthogonal matching pursuit method is applied to correlate and orthogonalize the atoms in the dictionary matrix. Finally, the underwater acoustic signal is reconstructed as a linear combination of the atoms that best reflect its characteristic information, which achieves the denoising. Denoising of noisy underwater acoustic signals is simulated and compared with traditional filtering methods. The simulation results show that sparse decomposition and reconstruction of the original underwater acoustic signal improve its signal-to-noise ratio, and that the method retains a certain denoising ability under strong noise and various types of reverberation interference.
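The pipeline outlined in the abstract above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation and rests on several assumptions. The dictionary is taken directly from the left singular vectors of the cyclic-shift signal matrix, with no training or updating step. Plain orthogonal matching pursuit with a fixed number of atoms stands in for the adaptive variant. The test signal is a synthetic sinusoid in white Gaussian noise rather than real underwater data. All function names and parameter values (`n_shifts`, `n_atoms`) are hypothetical.

```python
import numpy as np

def cyclic_shift_matrix(x, n_shifts):
    """Stack circular shifts of x as columns (shape: len(x) x n_shifts)."""
    return np.stack([np.roll(x, k) for k in range(n_shifts)], axis=1)

def omp(D, y, n_atoms):
    """Plain orthogonal matching pursuit with a fixed number of atoms.

    D is assumed to have unit-norm columns. At each step the atom most
    correlated with the residual is added to the support, and the
    coefficients on the support are refit by least squares.
    """
    residual = y.astype(float).copy()
    support = []
    coef = np.zeros(D.shape[1])
    for _ in range(n_atoms):
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break
        support.append(k)
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
        coef[:] = 0.0
        coef[support] = sol
    return coef

def denoise(noisy, n_shifts=64, n_atoms=2):
    """Assumed simplification: SVD-derived dictionary, then sparse reconstruction."""
    X = cyclic_shift_matrix(noisy, n_shifts)
    # Left singular vectors serve as an orthonormal dictionary (no training step).
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    coef = omp(U, noisy, n_atoms)
    return U @ coef

def snr_db(clean, estimate):
    """SNR of an estimate relative to the known clean signal, in dB."""
    return 10 * np.log10(np.sum(clean**2) / np.sum((clean - estimate) ** 2))

# Synthetic example: a sinusoid buried in white Gaussian noise.
rng = np.random.default_rng(0)
n = 256
t = np.arange(n)
clean = np.sin(2 * np.pi * 8 * t / n)
noisy = clean + rng.normal(0.0, 1.0, n)

denoised = denoise(noisy)
snr_in = snr_db(clean, noisy)
snr_out = snr_db(clean, denoised)
print(f"SNR before: {snr_in:.2f} dB, after: {snr_out:.2f} dB")
```

Because the dictionary here is orthonormal, the sparse reconstruction reduces to projecting the noisy signal onto a few selected singular vectors. A trained overcomplete dictionary, as described in the abstract, would behave differently.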
Lip reading aims to recognize text from a talking face without audio information. Owing to the rapid development of deep learning techniques, researchers have made major breakthroughs in both word-level and sentence-level English lip reading in recent years. Unlike English, Chinese is a tonal language, which makes it difficult to distinguish lexical meanings. In addition, most existing Chinese lip reading datasets are designed for Mandarin, and few are available for Cantonese. In this paper, we propose a word-level Cantonese lip reading dataset called CLRW, which contains 800 word classes with 400,000 samples. To better support practical applications, we place no restrictions on gender, age, posture, lighting conditions, or speech speed, which brings CLRW closer to the real-scene distribution. We first give a detailed description of the data collection process. Next, we propose a novel two-branch network named TBGL, which consists of a global branch and a local branch. The global branch models the whole lip, and the local branch divides the feature into three parts to focus on subtle local lip motion. We jointly train the two branches and achieve comparable performance on LRW, CAS-VSR-W1K, and CLRW. Finally, we benchmark our dataset and perform a comprehensive analysis of the results, which demonstrates that CLRW is challenging and will have a positive impact on future Cantonese lip reading work.
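The global/local split described in the abstract above can be pictured with a small NumPy sketch. It only illustrates the idea of pooling the whole feature map versus pooling three separate parts. It is not the TBGL architecture. The abstract does not say along which axis the feature is divided, how each part is processed, or how the branches are fused, so the split along height, the average pooling, and the concatenation below are all assumptions, and the function names are hypothetical.

```python
import numpy as np

def global_branch(feat):
    """Pool over the whole spatial map: (T, C, H, W) -> (T, C)."""
    return feat.mean(axis=(2, 3))

def local_branch(feat, n_parts=3):
    """Split along height (an assumed axis) into n_parts stripes and
    average-pool each stripe: (T, C, H, W) -> (T, n_parts, C)."""
    parts = np.array_split(feat, n_parts, axis=2)
    return np.stack([p.mean(axis=(2, 3)) for p in parts], axis=1)

# Stand-in feature map: 5 frames, 8 channels, 6x6 spatial grid.
rng = np.random.default_rng(0)
feat = rng.normal(size=(5, 8, 6, 6))

glob = global_branch(feat)   # one descriptor per frame
loc = local_branch(feat)     # three part descriptors per frame

# Assumed fusion: place the global descriptor alongside the three local ones.
fused = np.concatenate([glob[:, None, :], loc], axis=1)  # (T, 4, C)
```

In a trained network each part would typically pass through its own learned layers before fusion. The sketch only shows how the tensor shapes relate.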