Acoustic data provide scientific and engineering insights in fields ranging from biology and communications to ocean and Earth science. We survey the recent advances and transformative potential of machine learning (ML), including deep learning, in the field of acoustics. ML is a broad family of statistical techniques for automatically detecting and utilizing patterns in data. Relative to conventional acoustics and signal processing, ML is data-driven. Given sufficient training data, ML can discover complex relationships between features and desired labels or actions, or between features themselves. With large volumes of training data, ML can discover models describing complex acoustic phenomena such as human speech and reverberation. ML in acoustics is rapidly developing with compelling results and significant future promise. We first introduce ML, then highlight ML developments in five acoustics research areas: source localization in speech processing, source localization in ocean acoustics, bioacoustics, seismic exploration, and environmental sounds in everyday scenes.
Machine learning classifiers are shown to outperform conventional matched field processing on a deep-water (600 m depth) ocean acoustic ship range estimation problem from the Santa Barbara Channel Experiment when only limited environmental information is available. Recordings of three different ships of opportunity on a vertical array were used as training and test data for feed-forward neural network and support vector machine classifiers, demonstrating the feasibility of machine learning methods for locating unseen sources. The classifiers perform well up to 10 km range, whereas conventional matched field processing fails at about 4 km range without accurate environmental information.
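Treating range estimation as a classification problem implies that continuous source range is mapped to discrete class labels and back. The abstract does not say how this is done, so the following is only a minimal sketch under assumed settings: the function names, the 0 to 10 km span, and the 100 equal-width bins are all hypothetical choices for illustration, not the authors' configuration.

```python
import numpy as np


def range_to_class(range_km, r_min=0.0, r_max=10.0, n_classes=100):
    """Map source ranges (km) to integer class labels using equal-width bins.

    The span and bin count are illustrative assumptions, not values from the paper.
    """
    edges = np.linspace(r_min, r_max, n_classes + 1)
    idx = np.digitize(range_km, edges) - 1
    # Ranges outside [r_min, r_max) are assigned to the nearest end bin.
    return np.clip(idx, 0, n_classes - 1)


def class_to_range(class_idx, r_min=0.0, r_max=10.0, n_classes=100):
    """Map predicted class labels back to ranges (km) at the bin centers."""
    width = (r_max - r_min) / n_classes
    return r_min + (np.asarray(class_idx) + 0.5) * width
```

With this kind of mapping, any classifier that outputs a class index yields a range estimate whose resolution is limited by the bin width (0.1 km under the assumed settings).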
This paper examines the relationship between conventional beamforming and linear supervised learning, then develops a nonlinear deep feed-forward neural network (FNN) for direction-of-arrival (DOA) estimation. First, conventional beamforming is reformulated as a real-valued, linear inverse problem in the weight space, which is compared to a support vector machine and a linear FNN model. In the linear formulation, DOA is quickly and accurately estimated for a realistic array calibration example. Then, a nonlinear FNN is developed for two-source DOA and for K-source DOA, where K is unknown. Two training methodologies are used: exhaustive training for controlled accuracy and random training for flexibility. The number of hidden layers, the number of hidden nodes, and the activation functions of the FNN model are selected using a hyperparameter search. In plane wave simulations, the 2-source FNN resolved incoherent sources with 1° resolution using a single snapshot, similar to Sparse Bayesian Learning (SBL). With multiple snapshots, the K-source FNN achieved resolution and accuracy similar to Multiple Signal Classification and SBL for an unknown number of sources. The practicality of the deep FNN model is demonstrated on Swellex96 experimental data for multiple-source DOA estimation on a horizontal acoustic array.
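For orientation, the conventional beamforming baseline that this abstract starts from can be sketched as a grid search over candidate angles. This is a textbook plane-wave beamformer, not the paper's real-valued weight-space reformulation or its FNN. The uniform linear array, the half-wavelength sensor spacing, the angle grid, and all function names are assumptions made for illustration.

```python
import numpy as np


def steering_vector(theta_deg, n_sensors, spacing_wavelengths=0.5):
    """Plane-wave steering vector for a uniform linear array (assumed geometry)."""
    n = np.arange(n_sensors)
    phase = 2j * np.pi * spacing_wavelengths * n * np.sin(np.deg2rad(theta_deg))
    return np.exp(phase)


def cbf_spectrum(snapshots, grid_deg, spacing_wavelengths=0.5):
    """Conventional beamformer output power on a grid of candidate angles.

    snapshots: complex array of shape (n_sensors, n_snapshots).
    """
    n_sensors, n_snapshots = snapshots.shape
    # Sample covariance matrix averaged over the available snapshots.
    scm = snapshots @ snapshots.conj().T / n_snapshots
    power = np.empty(len(grid_deg))
    for i, theta in enumerate(grid_deg):
        w = steering_vector(theta, n_sensors, spacing_wavelengths) / n_sensors
        power[i] = np.real(w.conj() @ scm @ w)
    return power


def estimate_doa(snapshots, grid_deg, spacing_wavelengths=0.5):
    """Return the grid angle (degrees) that maximizes the beamformer power."""
    power = cbf_spectrum(snapshots, grid_deg, spacing_wavelengths)
    return grid_deg[np.argmax(power)]
```

As a usage example, a noise-free single snapshot can be built with `steering_vector(20.0, 20).reshape(-1, 1)` and passed to `estimate_doa` together with a grid such as `np.arange(-90.0, 91.0, 1.0)`. The single-peak search handles one source only; resolving two or more closely spaced sources is where the abstract's nonlinear FNN, MUSIC, and SBL comparisons come in.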