The classification of seismo-volcanic signals is performed manually at La Soufrière Volcano, which is time-consuming and can be biased by the subjectivity of the operator. We propose here a machine-learning-based model for classification of these signals, to handle large datasets and provide objective and reproducible results. To describe the properties of the signals, we used 102 statistical, entropy, and shape descriptor features computed from the time waveform, the spectrum, and the cepstrum. First, we trained a random forest classifier with a dataset provided by the Observatoire Volcanologique et Sismologique de Guadeloupe that consisted of 845 labeled events recorded from 2013 to 2018: 542 volcano-tectonic (VT), 217 Nested, and 86 long-period (LP). We obtained an overall accuracy of 72%. We determined that the VT class includes a variety of signals that cover the VT, Nested and LP classes. After visual inspection of the waveforms and spectral characteristics of the dataset, we introduced two new classes: Hybrid and Tornillo. A new random forest classifier was trained with this new information, and we obtained a much better overall accuracy of 82%. The model is very good for recognition of all event classes, except Hybrid events (67% accuracy, 70% precision). Hybrid events are often considered to be a mix of VT and LP events. This can be explained by the nature of this class and the physical processes that include both fracturing and resonating components with different modal frequencies. By analyzing the feature weights and by training a model with the most important features, we show that a subset of the 14 best features is sufficient to obtain a performance that is close to that of the model with the whole feature set. However, these best features are different from the 13 best features obtained for another volcano in Peru, with only one feature common to both sets of best features. 
Therefore, either the model is not universal and must be trained separately for each volcano, or it is too specific to the single station used here.
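The comparison between the full feature set and a reduced subset of best features can be illustrated with a minimal sketch. It assumes that the "feature weights" correspond to the impurity-based `feature_importances_` attribute of scikit-learn's `RandomForestClassifier`, which the abstract does not confirm. The data are synthetic (the helper `make_synthetic_features`, the class offsets and the hyperparameters are hypothetical and are not the observatory dataset), so the printed accuracies say nothing about the real results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def make_synthetic_features(n_events=600, n_features=102, n_informative=10, seed=0):
    """Hypothetical stand-in for a real feature matrix (NOT the OVSG data)."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 3, size=n_events)        # three made-up class labels
    X = rng.normal(size=(n_events, n_features))  # pure noise by default
    # Make the first n_informative columns depend on the class label.
    X[:, :n_informative] += 1.5 * y[:, None]
    return X, y


def compare_full_vs_top_k(X, y, k=14, seed=0):
    """Train on all features, then retrain on the k most important ones."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )

    full = RandomForestClassifier(n_estimators=200, random_state=seed)
    full.fit(X_train, y_train)
    acc_full = accuracy_score(y_test, full.predict(X_test))

    # Rank features by impurity-based importance, highest first.
    top_k = np.argsort(full.feature_importances_)[::-1][:k]

    reduced = RandomForestClassifier(n_estimators=200, random_state=seed)
    reduced.fit(X_train[:, top_k], y_train)
    acc_reduced = accuracy_score(y_test, reduced.predict(X_test[:, top_k]))
    return acc_full, acc_reduced, top_k


X, y = make_synthetic_features()
acc_full, acc_reduced, top_k = compare_full_vs_top_k(X, y)
print(f"all features: {acc_full:.2f}, top {len(top_k)} features: {acc_reduced:.2f}")
print("indices of selected features:", sorted(top_k.tolist()))
```

Running the same ranking on feature matrices from two volcanoes and intersecting the two index sets would be one way to check how many best features they share, as in the comparison with the Peruvian volcano mentioned above.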
<p>Seismic activity at La Soufri&#232;re volcano of Guadeloupe is composed of various transient signals, which are classified manually by the Observatoire Volcanologique et Sismologique de Guadeloupe (OVSG-IPGP) using waveforms recorded at several stations. Although five main types of signals are recognized in the data analysis by the observatory (Moretti et al., 2020), only the three main classes that are readily distinguishable on seismic traces during the daily analytical protocol have been catalogued: Volcano-Tectonic (VT) events, Long-Period (LP) events and Nested events, each related to a distinct physical process.</p><p>Automatic classification of the seismo-volcanic signals of La Soufri&#232;re was performed using an architecture based on supervised learning, available at github.com/malfante/AAA. Seismic waveforms are transformed into a large set of features computed from three representation domains of the signal (time, frequency, quefrency), with 34 features for each domain. The resulting feature vectors are then used for the modeling. We use the Random Forest Classifier algorithm from the scikit-learn library.</p><p>First, we trained the model with the dataset provided by the OVSG, which consists of 845 labeled events (542 VT, 217 Nested and 86 LP) recorded in the period 2013-2018. We obtained an average classification rate of 72%. We determined that the VT class includes a variety of signals covering the LP, Nested and VT classes. After reviewing in detail the waveforms and spectral characteristics of the signals belonging to the three classes, we introduced Hybrid events and also defined a monochromatic class of LP signals (so-called Tornillo), thus matching the full description of signals provided in Moretti et al. (2020).</p><p>Then, using the new information, a new model was trained with five classes and tested. We obtained a much better average classification rate of 84%. 
The classification is excellent for Nested events (93% accuracy and precision) and Tornillo events (93% accuracy and precision). The classification of VT events (90% accuracy, 89% precision) and LP events (86% accuracy, 82% precision) was also very good. The most difficult class to recognize is the Hybrid class (64% accuracy, 69% precision). Hybrid events are often confused with VT and LP events. This may be explained by the nature of this class and the physical process that includes both a fracturing and a resonating component with different modal frequencies.</p><p>Machine learning is a powerful tool to handle large datasets. Starting from a manually built dataset, the processing we applied allowed us to obtain a reliable automatic classification by refining the class definitions. This has important implications for observatory data processing during unrest and eruptive activity.</p>
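<p>The transformation of a waveform into features from the three representation domains (time, frequency, quefrency) can be sketched as follows. This is only an illustration under stated assumptions: it computes five example descriptors per domain rather than the 34 used in the study, the function names are hypothetical and are not the API of the github.com/malfante/AAA code, and the input is a synthetic decaying sinusoid rather than a recorded event.</p>

```python
import numpy as np


def shannon_entropy(x, bins=32):
    """Shannon entropy (in bits) of the histogram of x."""
    counts, _ = np.histogram(x, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return float(-np.sum(p * np.log2(p)))


def descriptors(x):
    """A few statistical, shape and entropy descriptors of a 1-D array."""
    mu, sigma = x.mean(), x.std()
    z = (x - mu) / sigma if sigma > 0 else np.zeros_like(x)
    return {
        "mean": mu,
        "std": sigma,
        "skewness": np.mean(z ** 3),  # third standardized moment
        "kurtosis": np.mean(z ** 4),  # fourth standardized moment (not excess)
        "entropy": shannon_entropy(x),
    }


def three_domains(waveform):
    """Return the time, frequency and quefrency representations."""
    spectrum = np.abs(np.fft.rfft(waveform))
    # Real cepstrum: inverse FFT of the log amplitude spectrum.
    # The small constant avoids log(0); its value is an arbitrary choice.
    cepstrum = np.fft.irfft(np.log(spectrum + 1e-12), n=len(waveform))
    return {"time": waveform, "frequency": spectrum, "quefrency": cepstrum}


def feature_vector(waveform):
    """Concatenate the descriptors of all three domains into one dict."""
    feats = {}
    domains = three_domains(np.asarray(waveform, dtype=float))
    for domain, representation in domains.items():
        for name, value in descriptors(representation).items():
            feats[f"{domain}_{name}"] = float(value)
    return feats


# Synthetic test signal: a decaying 5 Hz sinusoid with a little noise.
fs = 100.0  # sampling rate in Hz (assumed for the example)
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(0)
signal = np.exp(-0.5 * t) * np.sin(2 * np.pi * 5 * t) + 0.01 * rng.normal(size=t.size)

features = feature_vector(signal)
print(len(features), "features")
```

<p>Each event then yields one fixed-length vector, which is the form expected by a classifier such as scikit-learn's random forest.</p>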