Defining and extracting actionable anomalous discords, i.e. pattern outliers, is a challenging problem in data analysis. It raises the crucial question of which criteria make one discord more insightful than another. In this paper, we address this question by introducing the concept of a prominent discord. The core idea behind this new concept is to identify dependencies among discords of varying lengths. To decide which discords are prominent, we propose an ordering relation that ranks discords, and we seek a set of prominent discords with respect to this ordering. Our contributions are 1) a formal definition, an ordering relation and methods to derive prominent discords based on Matrix Profile techniques, and 2) their evaluation over large contextual climate data, covering 110 years of monthly data. The approach is generic, and its pertinence is shown over historical climate data.
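As an illustration of the ideas above, the following is a minimal sketch, not the authors' implementation. It assumes a brute-force Matrix Profile (z-normalized Euclidean distance to the nearest non-trivial neighbour), takes the top discord at a given length as the subsequence with the largest profile value, and counts subsumption as plain interval containment. The function names, the half-length exclusion zone and the containment rule are all assumptions made for this example; the formal ordering relation of the paper is not reproduced here.

```python
import numpy as np

def znorm(x):
    """Z-normalize a subsequence; a constant subsequence maps to zeros."""
    sd = x.std()
    return (x - x.mean()) / sd if sd > 0 else np.zeros_like(x)

def naive_matrix_profile(ts, m):
    """Brute-force matrix profile: for every length-m subsequence, the
    distance to its nearest neighbour outside an exclusion zone."""
    ts = np.asarray(ts, dtype=float)
    n = len(ts) - m + 1
    subs = np.array([znorm(ts[i:i + m]) for i in range(n)])
    # All pairwise Euclidean distances (O(n^2 * m) memory, fine for a sketch).
    dist = np.sqrt(((subs[:, None, :] - subs[None, :, :]) ** 2).sum(axis=2))
    excl = max(1, m // 2)  # assumed exclusion zone to skip trivial matches
    for i in range(n):
        dist[i, max(0, i - excl):min(n, i + excl + 1)] = np.inf
    return dist.min(axis=1)

def top_discord(ts, m):
    """Return the top discord of length m as a half-open interval [start, end)."""
    mp = naive_matrix_profile(ts, m)
    mp = np.where(np.isfinite(mp), mp, -np.inf)  # ignore rows with no neighbour
    start = int(np.argmax(mp))
    return (start, start + m)

def subsumption_counts(discords):
    """For each discord interval, count how many other discords it contains."""
    return {
        a: sum(1 for b in discords
               if b != a and a[0] <= b[0] and b[1] <= a[1])
        for a in discords
    }

# Synthetic monthly-like series: a 12-step seasonal cycle with a flat anomaly.
t = np.arange(240)
series = np.sin(2 * np.pi * t / 12)
series[100:106] = 0.0

# One top discord per candidate length, then count containment among them.
discords = [top_discord(series, m) for m in (8, 12, 16)]
counts = subsumption_counts(discords)
```

In this toy setting, discords with higher subsumption counts would be the candidates for prominence. For real series, an optimized Matrix Profile implementation would replace the quadratic-memory computation used here.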
In this paper we are interested in identifying insightful changes in climate observation series through outlier detection techniques. Discords are outliers that span an interval of the time series instead of a single point. The choice of the discord length can be critical, which has led to work on computing variable-length discords. This increases the number of discords, which may overlap or subsume one another, and can make the results less insightful. In this work we introduce a hybrid approach that ranks variable-length discords and extracts the most prominent ones, which can yield more impactful results. We propose a ranking function over the extracted variable-length discords that accounts for the point anomalies they contain. We investigate the combination of pattern-wise anomaly detection, through the Matrix Profile paradigm, with two different point-wise anomaly detectors. To extract point anomalies, we experimented with the MAD and PROPHET algorithms, which are based on different concepts. We tested our approach on climate observations representing monthly runoff time series between 1902 and 2005 over the West African region. Experimental results indicate that PROPHET combined with the Matrix Profile method yields higher-quality rankings, through the extraction of higher values of extreme events within the variable-length discords.
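The hybrid idea described above can be sketched as follows. This is a hedged illustration only: it assumes MAD refers to the median absolute deviation, uses the common modified z-score with the conventional constant 0.6745 and a threshold of 3.5 (both assumptions, not values from the paper), and ranks discords simply by the number of point anomalies they contain, breaking ties in favour of shorter discords. The actual ranking function of the paper is not specified here, and the names `mad_anomalies` and `rank_discords` are hypothetical.

```python
import numpy as np

def mad_anomalies(ts, threshold=3.5):
    """Indices of point anomalies according to a MAD-based modified z-score.

    The 0.6745 scaling and the 3.5 threshold are conventional defaults,
    assumed here for illustration.
    """
    ts = np.asarray(ts, dtype=float)
    med = np.median(ts)
    mad = np.median(np.abs(ts - med))
    if mad == 0:
        # Degenerate case: no spread around the median, nothing is flagged.
        return np.array([], dtype=int)
    score = 0.6745 * np.abs(ts - med) / mad
    return np.flatnonzero(score > threshold)

def rank_discords(discords, anomaly_idx):
    """Rank half-open discord intervals [start, end) by contained anomalies.

    Sort order: more contained point anomalies first, then shorter length.
    """
    anomaly_idx = np.asarray(anomaly_idx)
    scored = []
    for start, end in discords:
        inside = (anomaly_idx >= start) & (anomaly_idx < end)
        scored.append(((start, end), int(inside.sum())))
    return sorted(scored, key=lambda p: (-p[1], p[0][1] - p[0][0]))

# Toy series with two extreme values, and three candidate discord intervals.
series = [1, 2, 1, 2, 1, 2, 50, 1, 2, 1, 2, -40, 2, 1]
points = mad_anomalies(series)
ranking = rank_discords([(0, 5), (5, 8), (4, 12)], points)
```

Any point-wise detector that returns anomaly indices, such as a forecast-residual detector built on PROPHET, could be substituted for `mad_anomalies` without changing the ranking step.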
<p>Outlier detection generally aims at identifying extreme events and insightful changes in climate behavior. One important type of outlier is the pattern outlier, also called a discord, where the detected outlier pattern covers a time interval instead of a single point in the time series. Machine learning contributes many algorithms and methods to this field, especially unsupervised algorithms for different types of time series data. In a first submitted paper, we investigated discord detection applied to climate-related impact observations. We introduced the notion of prominent discord, a contextual concept that derives a set of insightful discords by identifying dependencies among variable-length discords, and in which discords are ordered by the number of discords they subsume.</p><p>Following this study, here we propose a ranking function based on the length of the first subsumed discord and the total length of the prominent discord, and make use of the powerful Matrix Profile technique. Preliminary results show that our approach, applied to monthly runoff time series between 1902 and 2005 over West Africa, detects both the emergence of long-term change with the associated former climate regime, and the regional driest decade (1982-1992) of the 20th century (i.e. a climate extreme event). To demonstrate the genericity of our method and the multiple insights it provides, we go further by evaluating the approach on other impact (e.g. crop data, fires, water storage) and climate (precipitation and temperature) observations, to provide similar results on different variables, extract relationships among them and identify what constitutes a prominent discord in such cases. A further step will consist in evaluating our methodology on historical climate and impact simulations, to determine whether prominent discords highlighted in observations can be captured by climate and impact models.</p>