Real-time ranking with concept drift using expert advice

Abstract: Power companies can benefit from the use of knowledge discovery methods and statistical machine learning for preventive maintenance. We introduce a general process for transforming historical electrical grid data into models that aim to predict the risk of failures for components and systems. These models can be used directly by power companies to assist with prioritization of maintenance and repair work. Specialized versions of this process are used to produce 1) feeder failure rankings, 2) cable, joint, terminator and transformer rankings, 3) feeder MTBF (Mean Time Between Failure) estimates and 4) manhole events vulnerability rankings. The process in its most general form can handle diverse, noisy sources that are historical (static), semi-real-time, or real-time, incorporates state-of-the-art machine learning algorithms for prioritization (supervised ranking or MTBF), and includes an evaluation of results via cross-validation and blind test. Above and beyond the ranked lists and MTBF estimates are business management interfaces that allow the prediction capability to be integrated directly into corporate planning and decision support; such interfaces rely on several important properties of our general modeling approach: that machine learning features are meaningful to domain experts, that the processing of data is transparent, and that prediction results are accurate enough to support sound decision making. We discuss the challenges in working with historical electrical grid data that were not designed for predictive purposes. The "rawness" of these data contrasts with the accuracy of the statistical models that can be obtained from the process; these models are sufficiently accurate to assist in maintaining New York City's electrical grid.

Index Terms: applications of machine learning, electrical grid, smart grid, knowledge discovery, supervised ranking, computational sustainability, reliability

“…To address this challenge, ODDS creates a new model every 4 hours on the current dataset. (See also [20,21,22].)…”

Section: Feeder Ranking in NYC (mentioning)

confidence: 99%

Machine Learning for the New York City Power Grid

Rudin

Waltz

Anderson

et al. 2012

IEEE Trans. Pattern Anal. Mach. Intell.

223

102

“…The paper proposes nine key factors to be considered by any search engine while ranking research papers. Becker [6] proposes a weighted majority algorithm, to rank electrical feeders based on their susceptibility to failure with real-time time-varying data gathered from electricity distribution system. As per the survey performed by Gama [7], concept-drift adaptation technique adopted in one application domain varies from the one adopted for another.…”

Section: Related Work (mentioning)

confidence: 99%

Identifying Concept-drift in Twitter Streams

Lifna

Vijayalakshmi

2015

Procedia Computer Science

We live in a Big Data society, where data is exchanged much like currency. The data we produce affords us access to different applications, benefits, services, and deliveries. Today, communication takes place mainly through social networking sites such as Twitter, Facebook, and Google+. The huge amount of data generated and shared across these micro-blogging sites serves as a good source of Big Data streams for analysis. Because the topic of discussion changes drastically, the relevance of the data is temporal, which leads to concept-drift. Identifying and handling this concept-drift in such Big Data streams is a current area of interest. State-of-the-art techniques for identifying trending topics in such data streams concentrate mainly on the frequency of the topic as the key parameter. Relying on such a weak indicator reduces the precision of mining. This study puts forward a novel approach to identifying concept-drift by first grouping topics into classes and assigning a weight to each class, using a sliding-window processing model on Twitter streams.

“…Given this, and the inability to exactly compute a classifier's expected error, they propose a weight estimation procedure based on the classifier's performance on the previous batch. Two other approaches to weighting are due to Kolter and Maloof [50][51][52] and Becker and Arias [5]. In their weighting schemes, classifiers have their weights updated based on a constant multiplicative factor.…”

Section: Accuracy Weighted Ensembles (mentioning)

confidence: 99%
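The weighting scheme described in the excerpt above, in which weights are updated by a constant multiplicative factor, can be illustrated with a minimal sketch. This is not the code of Becker and Arias or of Kolter and Maloof. The function names, the default factor `beta = 0.5`, the renormalization step, and the use of a weighted score average to produce a ranking are all assumptions made for illustration.

```python
from typing import List, Sequence


def update_weights(weights: Sequence[float],
                   mistakes: Sequence[bool],
                   beta: float = 0.5) -> List[float]:
    """Penalize each expert that erred by a constant factor beta.

    Hypothetical helper: beta and the renormalization are illustrative
    assumptions, not values taken from the cited papers.
    """
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    if len(weights) != len(mistakes):
        raise ValueError("weights and mistakes must have the same length")
    # Multiply the weight of every expert that made a mistake by beta.
    penalized = [w * beta if erred else w
                 for w, erred in zip(weights, mistakes)]
    # Renormalize so the weights sum to 1.
    total = sum(penalized)
    return [w / total for w in penalized]


def combined_scores(weights: Sequence[float],
                    expert_scores: Sequence[Sequence[float]]) -> List[float]:
    """Weighted average of the experts' scores for each item."""
    n_items = len(expert_scores[0])
    return [sum(w * scores[i] for w, scores in zip(weights, expert_scores))
            for i in range(n_items)]


def rank_items(scores: Sequence[float]) -> List[int]:
    """Return item indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

In this sketch, an expert that keeps making mistakes loses influence geometrically, so the combined ranking shifts toward experts that fit the most recent data, which is one simple way to react to concept drift.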

Learning from streaming data with concept drift and imbalance: an overview

2012

The primary focus of machine learning has traditionally been on learning from data assumed to be sufficient and representative of the underlying fixed, yet unknown, distribution. Such restrictions on the problem domain paved the way for development of elegant algorithms with theoretically provable performance guarantees. As is often the case, however, real-world problems rarely fit neatly into such restricted models. For instance, class distributions are often skewed, resulting in the "class imbalance" problem. Data drawn from non-stationary distributions is also common in real-world applications, resulting in the "concept drift" or "non-stationary learning" problem, which is often associated with streaming data scenarios. Recently, these problems have independently experienced increased research attention; however, the combined problem of addressing all of the above-mentioned issues has enjoyed relatively little research. If the ultimate goal of intelligent machine learning algorithms is to be able to address a wide spectrum of real-world scenarios, then the need for a general framework for learning from, and adapting to, a non-stationary environment that may introduce imbalanced data can hardly be overstated. In this paper, we first present an overview of each of these challenging areas, followed by a comprehensive review of recent research for developing such a general framework.

Real-time ranking with concept drift using expert advice

Cited by 25 publications

References 42 publications
