Random rules from data streams

Almeida, Ezilda; Kosina, Petr; Gama, João

doi:10.1145/2480362.2480518

Cited by 14 publications

(9 citation statements)

References 6 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…is the predicted severity value in range (0, 1). if Seed detects a drift then 10 Let i be the interval between the most recent two drifts; 11 Input i in V olDect; 12 Compute the drift severity sv and input sv in B; 13 if V olDec detects a volatility change then 14 Let p i be the most recent seen pattern; 15 Let p j be the pattern seen before p i ; 16 if node p i does not exist in N then 17 add node p i to N ; 18 add edge E(p j , p i ) to N ; 19 increment probability of E(p j , p i ); 20 if R(p j , p i ) does not exist then 21 Create empty reservoir R(p j , p i ); 22 Input all instances of B to R(p j , p i ); x is transited from pattern y; 5 Let R(x, y) denotes the mean of R(x, y); 6 Let P rob(y|x) denotes the probability of transiting to y given pattern x; 7 Let be a parameter in range (0, 1); 8 Let Seed.c be the threshold coefficient of Seed detector; 9 begin 10 Normalise all severity samples in R(x, y) in range (0, 1) for any pattern x and y; 11 foreach s 2 S do 12 Input s in Seed; 13 if Seed detects a drift then 14 Let i be the interval between the most recent two drifts; 15 Input i in V olDect; 16 if V olDec detects a volatility change then 17 Let p i be the most recent seen pattern; 18 Let H be the set of nodes where P rob(H j |p i ) > 0;…”

Section: Pressmentioning

confidence: 99%

Learning in the presence of class imbalance and concept drift

et al. 2019

View full text Add to dashboard Cite

Section: Pressmentioning

confidence: 99%

Learning in the presence of class imbalance and concept drift

et al. 2019

View full text Add to dashboard Cite

“…Recent studies show that 2.5 quintillion of bytes are produced every day, and out of that it is estimated that approximately 90% of overall stored data were created between 2012 and 2014 [12]. Since it might be difficult to extract useful knowledge from this abundant data, data mining techniques have been widely used for this task [13,14,8,15].…”

Section: Data Stream Miningmentioning

confidence: 99%

“…where trees tend to grow large, they become hard to understand since nodes appear in a specific context defined by tests at antecedent nodes [15]. In contrast, classifiers based on rules have the advantage of both modularity and interpretability, where each rule is independent of the others and can be interpreted in isolation from any other rules.…”

Section: Decision Rule Learningmentioning

confidence: 99%

A survey on feature drift adaptation: Definition, benchmark, challenges and future directions

Barddal

Gomes

Enembreck

et al. 2017

Journal of Systems and Software

View full text Add to dashboard Cite

Data stream mining is a fast growing research topic due to the ubiquity of data in several real-world problems. Given their ephemeral nature, data stream sources are expected to undergo changes in data distribution, a phenomenon called concept drift. This paper focuses on one specific type of drift that has not yet been thoroughly studied, namely feature drift. Feature drift occurs whenever a subset of features becomes, or ceases to be, relevant to the learning task, thus, learners must detect and adapt to these changes accordingly. We survey existing work on feature drift adaptation in both explicit and implicit approaches. Additionally, we benchmark several algorithms and a naive proposal in synthetic and real-world datasets. The results from our experiments indicate the need for future research in this area as even naive approaches produced gains in accuracy while reducing resources usage. Finally, we state current research topics, challenges and future directions for feature drift adaptation.

show abstract

“…Nowadays, a variety of computational systems, from credit card transactions, through wearable gadgets, to video surveillance, create enormous amounts of data, mostly in sequential fashion. Since this abundant -however raw -data do not provide interesting behavior patterns, data mining techniques, especially inductive learning, have been applied to extract useful knowledge from this type of data [13,10,21,6]. Extracting patterns from data streams and their usage in real-time is an effervescent research topic that has been tackled during the last decades [14,27,4,15,28].…”

Section: Learning From Data Streamsmentioning

confidence: 99%