Proceedings of the 28th Annual ACM Symposium on Applied Computing 2013
DOI: 10.1145/2480362.2480518
Random rules from data streams

Abstract: Existing work suggests that random inputs and random features produce good results in classification. In this paper we study the problem of generating random rule sets from data streams. One of the most interpretable and flexible models for data stream mining prediction tasks is the Very Fast Decision Rules learner (VFDR). In this work we extend the VFDR algorithm using random rules from data streams. The proposed algorithm generates several sets of rules. Each rule set is associated with a set of Natt attribu…
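The random-rules idea in the abstract — several rule sets, each associated with its own subset of Natt attributes — can be sketched roughly as follows. This is a minimal illustration, not the paper's exact procedure; the function name, the uniform sampling, and the parameter values are assumptions:

```python
import random

def make_rule_sets(n_sets, all_attrs, n_att, seed=0):
    """Draw one random attribute subset per rule set, mimicking the
    'each rule set is associated with N_att attributes' idea (sketch)."""
    rng = random.Random(seed)
    return [sorted(rng.sample(all_attrs, n_att)) for _ in range(n_sets)]

# Three rule sets, each restricted to 4 of 10 available attributes.
subsets = make_rule_sets(n_sets=3, all_attrs=list(range(10)), n_att=4)
```

Each rule set would then grow its rules (e.g. with VFDR) while only seeing its own attribute subset, which is what injects the randomness into the ensemble.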

Cited by 14 publications (9 citation statements); references 6 publications.
“…is the predicted severity value in range (0, 1).
if Seed detects a drift then
    Let i be the interval between the most recent two drifts;
    Input i in VolDect;
    Compute the drift severity sv and input sv in B;
    if VolDect detects a volatility change then
        Let p_i be the most recent seen pattern;
        Let p_j be the pattern seen before p_i;
        if node p_i does not exist in N then
            add node p_i to N;
        add edge E(p_j, p_i) to N;
        increment probability of E(p_j, p_i);
        if R(p_j, p_i) does not exist then
            Create empty reservoir R(p_j, p_i);
        Input all instances of B to R(p_j, p_i);
… x is transited from pattern y;
Let R̄(x, y) denote the mean of R(x, y);
Let Prob(y|x) denote the probability of transiting to y given pattern x;
Let be a parameter in range (0, 1);
Let Seed.c be the threshold coefficient of the Seed detector;
begin
    Normalise all severity samples in R(x, y) in range (0, 1) for any patterns x and y;
    foreach s ∈ S do
        Input s in Seed;
        if Seed detects a drift then
            Let i be the interval between the most recent two drifts;
            Input i in VolDect;
            if VolDect detects a volatility change then
                Let p_i be the most recent seen pattern;
                Let H be the set of nodes where Prob(H_j|p_i) > 0; …”
Section: Press (mentioning)
confidence: 99%
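The quoted algorithm maintains a fixed-size reservoir R(p_j, p_i) of severity samples per pattern transition and uses its mean. A self-contained sketch of such a reservoir follows; this is standard reservoir sampling (Algorithm R), and the class name, capacity, and demo values are assumptions, not taken from the cited work:

```python
import random

class Reservoir:
    """Fixed-size uniform sample over a stream of severity values
    (standard Algorithm R), as the R(p_j, p_i) reservoirs would need."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []      # the retained sample
        self.seen = 0        # how many stream items were offered
        self.rng = random.Random(seed)

    def add(self, x):
        """Offer one stream item; keep it with probability capacity/seen."""
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(x)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = x

    def mean(self):
        """Mean of the retained sample (the R̄(x, y) used above)."""
        return sum(self.items) / len(self.items) if self.items else 0.0

# Feed 1000 severity samples in (0, 1) into a capacity-50 reservoir.
res = Reservoir(capacity=50)
for k in range(1000):
    res.add(k / 1000.0)
```

The reservoir keeps memory bounded per transition edge while still giving an unbiased estimate of the severity distribution, which is why it fits the streaming setting the citing paper targets.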
“…Recent studies show that 2.5 quintillion bytes of data are produced every day, and it is estimated that approximately 90% of all stored data were created between 2012 and 2014 [12]. Since it can be difficult to extract useful knowledge from such abundant data, data mining techniques have been widely used for this task [13,14,8,15].…”
Section: Data Stream Mining (mentioning)
confidence: 99%
“…where trees tend to grow large, they become hard to understand, since each node appears in a specific context defined by the tests at its antecedent nodes [15]. In contrast, rule-based classifiers have the advantage of both modularity and interpretability: each rule is independent of the others and can be interpreted in isolation.…”
Section: Decision Rule Learning (mentioning)
confidence: 99%
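The modularity argument (a rule can be read on its own, whereas a tree node only makes sense given the tests above it) can be shown with a toy rule set. The attributes, thresholds, and labels below are invented purely for illustration:

```python
# Each rule is a self-contained (condition, label) pair; reading one
# rule requires no knowledge of the others, unlike a path in a tree.
rules = [
    (lambda x: x["humidity"] > 80, "rain"),
    (lambda x: x["temp"] > 30, "hot"),
]

def predict(x, default="none"):
    """Return the label of the first rule that fires, else a default."""
    for cond, label in rules:
        if cond(x):
            return label
    return default
```

Dropping or editing one rule leaves every other rule's meaning intact, which is exactly the modularity property the passage contrasts with decision trees.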
“…Nowadays, a variety of computational systems, from credit card transactions, through wearable gadgets, to video surveillance, create enormous amounts of data, mostly in sequential fashion. Since this abundant but raw data does not directly reveal interesting behavior patterns, data mining techniques, especially inductive learning, have been applied to extract useful knowledge from it [13,10,21,6]. Extracting patterns from data streams and using them in real time is an active research topic that has been tackled over the last decades [14,27,4,15,28].…”
Section: Learning From Data Streams (mentioning)
confidence: 99%