Proceedings of the 2013 SIAM International Conference on Data Mining 2013
DOI: 10.1137/1.9781611972832.24

Efficient Selection of Globally Optimal Rules on Large Imbalanced Data Based on Rule Coverage Relationship Analysis

Abstract: Rule-based anomaly and fraud detection systems often suffer from massive numbers of false alerts when applied to a huge number of enterprise transactions. A crucial and challenging problem is to effectively select a globally optimal rule set that can capture very rare anomalies dispersed in large-scale background transactions. Existing rule selection methods, which suffer significantly from complex rule interactions and overlaps in large imbalanced data, often lead to a very high false positive rate. In this paper, we analy…

Cited by 8 publications (6 citation statements)
References: 7 publications
“…In [12], several new types of patterns were identified, including pair patterns, cluster patterns, and approaches to quantify combined patterns such as combined association rules [71,72]. In [32], the interactions between rules are analyzed to avoid the duplication caused by multiple rules with overlapping functions while maintaining the detection quality of rule-based risk management. (6) Testing coupled data.…”
Section: Discussion (mentioning)
confidence: 99%
“…In fact, real-world data sets are all embedded with more or less couplings. In our practice, we have tested couplings in real-world business, including capital markets [10,8,54], online banking business [32], recommender systems including social media [33], and web text [17]. We have shown that even a very commonly used data set, or the highly manipulated UCI data, can be used for coupling learning evaluation, as we showed in [59,60,61] for coupled clustering and coupled ensemble clustering, although the couplings may not be very strong, and the performance difference by incorporating couplings may not be very obvious.…”
Section: Discussion (mentioning)
confidence: 99%
“…Li et al [26] proposed a solution for the selection of globally optimal (business) rules for detecting fraud. The proposed MCGminer algorithm is based on the Max Coverage Gain metric, which scores how good a rule performs globally.…”
Section: Fraud Discovery Approaches (mentioning)
confidence: 99%
“…These techniques are general and can be widely used and expanded for analyzing complex behavioral and social problems. Interested readers can find a detailed introduction from the cited references and also find some of our other efforts on quantifying similarity in categorical [37] and numerical objects [41], coupled clustering by incorporating coupled object similarity [37], analyzing non-IIDness at the method level to explore couplings between clusterings for coupled ensemble clustering [42] and considering the relations between patterns [28] and rules [43] for pattern relation analysis.…”
Section: Non-IIDness Learning Case Studies (mentioning)
confidence: 99%
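
Several of the statements above describe selecting rules by how much they add globally, given that rules overlap in the transactions they flag. The sketch below illustrates that general idea only. It is a generic greedy coverage-gain heuristic, not the MCGminer algorithm and not the paper's definition of Max Coverage Gain, neither of which is given here. The function name `greedy_rule_selection`, the `fp_weight` penalty, and the toy data are all hypothetical.

```python
def greedy_rule_selection(rule_coverage, fraud_ids, max_rules, fp_weight=1.0):
    """Greedily pick rules by marginal gain over what is already covered.

    rule_coverage: dict mapping a rule name to the set of transaction ids it flags.
    fraud_ids: set of transaction ids known to be fraudulent.
    fp_weight: assumed penalty per newly flagged legitimate transaction.
    NOTE: an illustrative scoring function, not the paper's metric.
    """
    selected, covered = [], set()
    remaining = dict(rule_coverage)
    while remaining and len(selected) < max_rules:
        best_rule, best_gain = None, 0.0
        for name in sorted(remaining):  # sorted for deterministic tie-breaking
            new = remaining[name] - covered  # only count what other rules missed
            true_pos = len(new & fraud_ids)
            false_pos = len(new) - true_pos
            gain = true_pos - fp_weight * false_pos
            if gain > best_gain:
                best_rule, best_gain = name, gain
        if best_rule is None:  # no remaining rule adds positive gain
            break
        selected.append(best_rule)
        covered |= remaining.pop(best_rule)
    return selected, covered


# Toy data: four overlapping rules, five fraudulent transactions.
rules = {
    "r1": {1, 2, 3, 10},
    "r2": {2, 3, 11, 12, 13},
    "r3": {4, 5},
    "r4": {1, 2, 3, 4, 20, 21, 22, 23, 24},
}
fraud = {1, 2, 3, 4, 5}
selected, covered = greedy_rule_selection(rules, fraud, max_rules=3, fp_weight=0.5)
print(selected, sorted(covered))
```

Because each rule is scored only on transactions not already covered, a rule that looks strong in isolation (such as `r4` here) is skipped once the rules it overlaps with have been chosen.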