2020
DOI: 10.1109/access.2020.2991328

A Constant Time Complexity Spam Detection Algorithm for Boosting Throughput on Rule-Based Filtering Systems

Abstract: Along with the barbarous growth of spam, anti-spam technologies, including rule-based approaches and machine learning, thrive rapidly as well. In the antispam industry, rule-based systems (RBS) have become the most prominent method for fighting spam due to their capability to enrich and update rules remotely. However, antispam filtering throughput is always a great challenge for RBS. In particular, the explosive spread of obfuscated words leads to frequent rule updates and extensive rule vocabulary expansion. Th…

Citations: cited by 21 publications (10 citation statements)
References: 21 publications
“…They usually use dynamic Markov compression [20] and partial matching [21] to evaluate the model. Rule-based filtering systems [22] using behavioral methods or linguistic methods have also been proposed to score and classify texts. Deep Learning [23,24,25] is also beginning to be used to detect spam and algorithms such as CNN [26] and LSTM [27] are gaining ground.…”
Section: Results (mentioning)
confidence: 99%
“…It works using text categorization, and in recent times various machine learning techniques have been applied to text categorization or anti-spam filtering, such as Rule Learning (Cohen, 1996) [27], Naïve Bayes (Sahami et al., 1998; Androutsopoulos et al., 2000; Rennie, 2000) [5, 109, 115], Memory-based Learning (Sakkis et al., 2000b) [117], Support Vector Machines (Drucker et al., 1999) [36], Decision Trees (Carreras and Marquez, 2001) [19], Maximum Entropy Model (Berger et al., 1996) [14], Hash Forest and a rule encoding method (T. Xia, 2020) [153], sometimes combining different learners (Sakkis et al., 2001) [116]. Using these approaches is better because the classifier is learned from training data rather than built by hand.…”
Section: NLP: Then and Now (mentioning)
confidence: 99%
“…However, throughput is a challenging issue for RBSs, and the time complexity of their filtering algorithms could not be reduced to an acceptable level. To address the throughput issue, a constant time complexity spam detection algorithm was developed by Xia [20].…”
Section: Rule-based Filtering Technologies (mentioning)
confidence: 99%