2018 IEEE 20th International Conference on High Performance Computing and Communications; IEEE 16th International Conference on Smart City; IEEE 4th International Conference on Data Science and Systems (HPCC/SmartCity/DSS)
DOI: 10.1109/hpcc/smartcity/dss.2018.00252
Online Web Bot Detection Using a Sequential Classification Approach

Cited by 16 publications (13 citation statements, published 2019–2023). References 20 publications.
“…Web bot detection aims to accurately distinguish whether a web visitor is a bot or a human. This categorisation may entail simply distinguishing web bots from human visitors [3,4,10,12,18,34] or further categorising web bots based on their functionality [17], purpose [6,32,43], or complexity (i.e., simple vs. sophisticated) and based on whether they try to evade detection or not [20].…”
Section: Background and Related Work
confidence: 99%
“…The web bot detection approaches that are based on web logs rely primarily on several "traditional" machine learning algorithms, such as Support Vector Machines [20,29,37], Random Forests [20,34], Adaboost [20,34], and Multi-layer Perceptron classifiers [10,20,29,37]. Initially, the sessions of each visitor of the web server are extracted from the web logs [7,20,26,30,33,34].…”
Section: Background and Related Work
confidence: 99%
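The excerpt above names the "traditional" classifiers used in log-based web bot detection. Below is a minimal sketch of that setup using scikit-learn; the per-session feature table, the feature names, and the file layout are illustrative assumptions, not details taken from the paper.

```python
# Sketch: training the classifiers named in the excerpt (SVM, Random Forest,
# AdaBoost, MLP) on per-session features extracted from web logs.
# The sessions.csv layout and feature names below are assumed for illustration.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Hypothetical per-session table: one row per visitor session,
# with a binary label (1 = bot, 0 = human) taken from the ground truth.
sessions_df = pd.read_csv("sessions.csv")
feature_cols = ["num_requests", "error_ratio",
                "mean_inter_request_time", "pct_image_requests"]
X = sessions_df[feature_cols].values
y = sessions_df["is_bot"].values

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

classifiers = {
    "SVM": SVC(kernel="rbf"),
    "Random Forest": RandomForestClassifier(n_estimators=100),
    "AdaBoost": AdaBoostClassifier(n_estimators=100),
    "MLP": MLPClassifier(hidden_layer_sizes=(64,), max_iter=500),
}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    print(f"{name}: test accuracy = {clf.score(X_test, y_test):.3f}")
```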
“…In both cases, a ground truth of human and web bot sessions is required. In most recent research, the annotation process relies on comparing each visitor's agent name [26] and IP address [5,7,13,22,25] with the agent names and IPs of known web bots according to lists hosted on external servers. Such lists mostly contain identifiers for bots that are benign in nature, such as search engine bots, although some malicious bots can be found there as well.…”
Section: Background and Related Work
confidence: 99%
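A minimal sketch of the annotation step described above: a session is labelled as a bot if its user-agent string or source IP matches an entry in externally maintained known-bot lists. The list file names, their formats, and the session structure are assumptions made for illustration.

```python
# Sketch: ground-truth labelling by matching agent names and IPs
# against known-bot lists (file names and formats are assumed).
import ipaddress

def load_bot_lists(ua_path="known_bot_user_agents.txt",
                   ip_path="known_bot_ip_ranges.txt"):
    """Load known-bot user-agent substrings and IP ranges (assumed formats)."""
    with open(ua_path) as f:
        ua_patterns = [line.strip().lower() for line in f if line.strip()]
    with open(ip_path) as f:
        ip_ranges = [ipaddress.ip_network(line.strip()) for line in f if line.strip()]
    return ua_patterns, ip_ranges

def label_session(session, ua_patterns, ip_ranges):
    """Return True (bot) if the session's agent name or IP matches a known bot."""
    agent = session["user_agent"].lower()
    if any(pattern in agent for pattern in ua_patterns):
        return True
    ip = ipaddress.ip_address(session["ip"])
    return any(ip in net for net in ip_ranges)

if __name__ == "__main__":
    ua_patterns, ip_ranges = load_bot_lists()
    session = {"user_agent": "Googlebot/2.1 (+http://www.google.com/bot.html)",
               "ip": "66.249.66.1"}
    print(label_session(session, ua_patterns, ip_ranges))
```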
“…To evaluate the framework, we split the dataset into two sets, one for training and one for testing. By default, our web server splits the HTTP log data into files based on a log rotation technique⁷. In total, 13 files were generated over one year.…”
Section: Dataset
confidence: 99%
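The excerpt above describes splitting rotated HTTP log files (13 over one year in the cited work) into training and testing sets. A minimal sketch of such a chronological split follows; the directory layout, file naming, and the 70/30 ratio are illustrative assumptions.

```python
# Sketch: chronological train/test split over rotated access-log files.
# Directory name, file-name pattern, and split ratio are assumed.
from pathlib import Path

def split_rotated_logs(log_dir="httpd_logs", train_fraction=0.7):
    """Order rotated log files by name (time order) and split into train/test."""
    files = sorted(Path(log_dir).glob("access.log.*"))
    cut = int(len(files) * train_fraction)
    return files[:cut], files[cut:]

if __name__ == "__main__":
    train_files, test_files = split_rotated_logs()
    print(f"training on {len(train_files)} files, testing on {len(test_files)}")
```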