2020
DOI: 10.3390/fi12010012

Mitigating Webshell Attacks through Machine Learning Techniques

Abstract: A webshell is a command execution environment in the form of web pages. It is often used by attackers as a backdoor tool for web server operations. Accurately detecting webshells is of great significance to web server protection. Most security products detect webshells based on feature-matching methods, matching input scripts against pre-built malicious code collections. The feature-matching method has a low detection rate for obfuscated webshells. However, with the help of machine learning algorithms, webshell…

Citations: cited by 25 publications (14 citation statements)
References: 11 publications
“…Crawling session must not consist any offensive request while scanning session must consist at least one offensive request. Adjustments of number of requests and time gap is based on the gathered data. We are aware that one IP address can be shared between many clients (networks behind a NAT or many applications working parallel or in a chain) so one address is not always corresponding to one client.…”

Table fragment spilled into the excerpt (scanner label [reference]: matched keywords):
[label truncated] [23]: .git
scanner_env_file [24]: .env
scanner_nmap [25]: nmaplowercheck
scanner_voip_yealink [26]: y000000000000.cfg, /prov
scanner_voip_asterisk: /servlet
scanner_ncsi [27]: ncsi.txt
scanner_sntp [28]: /html/sntp.html
scanner_horde [29]: /imp/test.php
scanner_weblogic_oracle [30]: bea_wls_deployment_internal
scanner_pma [31]: phpmyadmin, pma, phpma
scanner_wp [32]: wp-, xmlrpc, plugins, wordpress, /wp/
scanner_drupal: drupal
scanner_cgi [33]: cgi-bin, cgi
scanner_mysql: mysql
scanner_sqlite: sqlite
scanner_jboss [34]: .jsp
scanner_sql: sql
scanner_hnap [35]: hnap1
scanner_webdav: webdav
scanner_login: login, admin
scanner_webshell [36]: .php

Section: The Methodology Of Data Analysis
type: mentioning
confidence: 99%
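
The session rule in the excerpt above (a crawling session has no offensive request, a scanning session has at least one) can be sketched roughly as follows. This is only an illustration: the keyword table is a small subset of the labels quoted above, the function names are hypothetical, and plain case-insensitive substring matching on the request path is an assumption, since the excerpt does not say how keywords are matched.

```python
# Subset of the scanner labels and keywords quoted in the excerpt.
# Substring matching on the request path is an assumption of this sketch.
SCANNER_KEYWORDS = {
    "scanner_env_file": [".env"],
    "scanner_pma": ["phpmyadmin", "pma", "phpma"],
    "scanner_wp": ["wp-", "xmlrpc", "plugins", "wordpress", "/wp/"],
    "scanner_webshell": [".php"],
}


def offensive_labels(path):
    """Return the sorted scanner labels whose keywords occur in the path."""
    lowered = path.lower()
    return sorted(
        label
        for label, keywords in SCANNER_KEYWORDS.items()
        if any(keyword in lowered for keyword in keywords)
    )


def classify_session(paths):
    """Label a session 'scanning' if any request is offensive, else 'crawling'."""
    if any(offensive_labels(path) for path in paths):
        return "scanning"
    return "crawling"
```

A session is passed in as a list of request paths, for example `classify_session(["/index.html", "/wp-login.php"])`. Grouping requests into sessions by request count and time gap, which the excerpt also mentions, is left out here.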
“…At the same time, the feature dimension used is higher, and the internal correlation of similar features is also greater. Therefore, compared with the Naive Bayes used by Guo et al [38], Random Forest is more suitable for the sample scenario in this paper. Simultaneously, the bi-gram only expresses the relationship between the adjacent opcodes, and the opcode to express a complete sentence of Python language needs five or more, so the n = 5 used in this paper can better represent the semantic information of the text.…”
Section: Comparative Experiment
type: mentioning
confidence: 95%
“…Unlike this paper, TF-IDF represents the frequency of a single character. Guo et al [38] recognized webshell attacks through opcode and also used TF-IDF to represent text. However, the bi-gram was used to divide characters, and the final classifier chose Naive Bayes (the method of the paper below is represented by the author's last name).…”
Section: Comparative Experiment
type: mentioning
confidence: 99%
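
The two statements above contrast bi-grams with 5-grams over opcode sequences, both weighted with TF-IDF. A minimal sketch of that feature step is given below, using only the standard library. The opcode tokens in the usage note are made up, the function names are hypothetical, and the weighting (term frequency times log(N / df)) is one common TF-IDF variant, not necessarily the one used in either paper. The classifier stage (Naive Bayes or Random Forest) is not shown.

```python
import math
from collections import Counter


def ngrams(opcodes, n):
    """Return the contiguous n-grams of an opcode sequence as tuples."""
    return [tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]


def tfidf(documents, n):
    """Compute one {n-gram: weight} dict per document.

    tf  = count of the n-gram / number of n-grams in the document
    idf = log(number of documents / number of documents containing it)
    """
    grams_per_doc = [Counter(ngrams(doc, n)) for doc in documents]

    # Document frequency: in how many documents each n-gram appears.
    doc_freq = Counter()
    for counts in grams_per_doc:
        doc_freq.update(counts.keys())

    total_docs = len(documents)
    weights = []
    for counts in grams_per_doc:
        total = sum(counts.values())
        weights.append({
            gram: (count / total) * math.log(total_docs / doc_freq[gram])
            for gram, count in counts.items()
        })
    return weights
```

For instance, `tfidf([["LOAD", "CALL", "RETURN"], ["LOAD", "CALL", "POP"]], 2)` gives the shared bi-gram `("LOAD", "CALL")` a weight of zero, because it occurs in every document, while the bi-grams unique to one document get positive weights. Passing `n=5` instead yields the longer windows that the first statement argues carry more context; sequences shorter than `n` produce no features.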
“…In addition, due to the constant evolution and iteration of code obfuscation and code encryption techniques, webshells can easily bypass regular methods, which are based on regular expressions. Moreover, the static feature detection method has no way to conduct interprocedural analysis, that is to detect the included files and user-defined dangerous function, so the detection method is based on the feature code and syntax analysis, and the dangerous function [16] name matching can be easily bypassed.…”
Section: Introduction
type: mentioning
confidence: 99%
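
The bypass described in the statement above can be illustrated with a toy signature check. The regular expression and the two PHP snippets below are hypothetical examples written for this sketch, not signatures from any real product; the point is only that a pattern keyed on a dangerous function name stops matching once the name is assembled at run time.

```python
import re

# Hypothetical signature: a call to one of a few dangerous PHP functions.
DANGEROUS_CALL = re.compile(
    r"\b(eval|assert|system|passthru|shell_exec)\s*\(",
    re.IGNORECASE,
)


def matches_signature(source):
    """Return True if the source contains a literal dangerous function call."""
    return DANGEROUS_CALL.search(source) is not None


# The function name appears literally, so the signature matches.
plain = "<?php system($_GET['c']); ?>"

# The same call with the name built by string concatenation: the literal
# text "system(" never appears, so the name-matching signature misses it.
obfuscated = "<?php $f = 'sys' . 'tem'; $f($_GET['c']); ?>"
```

Here `matches_signature(plain)` is true and `matches_signature(obfuscated)` is false, even though both snippets run the same command.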