2016
DOI: 10.1007/978-3-319-40667-1_8

On the Lack of Consensus in Anti-Virus Decisions: Metrics and Insights on Building Ground Truths of Android Malware

Abstract: There is generally a lack of consensus in Antivirus (AV) engines' decisions on a given sample. This challenges the building of authoritative ground-truth datasets. Instead, researchers and practitioners may rely on unvalidated approaches to build their ground truth, e.g., by considering decisions from a selected set of Antivirus vendors or by setting up a threshold number of positive detections before classifying a sample. Both approaches are biased as they implicitly either decide on ranking AV produ…

Citations: cited by 37 publications (41 citation statements)
References: 22 publications
“…We further perform a correlation study on these malware to assess whether the number of vulnerabilities in an apk can be correlated to the number of AVs that flag it. This is important since AVs are known to lack consensus among themselves [82], [83]. For every vulnerability type we found that the Spearman's ρ was below 0.30, implying negligible correlation.…”
Section: Vulnerability and Malware (mentioning)
confidence: 83%
“…Hurier et al [23] define a set of metrics to characterize the discrepancy between malware detectors while treating the voting result as the ground truth, namely the problem (i) mentioned above. In contrast, we differentiate the voting result from the ground truth, and aim to quantify the distance between the voting result and the unknown ground truth.…”
Section: Related Work (mentioning)
confidence: 99%
“…Second, the evaluation of systematic approaches is sensitive to some underlying assumptions which themselves are not grounded. For example, the choice of detection threshold and Anti-Virus engines can largely impact the characteristics of reference datasets [24]. Several flaws can also artificially improve the performance of detectors [2] or mislead the authors about the quality of their output [27].…”
Section: Related Work (mentioning)
confidence: 99%
“…Inconsistencies in Anti-Virus (AV) labels are indeed common. This is due to both naming disagreements [24], [25] across vendors and also a lack of adopted standards for naming malware.…”
Section: Introduction (mentioning)
confidence: 99%