2022
DOI: 10.24251/hicss.2022.280

TSM: Measuring the Enticement of Honeyfiles with Natural Language Processing

Abstract: Honeyfile deployment is a useful breach detection method in cyber deception that can also inform defenders about the intent and interests of intruders and malicious insiders. A key property of a honeyfile, enticement, is the extent to which the file can attract an intruder to interact with it. We introduce a novel metric, Topic Semantic Matching (TSM), which uses topic modelling to represent files in the repository and semantic matching in an embedding vector space to compare honeyfile text and topic words rob…
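
The matching step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract is truncated and does not give the exact formula, so the aggregation used here (best cosine match per topic word, then the mean) is an assumption. The topic words are taken as given rather than produced by a topic model, and the names `EMBEDDINGS`, `cosine` and `matching_score`, as well as the toy 3-dimensional vectors, are hypothetical.

```python
import math

# Toy 3-d word vectors, invented for illustration only (assumption:
# a real system would load pretrained word embeddings instead).
EMBEDDINGS = {
    "invoice": (0.9, 0.1, 0.0),
    "payment": (0.8, 0.2, 0.1),
    "salary": (0.7, 0.3, 0.0),
    "server": (0.1, 0.9, 0.2),
    "password": (0.0, 0.8, 0.3),
    "holiday": (0.1, 0.1, 0.9),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def matching_score(topic_words, file_words, embeddings=EMBEDDINGS):
    """Hypothetical enticement-style score.

    For each topic word, take its best cosine match among the words of
    the candidate honeyfile, then average over topic words. Words with
    no embedding are skipped; an empty comparison scores 0.0.
    """
    topic_vecs = [embeddings[w] for w in topic_words if w in embeddings]
    file_vecs = [embeddings[w] for w in file_words if w in embeddings]
    if not topic_vecs or not file_vecs:
        return 0.0
    best = [max(cosine(t, f) for f in file_vecs) for t in topic_vecs]
    return sum(best) / len(best)


# Assumed repository topic words (in TSM these would come from topic modelling).
topics = ["invoice", "payment"]
on_topic = matching_score(topics, ["salary", "payment"])
off_topic = matching_score(topics, ["holiday"])
```

With these toy vectors, a honeyfile that paraphrases the repository's topics ("salary" for "invoice") still scores well above an off-topic one, which is the behaviour a plain common-word count would miss.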

Cited by 3 publications (2 citation statements)
References 33 publications (53 reference statements)

“…Earlier work on enticement [24] measured common word counts as a basis for a metric. Since this approach does not account for paraphrasing, we have introduced a measure called the Topic Semantic Matching (TSM) enticement score [41]. TSM extracts the main topics of a repository, also known as local context, using topic modelling.…”
Section: A Creating Documents
mentioning
confidence: 99%
“…Development of the SCHEMADB dataset - a collection of relational schema in both MySQL and heterogeneous graph form [60]. • Development of a honeyfile corpus for use in experiments on measuring the enticement of honeyfiles [41].…”
mentioning
confidence: 99%