2019 IEEE International Conference on Big Data (Big Data) 2019
DOI: 10.1109/bigdata47090.2019.9006444

Extracting Rich Semantic Information about Cybersecurity Events

Abstract: Articles about cybersecurity events like data breaches and ransomware attacks are common, both in general news and technical sources. Automatically extracting structured information from these can provide valuable information to inform both human analysts and computer systems. In this paper we describe how cybersecurity events can be described via semantic schemas, examined through an initial set of five event types. Using a collection of 1,000 news articles annotated with these event types, including their se…

Citations: cited by 7 publications (4 citation statements)
References: 25 publications
“…Additional details can be found in Satyapanich (2019b) and Satyapanich, Finin, and Ferraro (2019). We make our corpus, annotations and code publicly available (Satyapanich 2019a).…”
Section: Discussion (mentioning)
confidence: 99%
“…The groundtruth of training data for establishing the neural network model derives from 1000 articles that discuss the five types of events mentioned in the definition of event nugget. The 1000 articles were published by the project [29], each of which was labeled by three experienced computer scientists and a majority vote was adopted to decide the final annotation.…”
Section: Information Extraction (mentioning)
confidence: 99%
“…Three datasets are utilized in our study to establish the search engine prototype: 1000 security news articles [29] that mentioned five security events and annotated by experienced security experts; vulnerability archives collected from authoritative vulnerability database [39]; tweets from 2015 to 2020 that mentioned security keywords using a Python package called Twitterscraper [40]. During the implementation phase, the parameters adopted are listed as follows:…”
Section: Datasets and Settings (mentioning)
confidence: 99%
“…Finally, ED has been extensively studied in the literature (Liao and Grishman, 2010; Li et al, 2013; Grishman, 2015, 2016e; Chen et al, 2015; Nguyen et al, 2016g; Lu and Nguyen, 2018; Liu et al, 2016b, 2017; Hong et al, 2018; Lai et al, 2020b), partly due to the availability of the large evaluation datasets (i.e., the ACE and TAC KBP datasets (Walker et al, 2006; Mitamura et al, 2015) for the general domains, and the BioNLP datasets (Kim et al, 2009) for the biomedical domain). The closest works to our in the cybersecurity domain involve (Qiu et al, 2016) to extract events on Chinese news, (Khandpur et al, 2017) to perform cyberattack detection on Twitter, and (Satyapanich et al, 2019; Satyapanich et al, 2020) to present the CASIE dataset for event extraction. However, these datasets contain less event types and cannot support the document-level information for the models as CySecED.…”
Section: Related Work (mentioning)
confidence: 99%