2020 19th RoEduNet Conference: Networking in Education and Research (RoEduNet) 2020
DOI: 10.1109/roedunet51892.2020.9324852

Early Detection of Vulnerabilities from News Websites using Machine Learning Models

Cited by 9 publications (3 citation statements)
References 15 publications
“…Our aggregated corpus for training and evaluating our models consists of 1600 news articles from two collections of labeled articles that were obtained using two different approaches: Iorga et al (2020) introduce a corpus of 1000 news articles on cybersecurity manually labeled by experts from news outlets, and Iorga et al (2021) consider 600 more articles that were extracted from selected Tweeter accounts. The distribution in terms of length is displayed in Figure 1; as it can be observed, more than half of the articles exceed the usual length of 512 tokens acceptable by most pretrained language models.…”
Section: Corpora (mentioning)
confidence: 99%
“…Supervised classification has been used on tweets related to IT assets [4], tweets containing at least one security keyword from certain user accounts [2], tweets from cyber experts with n-grams extracted [13], and social media threats from multiple platforms [10]. [8].…”
Section: Related Work (mentioning)
confidence: 99%
“…considers using ML in predicting ever, the context of our work is not specific to SMEs, hence we focus the broader healthcare system with CVE database. There is another work[45] that illustrates using social media, news articles and open-source data to predict vulnerabilities in cybersecurity, using two ML models: Vector Machines and fine-tuned BERT. The result indicates that the model BERT performs better than Vector Machine.…”
mentioning
confidence: 99%