Proceedings of the 12th International Workshop on Semantic Evaluation 2018
DOI: 10.18653/v1/s18-1142

UMBC at SemEval-2018 Task 8: Understanding Text about Malware

Abstract: We describe the systems developed by the UMBC team for 2018 SemEval Task 8, SecureNLP (Semantic Extraction from CybersecUrity REports using Natural Language Processing). We participated in three of the subtasks: (1) classifying sentences as being relevant or irrelevant to malware, (2) predicting token labels for sentences, and (4) predicting attribute labels from the Malware Attribute Enumeration and Characterization vocabulary for defining malware characteristics. We achieved F1 scores of 50.34/18.0 (dev/te…

Cited by 9 publications (5 citation statements)
References 10 publications
“…The input of our dataset was in a text format with positive and negative labels. Therefore, we chose four baselines that are capable of handling text dataset: NBC [ 33 ], SVM [ 34 ], MLP [ 35 ], and CNN n-gram [ 36 ]. NBC and SVM are considered as traditional machine learning algorithms.…”
Section: Methods (mentioning)
confidence: 99%
“…The vocabulary size of the Domain-word2vec embedding was 28,283 words. The Cyber-Word2vec embeddings were produced by Padia et al (2018). It has 100 dimension which trained on a large corpus of approximately one million cybersecurity-related webpages.…”
Section: Event Nugget and Event Argument Detection (mentioning)
confidence: 99%
“…TeamDL [15] built a convolutional neural network with original glove embeddings. UMBC [16] used a Multilayer Perceptron model for the submission of SubTask 1. Inspired from the tasks, Ravikiran [17] has proposed a multimodal dataset with QR-codes and Malware Text classification.…”
Section: Related Work (mentioning)
confidence: 99%