This is a systematic review of over one hundred research papers on machine learning methods applied to defensive and offensive cybersecurity. In contrast to previous reviews, which each covered only a fragment of the research topics in this area, this paper systematically and comprehensively combines domain knowledge into a single review. Ultimately, this paper seeks to provide a foundation for researchers who wish to delve into the field of machine learning for cybersecurity. Our findings identify the frequently used methods within supervised, unsupervised, and semi-supervised machine learning; the most useful data sets for evaluating intrusion detection methods within supervised learning; and machine learning methods that have shown promise in tackling various threats in defensive and offensive cybersecurity.
The term “frequently asked questions” (FAQ) refers to queries that are asked repeatedly and are answered with manually constructed responses. FAQ handling is one of the most important factors influencing customer repurchase and brand loyalty; thus, most industry domains invest heavily in it. This has led to the study of deep-learning-based retrieval models. However, training a model and creating a database specialized for each industry domain comes at a high cost, especially when using a chatbot-based conversation system, as a large amount of resources must be continuously invested in maintaining the FAQ system. It is also difficult for small- and medium-sized companies and national institutions to build individualized training data and databases and obtain satisfactory results. We therefore propose a method, built on a deep learning information retrieval module, that returns responses to customer inquiries using only data that can be easily obtained from companies. In this work, we hybridize dense embedding and sparse embedding to make the system more robust to specialized terminology, and we propose new functions to adjust the weight ratio and scale the results returned by the two modules.
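The idea of weighting and scaling the results of a dense module and a sparse module can be sketched as follows. This is a minimal illustration under stated assumptions, not the functions proposed in the paper: min-max scaling and a single weight `alpha` are assumed choices, and `min_max_scale`, `hybrid_scores`, and the example scores are hypothetical.

```python
from typing import Dict


def min_max_scale(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale raw scores to [0, 1] so the two modules are comparable.

    Assumption: min-max scaling; the paper's own scaling function may differ.
    """
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: no ranking information, map everything to 0.0.
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_scores(
    dense: Dict[str, float], sparse: Dict[str, float], alpha: float = 0.5
) -> Dict[str, float]:
    """Combine scaled dense and sparse scores with weight ratio alpha.

    alpha = 1.0 uses only the dense module, alpha = 0.0 only the sparse one.
    A document missing from one module contributes 0.0 for that module.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    d, s = min_max_scale(dense), min_max_scale(sparse)
    return {
        doc: alpha * d.get(doc, 0.0) + (1.0 - alpha) * s.get(doc, 0.0)
        for doc in set(d) | set(s)
    }


# Hypothetical scores: cosine similarities (dense) and BM25-like scores (sparse).
dense = {"faq_1": 0.82, "faq_2": 0.79, "faq_3": 0.40}
sparse = {"faq_1": 3.1, "faq_2": 12.4, "faq_3": 0.0}

combined = hybrid_scores(dense, sparse, alpha=0.5)
ranked = sorted(combined.items(), key=lambda kv: kv[1], reverse=True)
```

In this toy example the dense module slightly prefers `faq_1`, but the strong exact-term match captured by the sparse score moves `faq_2` to the top of the combined ranking, which is the kind of behavior a hybrid is meant to provide for specialized vocabulary.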
In this study, we qualitatively and quantitatively examine the effects of COVID-19 on classrooms, students, and educators. Using a new Twitter dataset specific to South Korea during the pandemic, we sample the sentiment and strain on students and educators using applied machine learning techniques in order to identify topical pain points that emerged during the pandemic. Our contributions include a novel, open-source, geo-fenced dataset on student and educator opinion within South Korea, which we are making available to other researchers. We also identify trends in sentiment and polarity over the pandemic timeline, as well as key drivers behind the sentiments. Moreover, we provide a comparative analysis of two widely used pre-trained sentiment analysis approaches, TextBlob and VADER, using statistical significance tests. Ultimately, we analyze how public opinion on the pandemic shifted from positive sentiments about accessing course materials, online support communities, access to classes, and creativity, to negative sentiments about mental fatigue, job loss, student concerns, and overwhelmed institutions. We also open a discussion of the concept of actionable sentiment analysis by overlapping polarity with the concept of trigger management to assist users in coping with negative emotions. We hope that insights from this preliminary study can promote further use of social media datasets to evaluate government messaging, population sentiment, and multi-dimensional analysis of pandemics.
Return on advertising spend (ROAS) refers to the ratio of the revenue generated by advertising projects to their expense. It is used to assess the effectiveness of advertising. Several simulation-based controlled experiments, such as geo experiments, have been proposed recently. A geo experiment calculates ROAS by dividing a geographic region into a control group and a treatment group and comparing the ROAS generated in each group. However, the data collected through these experiments can only be used to analyze previously constructed data, making it difficult to use in an inductive process that predicts future profits or costs. Furthermore, to obtain ROAS for each advertising group, data must be collected under a new experimental setting each time, which limits the reuse of previously collected data. Given these limitations, we present a method for predicting ROAS that does not require controlled experiments for data acquisition, and we validate its effectiveness through comparative experiments. Specifically, we propose a task decomposition method that divides the end-to-end prediction task into a two-stage process: occurrence prediction and regression of the ROAS that occurred. Through comparative experiments, we show that this approach can effectively deal with advertising data in which most labels are zero.
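The ROAS ratio and the two-stage split into occurrence prediction and regression can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's model: the function names, the 0.5 threshold, and the two toy stand-in models are hypothetical, and any trained classifier and regressor could take their place.

```python
from typing import Callable, Sequence

Features = Sequence[float]


def roas(revenue: float, ad_spend: float) -> float:
    """ROAS as defined in the text: revenue divided by advertising expense."""
    if ad_spend <= 0:
        raise ValueError("ad_spend must be positive")
    return revenue / ad_spend


def two_stage_predict(
    features: Features,
    occurrence_prob: Callable[[Features], float],
    roas_regressor: Callable[[Features], float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> float:
    """Stage 1 decides whether any ROAS occurs; stage 2 estimates its size.

    If stage 1 predicts no occurrence, the prediction is exactly zero, so the
    regressor only has to model the non-zero cases.
    """
    if occurrence_prob(features) < threshold:
        return 0.0
    return roas_regressor(features)


# Toy stand-ins for trained models; the single feature is a click count.
def toy_occurrence_prob(x: Features) -> float:
    return 0.9 if x[0] > 100 else 0.1


def toy_roas_regressor(x: Features) -> float:
    return x[0] / 50


low_traffic = two_stage_predict([50.0], toy_occurrence_prob, toy_roas_regressor)
high_traffic = two_stage_predict([200.0], toy_occurrence_prob, toy_roas_regressor)
```

Separating the stages means the regressor can be fit only on rows with non-zero ROAS, rather than being pulled toward zero by the many zero-label rows.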