Many military functions, such as intelligence collection or lessons-learned analysis, demand an understanding of situations derived from large quantities of written material. This paper describes approaches to gaining greater understanding of document content by applying rule-based approaches in addition to open-source machine learning models. The performance of two approaches to sentiment analysis is assessed on document sets from NATO sources. This combination enables analysts to identify items of interest within large document sets more effectively by indicating the sentiment around specific aspects (nouns) that refer to a specific target (noun) in the text. Data science can thus give users a more detailed understanding of the content of large quantities of documents with respect to a particular target or subject.
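The rule-based side of such aspect-level sentiment analysis can be illustrated with a minimal sketch. The abstract does not describe the actual rules or models used, so everything here is an assumption made for illustration: the toy `LEXICON`, the function name `aspect_sentiment`, and the fixed token window are all hypothetical.

```python
import re

# Hypothetical toy lexicon (an assumption for illustration only);
# a real system would rely on a curated sentiment resource.
LEXICON = {
    "reliable": 1, "effective": 1, "good": 1,
    "slow": -1, "poor": -1, "delayed": -1,
}


def aspect_sentiment(sentence, aspects, window=2):
    """Sum lexicon scores of the words within `window` tokens of each aspect."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    scores = {}
    for i, token in enumerate(tokens):
        if token in aspects:
            # Tokens from `window` positions before to `window` after the aspect.
            nearby = tokens[max(0, i - window): i + window + 1]
            local = sum(LEXICON.get(word, 0) for word in nearby)
            scores[token] = scores.get(token, 0) + local
    return scores


if __name__ == "__main__":
    text = "The radio was reliable but the logistics were slow."
    print(aspect_sentiment(text, {"radio", "logistics"}))
```

A window-based rule like this is deliberately crude: it ignores negation and syntax, which is one reason a rule-based layer might be combined with learned models.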
As with many new disciplines, data science is being embraced in a piecemeal way in many organisations, with many parts of an organisation creating special-purpose environments designed to answer specific problems, fragmenting the overall capacity and knowledge base. These isolated solutions are often fuelled by vendors selling proprietary approaches, potentially creating lock-in. This article's main contribution is a 'Data Science as a Service (DSaaS)' model, in which the common elements required to conduct data science are abstracted and gathered into a logical, layered, service-based architecture. This way, each element of the organisation can utilise the services it needs to progress its work, use specific solutions or share common tool sets, share results in a 'model zoo', share data sets, share best practices and benefit from common, robust, high-performance infrastructure and tools. With such an approach, it is possible to cluster data science skill sets and provide critical mass where needed. The proposed approach also facilitates a charge-back business model, in which data science services are costed and charged to internal organisational elements or external customers in a measured, pay-as-you-go way.
In this paper we address the approaches, techniques and results of applying machine learning to cyber threat prediction. Timely discovery of advanced persistent threats is of utmost importance for the protection of the networks of NATO and its allies. Therefore, NATO and the NATO Communications and Information Agency's Cyber Security service line are constantly looking for improvements. During the Coalition Warrior Interoperability Exercise (CWIX), event data was captured from a Red-Blue Team simulation. A variety of machine learning techniques were then applied to the data set: deep learning, auto-encoding and clustering with outliers.
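As a rough illustration of what "clustering with outliers" can mean, the sketch below flags noise points using the density criterion from DBSCAN. The abstract does not say which algorithm was used, so this is an assumed stand-in, not the authors' method; the function name `find_noise`, the parameters `eps` and `min_pts`, and the sample points are all hypothetical.

```python
import math


def find_noise(points, eps, min_pts):
    """Return indices of points that are noise under DBSCAN's density rule.

    A point is a core point if at least `min_pts` points (itself included)
    lie within distance `eps`. A point is noise if it is neither a core
    point nor within `eps` of any core point.
    """
    # Neighbour lists include the point itself, as in the original DBSCAN.
    neighbours = [
        [j for j, q in enumerate(points) if math.dist(p, q) <= eps]
        for p in points
    ]
    core = {i for i, near in enumerate(neighbours) if len(near) >= min_pts}
    return [
        i for i, near in enumerate(neighbours)
        if i not in core and not any(j in core for j in near)
    ]


if __name__ == "__main__":
    # Hypothetical 2-D feature vectors: four dense events and one isolated one.
    events = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    print(find_noise(events, eps=1.5, min_pts=3))
```

In a threat-detection setting, the isolated points are the candidates worth an analyst's attention; the choice of `eps` and `min_pts` determines how aggressive that flagging is.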
This paper describes the initial exploitation of Natural Language Processing (NLP) techniques applied to a specific set of related NATO documents. In particular, a text similarity technique was applied to document sets with the aim of capturing the relationships between documents, or sections of documents, from semantic and syntactic perspectives. Thesaurus and triple extraction techniques allowed sentences to be understood beyond their syntactic structure, thus improving the accuracy in capturing similar content across documents with diverse syntactic structures. The objective is to assess whether NLP tools can retrieve relationships and gaps between such kinds of textual data. This work improves interoperability in NATO by enhancing the development and application of policies, directives and other documents, which dictate how Consultation, Command and Control (C3) systems across the Alliance interoperate and support NATO's operational needs.
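A common baseline for text similarity is the cosine similarity between word-count vectors, sketched below. The abstract does not state which similarity measure was used, so this is an assumption for illustration; the function names `tokenize` and `cosine_similarity` and the sample sentences are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    # Only words present in both texts contribute to the dot product.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[word] * counts_b[word] for word in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty text is treated as similar to nothing.
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    a = "Systems shall exchange messages using the agreed standard."
    b = "Messages are exchanged between systems using the agreed standard."
    print(round(cosine_similarity(a, b), 3))
```

A purely lexical measure like this misses paraphrases that share no vocabulary, which is the gap that the thesaurus and triple extraction techniques mentioned above are meant to address.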