Event Detection Based on Nonnegative Matrix Factorization: Ceasefire Violation, Environmental, and Malware Events

Drake, Barry; Huang, Tianhe; Beavers, Ashley; Du, Rundong; Park, Haesun

doi:10.1007/978-3-319-60585-2_16

Cited by 3 publications

(3 citation statements)

References 9 publications

“…In this section, we explain nonnegative matrix factorization (NMF) [26] in the topic modeling context [9,10] and our targeted topic modeling algorithm based on NMF with additional constraints.…”

Section: Targeted Topic Modeling (mentioning)

confidence: 99%

“…As the world becomes increasingly digital and huge amounts of text data are generated every minute, it becomes more challenging to discover useful information from them for applications such as situational awareness, patient phenotype discovery, event detection [9], or the onset of violence within a diverse population. More often than not, topics of interests are only implicitly covered in vast amounts of text data and the relevant data items are sparse and not immediately obvious.…”

Section: Introduction (mentioning)

confidence: 99%

TopicSifter: Interactive Search Space Reduction through Targeted Topic Modeling

Kim, Choi, Drake, et al. 2019

2019 IEEE Conference on Visual Analytics Science and Technology (VAST)

Self Cite

Figure 1: The TopicSifter system has (a) the control panel, (b) the main view, and (c) the detail panel. The keyword module (d) in the control panel (a) shows the current set of good-to-have keywords, bad-to-have keywords, and stopwords and allows users to modify them. The system recommends additional keywords based on the current set of keywords. The main view (b) shows the sifting status bar (e), which reports how many documents have been retrieved from the total dataset, and the topical overview (f) of the currently retrieved documents. Users can give positive or negative feedback on topics and documents to indicate relevancy. The detail panel (c) has two tab menus for showing document details and sifting history.

Abstract: Topic modeling is commonly used to analyze and understand large document collections. However, in practice, users want to focus on specific aspects or "targets" rather than the entire corpus. For example, given a large collection of documents, users may want only a smaller subset which more closely aligns with their interests, tasks, and domains. In particular, our paper focuses on large-scale document retrieval with high recall where any missed relevant documents can be critical. A simple keyword matching search is generally neither effective nor efficient, as 1) it is difficult to find a list of keyword queries that can cover the documents of interest before exploring the dataset, 2) some documents may not contain the exact keywords of interest but may still be highly relevant, and 3) some words have multiple meanings, which would result in irrelevant documents being included in the retrieved subset. In this paper, we present TopicSifter, a visual analytics system for interactive search space reduction. Our system utilizes targeted topic modeling based on nonnegative matrix factorization and allows users to give relevance feedback in order to refine their target and guide the topic modeling to the most relevant results.
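As a rough illustration of topic modeling based on nonnegative matrix factorization, the sketch below factors a small term-document matrix A into nonnegative factors W (terms by topics) and H (topics by documents) using the standard multiplicative update rules. It is plain NMF only: the targeted constraints and relevance feedback described in the abstract are not modeled, and the vocabulary, the matrix, and the helper names `nmf_multiplicative` and `top_terms` are made up for this example.

```python
import numpy as np

def nmf_multiplicative(A, k, n_iter=300, seed=0, eps=1e-10):
    """Approximate a nonnegative matrix A (terms x docs) as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, k)) + eps  # term-topic weights
    H = rng.random((k, n)) + eps  # topic-document weights
    for _ in range(n_iter):
        # Multiplicative updates keep every entry nonnegative.
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H

def top_terms(W, vocab, n_top=3):
    """Return the n_top highest-weighted terms for each topic (column of W)."""
    return [[vocab[i] for i in np.argsort(-W[:, j])[:n_top]]
            for j in range(W.shape[1])]

# Hypothetical toy data: 6 terms x 4 documents (raw term counts).
vocab = ["ceasefire", "violation", "shelling", "malware", "trojan", "exploit"]
A = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 2, 1],
    [0, 0, 1, 2],
    [0, 0, 1, 1],
], dtype=float)

W, H = nmf_multiplicative(A, k=2)
for j, terms in enumerate(top_terms(W, vocab)):
    print(f"topic {j}: {terms}")
```

Each column of W can be read as a topic (a weighting over terms), and each column of H gives a document's weights over the topics. A document can therefore score highly on a topic even when it does not contain one specific query keyword.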

“…NMF problem was first proposed in [39] as positive matrix factorization and popularized due to [32]. By now it has become a powerful tool for data dimensionality reduction and has found important applications in many fields such as clustering [10,29,30,36,14,18], data mining [41,50,13], signal processing [5], computer vision [2,21,17], bioinformatics [4,11,23], blind source separation [9], spectral data analysis [40], and many others. NMF problem (1.1) has been studied extensively and many numerical methods are currently available.…”

mentioning

confidence: 99%

An Alternating Rank-K Nonnegative Least Squares Framework (ARkNLS) for Nonnegative Matrix Factorization

Chu, Shi, Eswar, et al. 2020

Preprint

Self Cite

Nonnegative matrix factorization (NMF) is a prominent technique for data dimensionality reduction that has been widely used for text mining, computer vision, pattern discovery, and bioinformatics. In this paper, a framework called ARkNLS (Alternating Rank-k Nonnegativity-constrained Least Squares) is proposed for computing NMF. First, a recursive formula for the solution of the rank-k nonnegativity-constrained least squares (NLS) problem is established. This recursive formula can be used to derive the closed-form solution of the rank-k NLS problem for any integer k ≥ 1. As a result, each subproblem in an alternating rank-k nonnegative least squares framework can be solved using this closed-form solution. Assuming that all matrices involved in rank-k NLS in the context of NMF computation are of full rank, two of the currently best NMF algorithms, HALS (hierarchical alternating least squares) and ANLS-BPP (Alternating NLS based on Block Principal Pivoting), can be considered special cases of ARkNLS with k = 1 and k = r for rank-r NMF, respectively. This paper then focuses on the framework with k = 3, which leads to a new algorithm for NMF via the closed-form solution of the rank-3 NLS problem. Furthermore, a new strategy that efficiently overcomes the potential singularity problem in rank-3 NLS within the context of NMF computation is also presented. Extensive numerical comparisons using real and synthetic data sets demonstrate that the proposed algorithm provides state-of-the-art performance in terms of computational accuracy and CPU time.
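The k = 1 case mentioned in the abstract can be sketched in a few lines. The code below is a minimal HALS-style iteration, not the paper's rank-3 algorithm: each row of H and each column of W is updated in turn using the closed-form solution of a rank-1 nonnegative least squares problem, which is an unconstrained least squares step followed by clipping at zero. The function name `hals_nmf`, its parameters, and the small `eps` guard against a zero denominator are assumptions made for this example.

```python
import numpy as np

def hals_nmf(A, r, n_iter=200, seed=0, eps=1e-12):
    """HALS-style NMF sketch: A (m x n, nonnegative) ~= W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, r))
    H = rng.random((r, n))
    for _ in range(n_iter):
        # Update rows of H one at a time with W fixed.
        WtA = W.T @ A
        WtW = W.T @ W
        for j in range(r):
            step = (WtA[j, :] - WtW[j, :] @ H) / max(WtW[j, j], eps)
            H[j, :] = np.maximum(0.0, H[j, :] + step)  # rank-1 NLS closed form
        # Update columns of W one at a time with H fixed.
        AHt = A @ H.T
        HHt = H @ H.T
        for j in range(r):
            step = (AHt[:, j] - W @ HHt[:, j]) / max(HHt[j, j], eps)
            W[:, j] = np.maximum(0.0, W[:, j] + step)
    return W, H

# Usage on a synthetic nonnegative matrix of rank at most 2.
rng = np.random.default_rng(1)
A_demo = rng.random((6, 2)) @ rng.random((2, 5))
W_demo, H_demo = hals_nmf(A_demo, r=2)
print("relative error:", np.linalg.norm(A_demo - W_demo @ H_demo) / np.linalg.norm(A_demo))
```

Because each rank-1 update exactly minimizes the objective over that row or column with everything else held fixed, the residual norm does not increase from one sweep to the next.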