Advances in data collection and storage capacity have made it increasingly possible to collect highly volatile graph data for analysis. Existing graph analysis techniques are not appropriate for such data, especially in cases where streaming or near-real-time results are required. An example that has drawn significant research interest is the cyber-security domain, where internet communication traces are collected and real-time discovery of events, behaviors, patterns and anomalies is desired. We propose Met-ricForensics, a scalable framework for analysis of volatile graphs. MetricForensics combines a multi-level "drill down" approach, a collection of user-selected graph metrics and a collection of analysis techniques. At each successive level, more sophisticated metrics are computed and the graph is viewed at a finer temporal resolution. In this way, MetricForensics scales to highly volatile graphs by only allocating resources for computationally expensive analysis when an interesting event is discovered at a coarser resolution first. We test MetricForensics on three real-world graphs: an enterprise IP trace, a trace of legitimate and malicious network traffic from a research institution, and the MIT Reality Mining proximity sensor data. Our largest graph has ∼3M vertices and ∼32M edges, spanning 4.5 days. The results demonstrate the scalability and capability of MetricForensics in analyzing volatile graphs; and highlight four novel phenomena in such graphs: elbows, broken correlations, prolonged spikes, and strange stars.
The 15 th ACM International Conference on Web Search and Data Mining (WSDM 2022) was successfully held on 2/21 - 2/25/2022 online. It was originally planned to be held in Tempe, Arizona and changed to a virtual format due to the surge of Omicron. It was a five-day event with the middle 3 days dedicated to keynote speeches, single-session presentations, poster sessions, and sponsor talks. This brief report provides an overview of WSDM 2022 with organization and program details and statistics from Conference Chairs and Program Committee Chairs, as well as a message from the WSDM Steering Committee Chair to call for future WSDM conferences in the US (2024) and Europe (2025). Further details can be found at https://www.wsdm-conference.org/2022/. At the end of this report, we express our deep gratitude to authors, volunteers, special interest groups (SIGs), steering committee, and sponsors.
No abstract
Pretrained Large Language Models (LLMs) have gained significant attention for addressing open-domain Question Answering (QA). While they exhibit high accuracy in answering questions related to common knowledge, LLMs encounter difficulties in learning about uncommon long-tail knowledge (tail entities). Since manually constructing QA datasets demands substantial human resources, the types of existing QA datasets are limited, leaving us with a scarcity of datasets to study the performance of LLMs on tail entities. In this paper, we propose an automatic approach to generate specialized QA datasets for tail entities and present the associated research challenges. We conduct extensive experiments by employing pretrained LLMs on our newly generated long-tail QA datasets, comparing their performance with and without external resources including Wikipedia and Wikidata knowledge graphs.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.