The widespread adoption of location-aware devices is generating large amounts of spatiotemporal movement data, collected and stored in digital repositories. This forms fertile ground for domain experts and scientists to analyze such historical data and discover interesting movement behavioral patterns. Experts in many domains, such as transportation, logistics, and retail, are interested in detecting and understanding the movement patterns and behavior of objects in relation to each other. Their insights can point to optimization potential and reveal deviations from planned behavior. In this paper, we focus on the detection of encounter patterns, one type of movement behavior in which objects are close to one another in both space and time. Scalability is a core requirement when dealing with historical movement data, because it allows the domain expert to adjust the parameters of the encounter detection algorithm interactively. Our approach leverages a designated data structure and requires only a single pass over chronologically ordered data, resulting in a highly scalable and fast technique for detecting encounters. Consequently, users can explore their data by interactively specifying the spatial and temporal windows that define encounters. We evaluate the proposed method as a function of its input parameters and data size. We instantiate it on urban public transportation data, where we find a large number of encounters, and show that individual encounters combine into higher-level patterns that are of particular interest and value to the domain.
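The single-pass idea can be illustrated with a short sketch: records are read in chronological order and each new record is compared only against a sliding temporal window of recent records. This is not the paper's designated data structure (which the abstract does not detail); the `Record` class, the thresholds, and the brute-force within-window comparison are illustrative assumptions.

```python
# Minimal sketch of single-pass encounter detection over time-ordered
# movement records. Names and thresholds are illustrative, not the
# paper's actual data structure.
from collections import deque
from dataclasses import dataclass
from math import hypot

@dataclass
class Record:
    obj_id: str   # moving object identifier
    t: float      # timestamp (seconds)
    x: float      # position (e.g. projected metres)
    y: float

def detect_encounters(records, max_dist=50.0, max_dt=60.0):
    """Yield (obj_a, obj_b, t) for objects within max_dist and max_dt.

    `records` must be sorted by timestamp; a sliding temporal window keeps
    only candidates that can still satisfy the temporal constraint, so the
    data is traversed exactly once.
    """
    window = deque()                       # recent records still inside max_dt
    for rec in records:
        # Drop records that fell out of the temporal window.
        while window and rec.t - window[0].t > max_dt:
            window.popleft()
        # Compare the new record against the remaining candidates.
        for other in window:
            if other.obj_id != rec.obj_id and \
               hypot(rec.x - other.x, rec.y - other.y) <= max_dist:
                yield (other.obj_id, rec.obj_id, rec.t)
        window.append(rec)

# Example: two vehicles passing each other.
data = [Record("bus_1", 0, 0, 0), Record("bus_2", 30, 20, 10),
        Record("bus_1", 120, 500, 0)]
print(list(detect_encounters(data)))   # [('bus_1', 'bus_2', 30)]
```

In this simplified form the spatial check inside the window is brute force; a spatial index over the window contents would be a natural refinement and is presumably what the paper's designated data structure provides.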
As modern neural networks have grown to billions of parameters, meeting tight latency budgets has become increasingly challenging. Approaches such as compression, sparsification, and network pruning have proven effective at tackling this problem, but they rely on modifying the underlying network. In this paper, we take a complementary approach: optimizing how tensors are mapped to on-chip memory in an inference accelerator while leaving the network parameters untouched. Since different memory components trade off capacity against bandwidth differently, a sub-optimal mapping can result in high latency. We introduce evolutionary graph reinforcement learning (EGRL), a method combining graph neural networks, reinforcement learning (RL), and evolutionary search, which aims to find the mapping that minimizes latency. Furthermore, a set of fast, stateless policies guides the evolutionary search to improve sample efficiency. We train and validate our approach directly on the Intel NNP-I chip for inference with a batch size of 1. EGRL outperforms policy-gradient, evolutionary-search, and dynamic-programming baselines on BERT, ResNet-101, and ResNet-50, achieving a 28-78% speed-up over the native NNP-I compiler on all three workloads.
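To make the evolutionary-search component concrete, the following sketch evolves a tensor-to-memory-level mapping under a toy surrogate latency model. It omits the GNN and RL policies that distinguish EGRL and does not model the NNP-I memory hierarchy; the memory levels, tensor sizes, and cost function are illustrative assumptions only.

```python
# Minimal sketch of evolutionary search over tensor-to-memory mappings
# with a surrogate latency model. Capacities, bandwidths, and sizes are
# made up for illustration; they are not NNP-I parameters.
import random

MEM_LEVELS = {                      # hypothetical capacity (KB) / relative bandwidth
    "SRAM": {"capacity": 4096, "bandwidth": 8.0},
    "LLC":  {"capacity": 24576, "bandwidth": 3.0},
    "DRAM": {"capacity": float("inf"), "bandwidth": 1.0},
}
TENSORS = {f"t{i}": random.randint(64, 2048) for i in range(32)}   # sizes in KB

def latency(mapping):
    """Surrogate cost: transfer time per tensor plus a penalty for overflow."""
    used = {level: 0 for level in MEM_LEVELS}
    cost = 0.0
    for tensor, level in mapping.items():
        size = TENSORS[tensor]
        used[level] += size
        cost += size / MEM_LEVELS[level]["bandwidth"]
    for level, total in used.items():
        if total > MEM_LEVELS[level]["capacity"]:
            cost += 10.0 * (total - MEM_LEVELS[level]["capacity"])  # overflow penalty
    return cost

def mutate(mapping, rate=0.1):
    """Randomly reassign a fraction of tensors to a different memory level."""
    child = dict(mapping)
    for tensor in child:
        if random.random() < rate:
            child[tensor] = random.choice(list(MEM_LEVELS))
    return child

def evolve(pop_size=50, generations=100):
    pop = [{t: random.choice(list(MEM_LEVELS)) for t in TENSORS}
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=latency)
        elites = pop[: pop_size // 5]                 # keep the best fifth
        pop = elites + [mutate(random.choice(elites))
                        for _ in range(pop_size - len(elites))]
    return min(pop, key=latency)

best = evolve()
print("estimated latency:", round(latency(best), 1))
```

In EGRL the random mutations above would instead be guided by learned GNN policies and fast stateless heuristics, and fitness would come from measured latency on the chip rather than a surrogate.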
Mixed-precision quantization is a powerful tool for reducing the memory and compute requirements of neural network workloads by assigning different bit-width precisions to different compute operations. Recent research has made significant progress in applying mixed-precision quantization to reduce the memory footprint of various workloads while preserving task performance. Prior work, however, has often ignored additional objectives, such as bit-operations, that are important for deploying workloads on hardware. Here we present a flexible and scalable framework for automated mixed-precision quantization that optimizes multiple objectives. Our framework relies on Neuroevolution-Enhanced Multi-Objective Optimization (NEMO), a novel search method, to find Pareto-optimal mixed-precision configurations for memory and bit-operations objectives. Within NEMO, a population is divided into structurally distinct sub-populations (species) which jointly form the Pareto frontier of solutions for the multi-objective problem. At each generation, species are re-sized in proportion to the goodness of their contribution to the Pareto frontier. This allows NEMO to leverage established search techniques and neuroevolution methods to continually improve the Pareto frontier. In our experiments we use a graph-based representation of the underlying workload, enabling graph neural networks trained by NEMO to find Pareto-optimal configurations for various workloads trained on ImageNet. Compared to the state of the art, we achieve competitive results on memory compression and superior results on compute compression for MobileNet-V2, ResNet50, and ResNeXt-101-32x8d, one of the largest ImageNet models, which amounts to a search space of approximately 10^146 configurations. A deeper analysis of the results shows that both the graph representation and the species-based approach are critical to finding effective configurations for all workloads.
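The species re-sizing step can be sketched as follows: compute the non-dominated (Pareto) front over the two objectives and allocate the next generation's slots to each species in proportion to its presence on that front. The candidate encoding, objective values, and allocation rule below are illustrative assumptions, not the NEMO implementation.

```python
# Minimal sketch of Pareto-frontier computation and proportional species
# re-sizing for two minimization objectives (memory, bit-operations).
# Candidate values and species labels are synthetic.
import random
from collections import Counter

def dominates(a, b):
    """a dominates b if it is no worse in both objectives and differs in at least one."""
    return a[0] <= b[0] and a[1] <= b[1] and a != b

def pareto_front(population):
    """Return the non-dominated candidates."""
    return [p for p in population
            if not any(dominates(q["obj"], p["obj"]) for q in population)]

def resize_species(population, total_slots):
    """Give each species a share of slots proportional to its frontier presence.

    In this simplified rule, species with no candidate on the front
    receive no slots for the next generation.
    """
    front = pareto_front(population)
    counts = Counter(p["species"] for p in front)
    total = sum(counts.values())
    return {s: max(1, round(total_slots * c / total)) for s, c in counts.items()}

# Synthetic population: each candidate carries a species tag and two
# objective values (memory footprint, bit-operations), both minimized.
population = [{"species": random.choice(["A", "B", "C"]),
               "obj": (random.uniform(1, 10), random.uniform(1, 10))}
              for _ in range(60)]
print(resize_species(population, total_slots=60))
```

In the full framework each candidate would be a mixed-precision configuration scored by a GNN over the workload graph, rather than a random pair of objective values.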