Memory-Constrained Vectorization and Scheduling of Dataflow Graphs for Hybrid CPU-GPU Platforms

Xie

Front. Comput. Neurosci.

et al. 2020

Self Cite

Real-time neuron detection and neural activity extraction are critical components of real-time neural decoding. In this paper, we propose a novel real-time neuron detection and activity extraction system using a dataflow framework to provide real-time performance and adaptability to new algorithms and hardware platforms. The proposed system was evaluated on simulated calcium imaging data, calcium imaging data with manual annotation, and calcium imaging data of the anterior lateral motor cortex. We found that the proposed system accurately detected neurons and extracted neural activities in real time without any requirement for expensive, cumbersome, or special-purpose computing hardware. We expect that the system will enable cost-effective, real-time calcium imaging-based neural decoding, leading to precise neuromodulation.

Section: Background and Related Work (mentioning)

confidence: 99%

Real-Time Neuron Detection and Neural Signal Extraction Platform for Miniature Calcium Imaging

Lecture Notes in Computer Science

“…Many tools are able to analyze SDF graphs, to derive various properties (e.g. mapping and buffer size), and finally to generate the glue code of the schedule automatically: for example, DIF-GPU [14], PREESM [15], MAPS [16], Diplomat [17], Gaspard [18], PeaCE [19], and Ptolemy [20]. But these tools either do not jointly consider real-time execution and FPP scheduling, or do not perform all syntheses automatically.…”

Section: Related Work (mentioning)

confidence: 99%

A Framework for Fixed Priority Periodic Scheduling Synthesis from Synchronous Data-Flow Graphs

Tran

Honorat

Bhattacharyya

et al. 2022

Self Cite

Synchronous data-flow graphs (SDF) are widely used in the design of concurrent real-time digital signal processing applications on multiprocessor system-on-chip. The increasing complexity of these hardware platforms advocates the use of real-time operating systems and fixed-priority scheduling to manage applications and resources. This trend calls for new methods to synthesize and implement actors in SDF graphs as real-time tasks with computed scheduling parameters (periods, priorities, processor mapping, etc.). This article presents a framework supporting scheduling synthesis, scheduling simulation, and code generation of these graphs. The scheduling synthesis maps each actor to a periodic real-time task and computes the appropriate buffer sizes and scheduling parameters. The results are verified by a scheduling simulator and instantiated by a code generator targeting the RTEMS (Real-Time Executive for Multiprocessor Systems) operating system. Experiments are conducted to evaluate the framework's performance and scalability as well as the overhead induced by the code generator.

EURASIP J. Adv. Signal Process.

“…In OpenCL terminology, the vectorization degree is commonly referred to as the number of global work items. Careful optimization of vectorization degrees can have major performance benefit for GPU acceleration of dataflow graphs [19].…”

Section: Throughput Optimization (mentioning)

confidence: 99%

Optimized implementation of digital signal processing applications with gapless data acquisition

Liu

Barford

Bhattacharyya

2019

Self Cite

This paper presents novel models and design optimization methods for gapless deep waveform applications, where continuous streams of data must be processed reliably without dropping any samples. The approaches developed in this paper involve unified dataflow-based modeling of the interfaces and signal processing functionality of gapless deep waveform analysis. Bottleneck actors (computational modules) in the resulting dataflow model are then identified and tackled with approximate computing techniques. These techniques are developed and configured carefully so that large performance gains are achieved while keeping reductions in signal processing accuracy to a manageable level. Efficient actor- and graph-level code optimization techniques are also applied to further improve real-time performance. In addition to providing accurate, real-time processing on the experimental platform used in our experiments, the algorithm- and model-based formulation of the contributions in this part promotes their general utility in deep waveform analysis and their retargetability to other platforms.