Recent advances in artificial intelligence have enhanced the abilities of mobile robots in dealing with complex and dynamic scenarios. However, to enable computationally intensive algorithms to be executed locally in multitask robots with low latency and high efficiency, innovations in computing hardware are required. Here, we report TianjicX, neuromorphic computing hardware that can support true concurrent execution of multiple cross-computing-paradigm neural network (NN) models with various coordination manners for robotics. With spatiotemporal elasticity, TianjicX can support adaptive allocation of computing resources and scheduling of execution time for each task. Key to this approach is a high-level model, “Rivulet,” which bridges the gap between robotic-level requirements and hardware implementations. It abstracts the execution of NN tasks through the distribution of static data and the streaming of dynamic data to form the basic activity context, adopts time and space slices to achieve elastic resource allocation for each activity, and performs configurable hybrid synchronous-asynchronous grouping. Rivulet is thereby capable of supporting both independent and interactive execution. Building on Rivulet with a hardware design realizing spatiotemporal elasticity, a 28-nanometer TianjicX neuromorphic chip with event-driven execution, high parallelism, low latency, and low power was developed. Using a single TianjicX chip and a specially developed compiler stack, we built Tianjicat, a multi-intelligent-tasking mobile robot, to perform a cat-and-mouse game. Multiple tasks, including sound recognition and tracking, object recognition, obstacle avoidance, and decision-making, can be executed concurrently. Compared with the NVIDIA Jetson TX2, latency is reduced by a factor of 79.09 and dynamic power by 50.66%.
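The Rivulet idea summarized above (activities that hold static data, stream dynamic data, and receive time and space slices under hybrid synchronous-asynchronous grouping) can be pictured with a small toy scheduler. The Python sketch below is purely illustrative: the class names, fields, and scheduling policy are assumptions made for exposition and are not the TianjicX or Rivulet API.

```python
# Illustrative sketch only: a toy scheduler showing how an activity-based model
# like Rivulet might allocate "space slices" (groups of cores) and "time slices"
# to concurrently running NN tasks. All names and policies here are hypothetical.

from dataclasses import dataclass, field


@dataclass
class Activity:
    """One NN task context: static data (weights) is distributed to its cores
    once; dynamic data (activations) streams through them each time slice."""
    name: str
    space_slice: int          # number of cores requested
    time_slice: int           # execution window per scheduling round (arbitrary units)
    group: str = "async"      # "sync" members step together; "async" run freely
    assigned_cores: list = field(default_factory=list)


def allocate(activities, total_cores):
    """Greedy spatial allocation: give each activity a disjoint block of cores."""
    next_core = 0
    for act in activities:
        if next_core + act.space_slice > total_cores:
            raise RuntimeError(f"not enough cores for {act.name}")
        act.assigned_cores = list(range(next_core, next_core + act.space_slice))
        next_core += act.space_slice
    return activities


def run_round(activities):
    """One scheduling round: sync-group members advance in lockstep, while
    async activities each consume their own time slice independently."""
    sync = [a for a in activities if a.group == "sync"]
    if sync:
        step = min(a.time_slice for a in sync)   # lockstep on the shortest slice
        for a in sync:
            print(f"[sync ] {a.name:12s} cores={a.assigned_cores} ran {step} units")
    for a in activities:
        if a.group == "async":
            print(f"[async] {a.name:12s} cores={a.assigned_cores} ran {a.time_slice} units")


if __name__ == "__main__":
    tasks = [
        Activity("sound_track", space_slice=2, time_slice=4, group="sync"),
        Activity("object_rec",  space_slice=4, time_slice=8, group="sync"),
        Activity("obstacle",    space_slice=2, time_slice=2, group="async"),
        Activity("decision",    space_slice=1, time_slice=1, group="async"),
    ]
    allocate(tasks, total_cores=10)
    run_round(tasks)
```

In this toy version, the sync group models interactive tasks that must exchange data every round, while async activities model independent tasks that only need their own resources and timing.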
Place recognition is an essential spatial intelligence capability for robots to understand and navigate the world. However, recognizing places in natural environments remains challenging for robots because of resource limitations and changing environments. In contrast, humans and animals can robustly and efficiently recognize hundreds of thousands of places under different conditions. Here, we report a brain-inspired general place recognition system, dubbed NeuroGPR, that enables robots to recognize places by mimicking the neural mechanisms of multimodal sensing, encoding, and computing through a continuum of space and time. Our system consists of a multimodal hybrid neural network (MHNN) that encodes and integrates multimodal cues from both conventional and neuromorphic sensors. Specifically, to encode different sensory cues, we built neural networks of spatial view cells, place cells, head direction cells, and time cells. To integrate these cues, we designed a multiscale liquid state machine that processes and fuses multimodal information effectively and asynchronously using diverse neuronal dynamics and bioinspired inhibitory circuits. We deployed the MHNN on Tianjic, a hybrid neuromorphic chip, and integrated it into a quadruped robot. Our results show that NeuroGPR outperforms conventional and existing biologically inspired approaches, exhibiting robustness to diverse environmental uncertainty, including perceptual aliasing, motion blur, and changes in lighting or weather. Running NeuroGPR as an overall multi–neural network workload on Tianjic yields 10.5 times lower latency and 43.6% lower power consumption than the commonly used mobile robot processor, the NVIDIA Jetson Xavier NX.
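To make the encoding-and-fusion idea concrete, the following is a minimal Python sketch of a liquid-state-machine-style reservoir of leaky integrate-and-fire neurons that mixes two input streams sampled at different rates and includes an inhibitory subpopulation. The sizes, time constants, and wiring are invented for illustration and do not reproduce the MHNN or its Tianjic deployment.

```python
# Illustrative sketch only: a miniature liquid state machine (LSM) with leaky
# integrate-and-fire (LIF) neurons fusing two sensory streams of different rates.
# All constants and connectivity are hypothetical.

import numpy as np

rng = np.random.default_rng(0)

N = 100                        # reservoir size
n_inh = 20                     # last 20 neurons are inhibitory
tau = np.where(np.arange(N) < N // 2, 5.0, 20.0)   # two membrane timescales
v_th, v_reset = 1.0, 0.0

# Sparse random recurrent weights; inhibitory neurons project negative weights.
W = rng.normal(0, 0.1, (N, N)) * (rng.random((N, N)) < 0.1)
W[:, N - n_inh:] = -np.abs(W[:, N - n_inh:])

# Each modality drives its own (overlapping) subset of the reservoir.
W_vis = np.abs(rng.normal(0, 0.5, (N, 8))) * (np.arange(N) < 60)[:, None]
W_aud = np.abs(rng.normal(0, 0.5, (N, 4))) * (np.arange(N) >= 40)[:, None]

v = np.zeros(N)
spikes = np.zeros(N)
states = []

for t in range(200):
    vis = rng.random(8) if t % 2 == 0 else np.zeros(8)    # fast stream (every 2 steps)
    aud = rng.random(4) if t % 10 == 0 else np.zeros(4)   # slow stream (every 10 steps)
    I = W_vis @ vis + W_aud @ aud + W @ spikes             # multimodal + recurrent drive
    v += (-v + I) / tau                                    # leaky integration
    spikes = (v >= v_th).astype(float)                     # threshold crossing
    v = np.where(spikes > 0, v_reset, v)                   # reset after spiking
    states.append(spikes.copy())

# The time-averaged reservoir state acts as a fused multimodal feature vector
# that a simple linear readout could map to place identities.
fused = np.mean(states, axis=0)
print("fused feature vector shape:", fused.shape, " mean firing rate:", fused.mean())
```

The point of the sketch is only the mechanism: asynchronous inputs of different rates perturb a shared recurrent pool whose mixed excitatory/inhibitory dynamics turn them into one state that downstream readouts can use.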
The rapid development of deep learning has propelled many real-world artificial intelligence (AI) applications. Many of these applications integrate multiple neural network (multi-NN) models to provide various functionalities. Although a number of multi-NN acceleration technologies have been explored, few can fully satisfy the flexibility and scalability required by emerging and diverse AI workloads, especially on mobile platforms. Among these, homogeneous multi-core architectures have great potential to support multi-NN execution by leveraging decentralized parallelism and intrinsic scalability. However, the advantages of multi-core systems are underexploited because of the adoption of bulk synchronous parallelism (BSP), which cannot efficiently accommodate the diversity of multi-NN workloads. This paper reports a hierarchical multi-core architecture with asynchronous parallelism that enhances multi-NN execution for higher performance and utilization. Its theoretical foundation is hierarchical asynchronous parallelism (HASP), which establishes a programmable, grouped, and dynamic synchronous-asynchronous framework for multi-NN acceleration. HASP can be implemented on a typical multi-core processor with minor modifications. We further developed a prototype chip to validate the hardware effectiveness of this design. A mapping strategy that combines spatial partitioning and temporal tuning is also developed, allowing the proposed architecture to improve resource utilization and throughput simultaneously.
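As a rough picture of that mapping strategy, the sketch below partitions cores among several NN workloads in proportion to an assumed compute cost (spatial partitioning) and then derives a per-group step time so that groups can proceed without a global barrier (temporal tuning). The workload figures, formulas, and function names are hypothetical and are not taken from the HASP design.

```python
# Illustrative sketch only: a toy combination of spatial partitioning and
# temporal tuning for mapping several NN workloads onto a multi-core processor.
# Numbers and cost models are invented for demonstration.


def spatial_partition(workloads, total_cores):
    """Split cores among workloads roughly in proportion to their compute cost,
    guaranteeing at least one core per workload."""
    total_cost = sum(w["ops"] for w in workloads)
    alloc = {}
    remaining = total_cores
    for w in sorted(workloads, key=lambda w: w["ops"], reverse=True):
        cores = max(1, round(total_cores * w["ops"] / total_cost))
        # Leave at least one core for each workload not yet allocated.
        cores = min(cores, remaining - (len(workloads) - len(alloc) - 1))
        alloc[w["name"]] = cores
        remaining -= cores
    return alloc


def temporal_tune(workloads, alloc, cycle_per_op=1e-6):
    """Estimate each group's natural step time; groups synchronize only
    internally and run asynchronously with respect to each other, so no
    global BSP barrier is required."""
    return {
        w["name"]: w["ops"] * cycle_per_op / alloc[w["name"]]
        for w in workloads
    }


if __name__ == "__main__":
    nets = [
        {"name": "detector",   "ops": 8e6},
        {"name": "classifier", "ops": 4e6},
        {"name": "planner",    "ops": 1e6},
    ]
    cores = spatial_partition(nets, total_cores=16)
    step_time = temporal_tune(nets, cores)
    for name in cores:
        print(f"{name:10s} cores={cores[name]:2d}  step={step_time[name] * 1e3:.2f} ms")
```

The intended contrast with BSP is visible in the output: each workload group ends up with its own core budget and its own step period, rather than all cores waiting on the slowest network at a global barrier.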