A coarse-grained Reconfigurable Processing Unit (RPU) consisting of 16×16 multi-functional Processing Elements (PEs) interconnected by area-efficient Line-Switched Mesh Connect (LSMC) routing is implemented on a 5.4 mm × 3.1 mm die in TSMC 65 nm LP1P8M CMOS technology. A Hierarchical Configuration Context (HCC) organization scheme is proposed to reduce the implementation overhead and the energy spent on fast reconfiguration. The proposed RPU is integrated into two Systems-on-a-Chip (SoCs) targeting multiple-standard video decoding. The high-performance chip (named REMUS_HPP), comprising two RPU processors, can decode 1920×1080 H.264 video streams at 30 frames per second (fps) at 200 MHz. REMUS_HPP achieves a 25% performance gain over the XPP-III reconfigurable processor while consuming only 280 mW, resulting in a 14.3× improvement in energy efficiency. The other chip (named REMUS_LPP), targeting low-power applications, integrates only one RPU processor. REMUS_LPP can decode 720×480 H.264 video streams at 35 fps while consuming 24.5 mW at 75 MHz, achieving a 76% reduction in power dissipation and a 3.96× improvement in energy efficiency compared with the ADRES reconfigurable processor.
Index Terms: Coarse-grained reconfigurable array, reconfigurable computing, video decoding.
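The power and frame-rate figures quoted in the abstract above can be turned into a rough energy-per-frame estimate. The sketch below is only illustrative arithmetic on those published numbers; the function names are hypothetical, and the per-frame and per-pixel metrics are an assumption made here, not necessarily the energy-efficiency metric the authors used for their 14.3× and 3.96× comparisons.

```python
def energy_per_frame_mj(power_mw: float, fps: float) -> float:
    """Energy per decoded frame in millijoules (mW divided by frames/s)."""
    return power_mw / fps


def energy_per_pixel_nj(power_mw: float, fps: float, width: int, height: int) -> float:
    """Energy per decoded pixel in nanojoules (1 mJ = 1e6 nJ)."""
    return energy_per_frame_mj(power_mw, fps) * 1e6 / (width * height)


# Figures taken from the abstract: (power in mW, fps, width, height).
chips = {
    "REMUS_HPP": (280.0, 30.0, 1920, 1080),
    "REMUS_LPP": (24.5, 35.0, 720, 480),
}

for name, (power_mw, fps, width, height) in chips.items():
    per_frame = energy_per_frame_mj(power_mw, fps)
    per_pixel = energy_per_pixel_nj(power_mw, fps, width, height)
    print(f"{name}: {per_frame:.2f} mJ/frame, {per_pixel:.2f} nJ/pixel")
```

Normalizing by pixel count makes the two chips comparable despite their different target resolutions.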
SUMMARY This paper proposes a novel sub-architecture to optimize the data flow of REMUS-II (REconfigurable MUltimedia System 2), a dynamically reconfigurable coarse-grained architecture. REMUS-II consists of a µPU (Micro-Processor Unit) and two RPUs (Reconfigurable Processor Units), which are used to speed up control-intensive tasks and data-intensive tasks respectively. The parallel computing capability and flexibility of REMUS-II make it an excellent candidate for processing multimedia applications, which require a large number of memory accesses. In this paper, we specifically optimize the data flow to deal with these performance-limiting and energy-hungry memory accesses in order to meet the bandwidth requirement of parallel computing. The RPU internal memory can work in multiple modes, such as 2D-access mode and transformation mode, according to different multimedia access patterns. This design can improve performance by up to 26% compared with traditional on-chip memory. Meanwhile, a block buffer is implemented to optimize the off-chip data flow by reducing off-chip memory accesses by up to 43% compared with direct DDR access. Based on RTL simulation, REMUS-II can achieve 1080p@30 fps for H.264 High Profile@Level 4 and High Level MPEG2 at a 200 MHz clock frequency. REMUS-II is implemented in 23.7 mm² of silicon in a TSMC 65 nm logic process with a 400 MHz maximum working frequency.
Objective
To develop a deep learning algorithm for detection of active inflammatory sacroiliitis in short tau inversion recovery (STIR) sequence magnetic resonance imaging (MRI).
Methods
A total of 326 participants with axial spondyloarthritis (SpA) and 63 participants with non-specific back pain (NSBP) were recruited. STIR MRI of the SI joints was performed and clinical data were collected. Regions of interest (ROIs) were drawn outlining bone marrow oedema, a reliable marker of active inflammation; these formed the ground truth masks from which “fake-colour” images were derived. Both the original and “fake-colour” images were randomly allocated to either the training and validation dataset or the testing dataset. Attention U-net was used to develop the deep learning algorithms. For comparison, an independent radiologist and a rheumatologist, both blinded to the ground truth masks, were tasked with identifying bone marrow oedema in the MR images.
Results
Inflammatory sacroiliitis was identified in 1398 MR images from 228 participants. No inflammation was found in 3944 MR images from 161 participants. The mean sensitivities of the algorithms derived from the original dataset and the “fake-colour” image dataset were 0.86 ± 0.02 and 0.90 ± 0.01 respectively. The mean specificities of the algorithms derived from the original and “fake-colour” image datasets were 0.92 ± 0.02 and 0.93 ± 0.01 respectively. The mean testing dice coefficients were 0.48 ± 0.27 for the original dataset and 0.51 ± 0.25 for the “fake-colour” image dataset. The areas under the receiver operating characteristic curve (AUC-ROC) of the algorithms using the original dataset and the “fake-colour” image dataset were 0.92 and 0.96 respectively. The sensitivity and specificity of the algorithms were comparable to those of the radiologist's interpretation and exceeded those of the rheumatologist.
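For readers unfamiliar with the reported metrics, the following is a minimal sketch of how a dice coefficient and image-level sensitivity and specificity are commonly computed. It is not the authors' evaluation code; the function names and the toy masks and labels are hypothetical, and treating two empty masks as a perfect match is a convention assumed here.

```python
def dice_coefficient(pred_mask, truth_mask):
    """Dice overlap between two binary 2D masks given as nested lists."""
    pred = [bool(v) for row in pred_mask for v in row]
    truth = [bool(v) for row in truth_mask for v in row]
    overlap = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * overlap / total


def sensitivity_specificity(pred_labels, true_labels):
    """Image-level sensitivity and specificity from binary labels.

    Assumes at least one positive and one negative true label.
    """
    pairs = [(bool(p), bool(t)) for p, t in zip(pred_labels, true_labels)]
    tp = sum(p and t for p, t in pairs)
    tn = sum((not p) and (not t) for p, t in pairs)
    fp = sum(p and (not t) for p, t in pairs)
    fn = sum((not p) and t for p, t in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data for illustration only (not taken from the study).
predicted = [[1, 1], [0, 0]]
ground_truth = [[1, 0], [0, 0]]
print(dice_coefficient(predicted, ground_truth))
print(sensitivity_specificity([1, 0, 1, 0], [1, 1, 0, 0]))
```

Dice is a pixel-level overlap score, so it can sit well below image-level sensitivity and specificity when lesions are detected but not outlined precisely.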
Conclusion
An MRI deep learning algorithm was developed for detection of inflammatory sacroiliitis in axial SpA.