Scaling analysis of a neocortex inspired cognitive model on the Cray XD1

Rice, Kenneth; Taha, Tarek M.; Vutsinas, Christopher N.

doi:10.1007/s11227-008-0195-z

Cited by 15 publications

(7 citation statements)

References 23 publications


“…These difficulties notwithstanding, there have been several machine learning techniques translated successfully to work in parallel on FPGAs, all reporting significant improvements over standard software implementations on conventional CPU architectures. This includes a neurologically inspired hierarchical bayesian model used for invariant object recognition [42], an implementation of the RankBoost for web search relevance ranking [48] and a low-precision implementation of Support Vector Machines using Sequential Minimal Optimisation [8].…”

Section: Single Machine Parallelism (mentioning)

confidence: 99%

Practical scalable image analysis and indexing using Hadoop

Hare

Samangooei

Lewis

2012

Multimed Tools Appl


The ability to handle very large amounts of image data is important for image analysis, indexing and retrieval applications. Sadly, in the literature, scalability aspects are often ignored or glanced over, especially with respect to the intricacies of actual implementation details. In this paper we present a case-study showing how a standard bag-of-visual-words image indexing pipeline can be scaled across a distributed cluster of machines. In order to achieve scalability, we investigate the optimal combination of hybridisations of the MapReduce distributed computational framework which allows the components of the analysis and indexing pipeline to be effectively mapped and run on modern server hardware. We then demonstrate the scalability of the approach practically with a set of image analysis and indexing tools built on top of the Apache Hadoop MapReduce framework. The tools used for our experiments are freely available as open-source software, and the paper fully describes the nuances of their implementation.

“…Rice et al [25] have proposed a neocortex-inspired cognitive model on the Cray XD1 supercomputer. The HTM, based on a hierarchical Bayesian network model proposed in [11], uses advanced software and reconfigurable hardware implementations to scale a model based on the human visual cortex to interesting problems.…”

Section: Related Work (mentioning)

confidence: 99%

“…The HTM, based on a hierarchical Bayesian network model proposed in [11], uses advanced software and reconfigurable hardware implementations to scale a model based on the human visual cortex to interesting problems. Like ourselves, Rice et al [25] take advantage of a massive amount of inherent parallelism in a model based on the neocortex. However, as described above, our implementation of a neocortex-inspired model does not use Bayesian inference.…”

Section: Related Work (mentioning)

confidence: 99%

Profiling Heterogeneous Multi-GPU Systems to Accelerate Cortically Inspired Learning Algorithms

Nere

Hashmi

Lipasti

2011

2011 IEEE International Parallel & Distributed Processing Symposium


Recent advances in neuroscientific understanding make parallel computing devices modeled after the human neocortex a plausible, attractive, fault-tolerant, and energy-efficient possibility. Such attributes have once again sparked an interest in creating learning algorithms that aspire to reverse-engineer many of the abilities of the brain. In this paper we describe a GPGPU-accelerated extension to an intelligent learning model inspired by the structural and functional properties of the mammalian neocortex. Our cortical network, like the brain, exhibits massive amounts of processing parallelism, making today's GPGPUs a highly attractive and readily-available hardware accelerator for such a model. Furthermore, we consider two inefficiencies inherent to our initial design: multiple kernel-launch overhead and poor utilization of GPGPU resources. We propose optimizations such as a software work-queue structure and pipelining the hierarchical layers of the cortical network to mitigate such problems. Our analysis provides important insight into the GPU architecture details including the number of cores, the memory system, and the global thread scheduler. Additionally, we create a runtime profiling tool for our parallel learning algorithm which proportionally distributes the cortical network across the host CPU as well as multiple GPUs, whether homogeneous or heterogeneous, that may be available to the system. Using the profiling tool with these optimizations on Nvidia's CUDA framework, we achieve up to 60x speedup over a single-threaded CPU implementation of the model.


“…is still struggling to achieve system-level "general-purpose artificial intelligence" [12]. But recently, the computational neuroscience community has begun developing scalable Bayesian models (based on Bayesian framework [13]-[15]) that have the potential of being applied to large-scale applications, such as speech recognition, computer vision, image content recognition, robotic control, and making sense of massive quantities of data [4], [16]. Some of these new algorithms are ideal candidates for large-scale hardware investigation (and future implementation), especially if they can leverage the high-density processing/storage advantages of hybrid nanoelectronics [1], [3].…”

Section: CMOL/CMOS Implementations of Bayesian Polytree (mentioning)

confidence: 99%

“…The recent work by Rice et al [16], [25] and Vutsinas et al [26] explores the combined implementations in regions 3 and 4, for the GHM [13]. At this time, we are not aware of any work on custom hardware implementation of the GHM [13].…”

Section: B. Existing Hardware Implementations of GHM (mentioning)

confidence: 99%

CMOL/CMOS Implementations of Bayesian Polytree Inference: Digital and Mixed-Signal Architectures and Performance/Price

Zaveri

Hammerstrom

2010

IEEE Trans. Nanotechnology


In this paper, we focus on aspects of the hardware implementation of the Bayesian inference framework within the George and Hawkins' model. This framework is based on Judea Pearl's belief propagation. We then present a "hardware design space exploration" methodology for implementing and analyzing the (digital and mixed-signal) hardware for the Bayesian (polytree) inference framework. This particular methodology involves: analyzing the computational/operational cost and the related microarchitecture, exploring candidate hardware components, proposing various custom architectures using both traditional CMOS and hybrid nanotechnology CMOS/nanowire/molecular hybrid (CMOL), and investigating the baseline performance/price of these hardware architectures. The results suggest that hybrid nanotechnology is a promising candidate to implement Bayesian inference. Such implementations utilize the very high density storage/computation benefits of these new nanoscale technologies much more efficiently; for example, the throughput per 858 mm² (TPM) obtained for CMOL-based architectures is 32-40 times better than the TPM for a CMOS-based multiprocessor/multifield-programmable gate array system, and almost 2000 times better than the TPM for a single PC implementation. In general, the assessment of such hypothetical hardware architectures provides a baseline for large-scale implementations of Bayesian inference, and guidance for implementing the same using nanogrid structures.
