This paper presents the implementation and scaling of a neocortex-inspired cognitive model on a Cray XD1. Both software and reconfigurable-logic (FPGA) implementations of the model are examined. The model belongs to a new class of biologically inspired cognitive models, and large-scale versions of such models have the potential for significantly stronger inference capabilities than current conventional computing systems. Because these models exhibit large amounts of parallelism and require only simple computations, they lend themselves to highly efficient hardware implementations. As a result, hardware acceleration of these models can produce significant speedups over software-only implementations. Parallel software and hardware-accelerated implementations of such a model are investigated for networks of varying complexity. A scaling analysis of these networks is presented and used to estimate the throughput of both hardware-accelerated and software implementations of larger networks that utilize the full resources of the Cray XD1. Our results indicate that hardware acceleration can provide an average throughput gain of 75 times over software-only implementations of the networks examined on this system.