The brain is believed to implement probabilistic reasoning and to represent information via population, or distributed, coding. Most previous probabilistic population coding (PPC) theories share several basic properties: 1) continuous-valued neurons (units); 2) fully/densely distributed codes, i.e., all or most coding units participate in every code; 3) graded synapses; 4) rate coding; 5) units have innate unimodal, e.g., bell-shaped, tuning functions (TFs); 6) units are intrinsically noisy; and 7) noise/correlation is generally considered harmful. We present a radically different theory that assumes: 1) binary units; 2) only a small subset of units, i.e., a sparse distributed code (SDC) (a.k.a. cell assembly, ensemble), comprises any individual code; 3) binary synapses; 4) signaling formally requires only single, i.e., first, spikes; 5) units initially have completely flat TFs (all weights zero); 6) units are not inherently noisy; but rather 7) noise is a resource generated/used to cause similar inputs to map to similar codes, controlling a tradeoff between storage capacity and embedding the input-space statistics in the pattern of intersections over stored codes, which indirectly yields correlation patterns. The theory, Sparsey, was introduced 20 years ago as a canonical cortical circuit/algorithm model of efficient, generic spatiotemporal pattern learning/recognition, but it was not elaborated as an alternative to PPC-type theories. Here, we provide simulation results showing that the active SDC simultaneously represents not only the spatially/spatiotemporally most similar/likely input but also the coarsely ranked similarity/likelihood distribution over all stored inputs (hypotheses). Crucially, Sparsey's code selection algorithm (CSA), used for both learning and inference, achieves this with a single pass over the weights for each successive item of a sequence; it thus performs learning and probabilistic inference for spatiotemporal patterns in a number of steps that remains constant for the life of the system, i.e., as the number of stored items increases. We also discuss our approach as a radically new implementation of graphical probability modeling.
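To make the abstract's central mechanism concrete, the following is a minimal Python sketch, not Sparsey's actual CSA: it assumes a hypothetical arrangement of binary units into winner-take-all modules, binary weights that start at zero (flat TFs), a familiarity measure computed in a single pass over the weights, and a softmax/temperature choice step standing in for "noise as a resource," so that familiar (similar) inputs tend to map to overlapping sparse codes while novel inputs receive near-random codes. All names, the module structure, and the temperature rule are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def choose_sparse_code(x, W, num_modules, units_per_module, rng):
    """Illustrative one-pass sparse-code selection (a sketch, not Sparsey's actual CSA).

    x : binary input vector, shape (n_in,)
    W : binary weight matrix, shape (n_in, num_modules * units_per_module)
    Returns one winner per hypothetical competitive module (a fixed-size sparse
    distributed code) and a scalar familiarity measure G.
    """
    # Single pass over the weights: summed input to every coding unit.
    u = (x @ W).reshape(num_modules, units_per_module).astype(float)
    # Normalize by the number of active inputs to get match values in [0, 1].
    v = u / max(1, int(x.sum()))
    # Familiarity: high when every module contains a strongly matching unit.
    G = float(v.max(axis=1).mean())
    # Noise as a resource (illustrative rule): novel input (low G) -> high
    # temperature -> near-random winners, little overlap with stored codes;
    # familiar input (high G) -> low temperature -> winners reuse previously
    # assigned units, so similar inputs map to similar codes.
    temperature = (1.0 - G) + 1e-3
    logits = v / temperature
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    winners = np.array([rng.choice(units_per_module, p=p) for p in probs])
    return winners, G

def learn(x, W, winners, units_per_module):
    # Hebbian, single-trial learning: set binary weights from active inputs to winners.
    for m, j in enumerate(winners):
        W[x.astype(bool), m * units_per_module + j] = 1
    return W

# Usage: weights start flat (all zero); storing or re-presenting an input costs
# one pass over the weights, independent of how many codes have been stored.
rng = np.random.default_rng(0)
n_in, M, K = 64, 10, 8
W = np.zeros((n_in, M * K), dtype=np.uint8)
x = (rng.random(n_in) < 0.2).astype(np.uint8)
code, G = choose_sparse_code(x, W, M, K, rng)
W = learn(x, W, code, K)
```

Note the design point the sketch is meant to exhibit: code selection involves exactly one matrix-vector product over the weights per item, so its step count does not grow with the number of stored codes, which is the fixed-time property the abstract attributes to the CSA.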