A standard circuit motif in sensory systems is the pooling of sensory information from an upstream neuronal layer. A downstream neuron thereby collects signals across different locations in stimulus space, which together compose the neuron's receptive field. In addition, nonlinear transformations in the signal transfer between the layers give rise to functional subunits inside the receptive field. For ganglion cells in the vertebrate retina, for example, receptive field subunits are thought to correspond to presynaptic bipolar cells. Identifying the number and locations of subunits from the stimulus-response relationship of a recorded ganglion cell remains an ongoing challenge, both for characterizing the retina's functional circuitry and for building computational models that capture nonlinear signal pooling. Here we present a novel version of spike-triggered non-negative matrix factorization (STNMF), which can extract localized subunits in ganglion-cell receptive fields from spiking responses recorded under spatiotemporal white-noise stimulation. The method runs more than 100 times faster than a previous implementation, a speedup that can be harnessed for systematic screening of hyperparameters, such as sparsity regularization. We demonstrate the power and flexibility of this approach by analyzing populations of ganglion cells from salamander and primate retina. We find that the subunits of midget as well as of parasol ganglion cells in the marmoset retina form separate mosaics that tile visual space. Moreover, the subunit mosaics of ON and OFF midget cells are aligned with each other, as are those of ON and OFF parasol cells, indicating a spatial coordination of ON and OFF signals at the bipolar-cell level. Thus, STNMF can reveal organizational principles of signal transmission between successive neural layers, which are not easily accessible by other means.
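To make the core idea concrete, the following is a minimal sketch rather than the implementation described here. It assumes a toy one-dimensional stimulus with two hypothetical rectifying subunits, collects the spike-triggered stimulus ensemble, and factorizes it with a plain semi-NMF (non-negative modules, unconstrained weights) using standard multiplicative updates. Sparsity regularization, the temporal stimulus dimension, and the accelerated solver are all omitted, and every name and parameter value (`semi_nmf`, `n_pixels`, the rate scaling of 0.2) is an illustrative assumption.

```python
import numpy as np


def semi_nmf(X, k, n_iter=200, seed=0, eps=1e-9):
    """Approximate X (n_spikes x n_pixels) as W @ M.T with M >= 0.

    Plain semi-NMF with multiplicative updates; an illustrative stand-in,
    not the accelerated, sparsity-regularized solver of the paper.
    """
    rng = np.random.default_rng(seed)
    n_pixels = X.shape[1]
    M = rng.random((n_pixels, k)) + 0.1  # non-negative modules (candidate subunits)
    errors = []
    for _ in range(n_iter):
        # Weights are unconstrained: least-squares solution given the modules.
        W = X @ M @ np.linalg.pinv(M.T @ M)
        A = X.T @ W  # n_pixels x k
        B = W.T @ W  # k x k
        # Split into positive and negative parts so the update keeps M >= 0.
        A_pos, A_neg = (np.abs(A) + A) / 2, (np.abs(A) - A) / 2
        B_pos, B_neg = (np.abs(B) + B) / 2, (np.abs(B) - B) / 2
        M *= np.sqrt((A_pos + M @ B_neg + eps) / (A_neg + M @ B_pos + eps))
        errors.append(np.linalg.norm(X - W @ M.T))
    return W, M, errors


# Toy model cell (assumed): two rectifying subunits pooled by one ganglion cell.
rng = np.random.default_rng(1)
n_frames, n_pixels = 4000, 20
stim = rng.choice([-1.0, 1.0], size=(n_frames, n_pixels))  # binary white noise

subunit_a = np.zeros(n_pixels)
subunit_a[4:9] = 1.0
subunit_b = np.zeros(n_pixels)
subunit_b[11:16] = 1.0

# Nonlinear pooling: each subunit signal is rectified before summation.
rate = 0.2 * (np.maximum(stim @ subunit_a, 0) + np.maximum(stim @ subunit_b, 0))
spike_counts = rng.poisson(rate)

# Spike-triggered ensemble: each frame repeated once per spike it elicited.
ste = np.repeat(stim, spike_counts, axis=0)

W, M, errors = semi_nmf(ste, k=2)
print("ensemble shape:", ste.shape, "module shape:", M.shape)
```

The columns of `M` play the role of candidate subunits. In practice the number of modules and the strength of any sparsity penalty are hyperparameters that would need to be screened, which is where a faster solver pays off.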