Neuronal coding and memory formation depend on temporal activation patterns spanning high-dimensional ensembles of neurons. To characterize these high-dimensional spike sequences, it is critical to measure their dissimilarity across different epochs (e.g. stimuli, brain states) in terms of all the relative spike-timing relationships. Such a dissimilarity measure can then be used for dimensionality reduction, clustering of sequences, or decoding. Here, we present a new measure of dissimilarity between multi-neuron spike sequences based on optimal transport theory, called SpikeShip. SpikeShip computes the optimal transport cost (Earth Mover's Distance) needed to make all the relative spike-timing relationships (across neurons) identical between two spiking patterns. It achieves this by computing the optimal transport of spikes across time, separately for each neuron, and then decomposing this transport cost into a temporal rigid translation term and a vector of neuron-specific transport flows. This yields a suitable geometry for spike sequences. SpikeShip can be effectively computed for high-dimensional neuronal ensembles and has linear O(N) cost. Furthermore, it is explicitly based on the higher-order structure in the spiking patterns. SpikeShip opens new avenues for studying neural coding and memory consolidation by finding patterns in high-dimensional neural ensembles.

June 4, 2020

spiking patterns and sensory or internal variables. They also pose unique mathematical challenges and opportunities for the unsupervised discovery of the "dictionary" of neuronal "code-words". For instance, how do we measure the similarity between two multi-neuron spiking patterns in terms of their temporal structure (spike sequence)? In general, any notion of information encoding relies on the construction of a distance or dissimilarity measure in an N-dimensional space.
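To make the notion of a distance between N-dimensional patterns concrete, the following minimal Python sketch computes a Hamming distance between binary strings and a rate-based distance (Euclidean distance and angle) between two multi-neuron spiking patterns. The function names, the toy spike times, and the 0.5 s window are illustrative assumptions, not taken from the paper.

```python
import math

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))

def firing_rates(spike_trains, t_start, t_stop):
    """Spikes/sec per neuron in the window [t_start, t_stop).

    spike_trains is a list with one list of spike times (in seconds) per neuron.
    """
    duration = t_stop - t_start
    return [sum(t_start <= t < t_stop for t in train) / duration
            for train in spike_trains]

def euclidean(u, v):
    """Euclidean distance between two rate vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def angle(u, v):
    """Angle (radians) between two non-zero rate vectors."""
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    if norm == 0.0:
        raise ValueError("angle is undefined for an all-zero rate vector")
    dot = sum(x * y for x, y in zip(u, v))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return math.acos(max(-1.0, min(1.0, dot / norm)))

# Hypothetical toy data: three neurons, each firing once,
# in opposite temporal order in the two patterns.
pattern_a = [[0.10], [0.20], [0.30]]
pattern_b = [[0.30], [0.20], [0.10]]

rates_a = firing_rates(pattern_a, 0.0, 0.5)
rates_b = firing_rates(pattern_b, 0.0, 0.5)

print(hamming("10110", "10011"))     # differing positions between two binary strings
print(euclidean(rates_a, rates_b))   # rate-based distance between the two patterns
print(angle(rates_a, rates_b))       # rate-based angle between the two patterns
```

In this toy case the two patterns have identical firing rates, so the rate-based distance between them vanishes even though the neurons fire in reversed order.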
For example, the distance between binary strings can be measured using the Hamming distance, which is an essential mathematical construct in error-correcting codes. In the brain, the distance between two multi-neuron spiking patterns is conventionally measured in two steps: (1) computing the number of spikes/sec (firing rate) for each neuron in a certain time window; and (2) computing, for example, the angle or Euclidean distance between the multi-neuron rate vectors. Using this method, it has been shown for example that high-dimensional neural ensembles span a low-dimensional manifold that relates to stimulus or behavioral variables in a meaningful way [9,10]. However, the computation of firing rates disregards potentially rich information contained in the precise temporal order in which spikes are fired by multiple neurons (e.g. neuron i firing at time t and neuron j firing at t + τ). For instance, we expect that any time-varying sensory stimulus (motion) or action sequence may be encoded by a unique multi-neuron temporal pattern of spiking. It has been shown that multi-neuron temporal sequences encode information about sensory stimuli or are required for the generation of complex m...