Spiking neural network (SNN) distinguish themselves from artificial neural network (ANN) because of their inherent temporal processing and spike-based computations, enabling a power-efficient implementation in neuromorphic hardware. In this study, we demonstrate that data processing with spiking neurons can be enhanced by co-learning the synaptic weights with two other biologically inspired neuronal features: (1) a set of parameters describing neuronal adaptation processes and (2) synaptic propagation delays. The former allows a spiking neuron to learn how to specifically react to incoming spikes based on its past. The trained adaptation parameters result in neuronal heterogeneity, which leads to a greater variety in available spike patterns and is also found in the brain. The latter enables to learn to explicitly correlate spike trains that are temporally distanced. Synaptic delays reflect the time an action potential requires to travel from one neuron to another. We show that each of the co-learned features separately leads to an improvement over the baseline SNN and that the combination of both leads to state-of-the-art SNN results on all speech recognition datasets investigated with a simple 2-hidden layer feed-forward network. Our SNN outperforms the benchmark ANN on the neuromorphic datasets (Spiking Heidelberg Digits and Spiking Speech Commands), even with fewer trainable parameters. On the 35-class Google Speech Commands dataset, our SNN also outperforms a GRU of similar size. Our study presents brain-inspired improvements in SNN that enable them to excel over an equivalent ANN of similar size on tasks with rich temporal dynamics.