Computational neuroscience relies on gradient descent (GD) for training artificial neural network (ANN) models of the brain. The advantage of GD is that it is effective at learning difficult tasks. However, it produces ANNs that are a poor phenomenological fit to biology, making them less relevant as models of the brain. Specifically, it violates Dale's law, by allowing synapses to change from excitatory to inhibitory and leads to synaptic weights that are not log-normally distributed, contradicting experimental data. Here, starting from first principles of optimisation theory, we present an alternative learning algorithm, exponentiated gradient (EG), that respects Dale's Law and produces log-normal weights, without losing the power of learning with gradients. We also show that in biologically relevant settings EG outperforms GD, including learning from sparsely relevant signals and dealing with synaptic pruning. Altogether, our results show that EG is a superior learning algorithm for modelling the brain with ANNs.