We derive the optimal realizable Tomlinson-Harashima precoder for frequency-selective multiple-input multiple-output (MIMO) channels with respect to a simplified mean square error (MSE) criterion. Realizability means that, in contrast to other works, we do not restrict the internal filters of the precoder to have finite impulse responses (FIRs) but nevertheless ensure that the precoder can be operated in real time subject to a finite latency time which can be chosen by the system designer. In particular, this allows us to consider channels with infinite impulse responses (IIRs) as they occur, e.g., in digital subscriber lines. In our system model, the feedforward filter is located at the transmitter and an additional scalar gain is employed at the receiver. Relocating the feedforward filter to the transmitter allows us to consider decentralized receivers. However, it also makes it necessary to impose a transmit power constraint. As a consequence of the power constraint, the input signals at the receiver often have to be rescaled. We argue that the inclusion of the additional scalar gain allows us to incorporate the effect of this rescaling operation into the optimization. Special emphasis is put on the fast computation of the realizable Tomlinson-Harashima precoder via displacement structure theory. We propose a fast algorithm which includes a successive construction of a close-to-optimal ordering of the data streams. The complexity of this algorithm is only cubic in the number of channel inputs and quadratic in the latency time. The algorithm can also be applied in situations where certain FIR constraints have to be fulfilled, because we find that the optimal realizable Tomlinson-Harashima precoder involves only FIR filters whenever the channel is FIR. Our results are reproducible.

Index Terms: Fast algorithms, intersymbol interference, least mean squares algorithms, MIMO, precoding.
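
For readers unfamiliar with the basic mechanism, the following minimal sketch illustrates plain Tomlinson-Harashima precoding in the simplest possible setting. It is not the realizable MIMO precoder derived in this paper: it assumes a scalar, noise-free, monic FIR channel, uses only a feedback loop with a modulo device at the transmitter, and omits the feedforward filter, the receiver gain, and the stream ordering. The channel taps `h`, the modulo bound `m`, the 4-PAM alphabet, and all function names are assumptions chosen for illustration.

```python
import math
import random


def mod_2m(v, m):
    """Fold v into the interval [-m, m) by subtracting a multiple of 2m."""
    return v - 2.0 * m * math.floor((v + m) / (2.0 * m))


def thp_precode(symbols, h, m):
    """Scalar THP: pre-subtract the ISI of past transmit samples, then fold.

    Assumes a monic channel (h[0] == 1), so only taps h[1:] cause ISI.
    """
    x = []
    for n, s in enumerate(symbols):
        isi = sum(h[k] * x[n - k] for k in range(1, len(h)) if n - k >= 0)
        x.append(mod_2m(s - isi, m))
    return x


def channel(x, h):
    """Noise-free causal FIR channel: y[n] = sum_k h[k] * x[n - k]."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]


# Illustrative (assumed) parameters: monic 3-tap channel, 4-PAM, m = 4.
h = [1.0, 0.5, -0.2]
m = 4.0
rng = random.Random(0)
symbols = [rng.choice([-3.0, -1.0, 1.0, 3.0]) for _ in range(200)]

x = thp_precode(symbols, h, m)                   # bounded transmit samples
y = channel(x, h)                                # equals symbol plus a multiple of 2m
recovered = [mod_2m(v, m) for v in y]            # receiver-side modulo
```

Under these assumptions, the transmit samples stay within the modulo interval regardless of the channel taps, which is why the modulo device bounds the transmit amplitude, and the receiver recovers each symbol by applying the same modulo operation to its input.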