Drone racing is becoming a popular e-sport all over the world, and beating the best human drone race pilots has quickly become a major new challenge for artificial intelligence and robotics. In this paper, we propose a novel sensor fusion method called visual model-predictive localization (VML). Within a small time window, VML approximates the error between the model-predicted position and the visual measurements as a linear function. Once the parameters of this function have been estimated with the RANSAC algorithm, the error model can be used to correct subsequent predictions. In this way, outliers are handled efficiently and the vision delay is compensated as well. Theoretical analysis and simulation results show a clear advantage over Kalman filtering when dealing with the occasional large outliers and vision delays that occur in fast drone racing. Flight tests are performed on a tiny racing quadrotor named "Trashcan," which is equipped with a Jevois smart camera, for a total weight of 72 g. An average speed of 2 m/s is achieved, with a maximum speed of 2.6 m/s. To the best of our knowledge, this flying platform is currently the smallest autonomous racing drone in the world, while still being one of the fastest.

KEYWORDS
autonomous drone race, visual model-predictive localization
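
As an illustration of the idea summarized above, the following is a minimal one-axis sketch, not the implementation used on the drone. It assumes the error is defined as visual measurement minus model prediction and is modeled as a linear function of time, e(t) = a + b t, within the window. The function names (`fit_line`, `ransac_error_model`, `compensate`), the inlier threshold, and the iteration count are hypothetical choices made only for this example.

```python
import random


def fit_line(points):
    """Least-squares fit of e = a + b * t to a list of (t, e) pairs."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_e = sum(e for _, e in points) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in points)
    if var_t == 0.0:
        # All samples share one timestamp: fall back to a constant offset.
        return mean_e, 0.0
    b = sum((t - mean_t) * (e - mean_e) for t, e in points) / var_t
    return mean_e - b * mean_t, b


def ransac_error_model(times, errors, iters=100, threshold=0.1, seed=0):
    """Estimate (a, b) of the linear error model while ignoring outliers.

    errors[i] is assumed to be (visual measurement - model prediction)
    at times[i]. threshold is an assumed inlier bound in metres.
    """
    rng = random.Random(seed)
    samples = list(zip(times, errors))
    best_inliers = []
    for _ in range(iters):
        pair = rng.sample(samples, 2)
        if pair[0][0] == pair[1][0]:
            continue  # a line cannot be defined from two equal timestamps
        a, b = fit_line(pair)
        inliers = [(t, e) for t, e in samples if abs(e - (a + b * t)) < threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if len(best_inliers) < 2:
        best_inliers = samples  # degenerate window: use every sample
    # Refit on the consensus set to reduce the effect of measurement noise.
    return fit_line(best_inliers)


def compensate(prediction, t, a, b):
    """Correct a model-predicted position at time t with the error model."""
    return prediction + a + b * t
```

Because the correction is an explicit function of time, it can be evaluated at the current time even when the most recent visual measurement is delayed, and detections rejected as outliers do not influence the fitted parameters.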