Accurate indoor localization is a crucial enabling technology for many robotics applications, from warehouse management to monitoring tasks. Ultra-wideband (UWB) time difference of arrival (TDOA)-based localization is a promising lightweight, low-cost solution that can scale to a large number of devices-making it especially suited for resource-constrained multi-robot applications. However, the localization accuracy of standard, commercially available UWB radios is often insufficient due to significant measurement bias and outliers. In this letter, we address these issues by proposing a robust UWB TDOA localization framework comprising of (i) learning-based bias correction and (ii) M-estimation-based robust filtering to handle outliers. The key properties of our approach are that (i) the learned biases generalize to different UWB anchor setups and (ii) the approach is computationally efficient enough to run on resource-constrained hardware. We demonstrate our approach on a Crazyflie nano-quadcopter. Experimental results show that the proposed localization framework, relying only on the onboard IMU and UWB, provides an average of 42.08% localization error reduction (in three different anchor setups) compared to the baseline approach without bias compensation. We also show autonomous trajectory tracking on a quadcopter using our UWB TDOA localization approach.