Abstract-In order to develop Vision-aided Inertial Navigation Systems (VINS) on mobile devices, such as cell phones and tablets, one needs to consider two important issues, both stemming from the underlying commercial-grade hardware: (i) the unknown and varying time offset between the camera and IMU clocks; and (ii) the rolling-shutter effect caused by CMOS sensors. Without appropriately modeling these effects and compensating for them online, the navigation accuracy significantly degrades. In this work, we introduce a linear-complexity algorithm for fusing inertial measurements with time-misaligned, rolling-shutter images using a highly efficient and precise linear interpolation model. As a result, our algorithm achieves better accuracy and higher speed than existing methods. Finally, we validate the superiority of the proposed algorithm through simulations and real-time, online experiments on a cell phone.
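To make the idea concrete, a minimal sketch of such a linear interpolation model follows; the symbols ($t_d$, $t_r$, $m$, $\lambda$, and the pose notation) are illustrative assumptions rather than the notation used later in the paper. A feature observed on image row $m$ of an image time-stamped $t_{\mathrm{img}}$ is actually captured at

\[
t \;=\; t_{\mathrm{img}} + t_d + m\,t_r ,
\]

where $t_d$ is the camera-IMU time offset and $t_r$ is the rolling-shutter readout time of a single row. The camera position at $t$ can then be approximated by linearly interpolating between the two estimated poses that bracket it:

\[
\lambda = \frac{t - t_k}{t_{k+1} - t_k}, \qquad
{}^{G}\mathbf{p}_{C}(t) \;\simeq\; (1-\lambda)\,{}^{G}\mathbf{p}_{C_k} + \lambda\,{}^{G}\mathbf{p}_{C_{k+1}} ,
\]

with the orientation interpolated analogously, so that both the unknown time offset and the row-dependent readout delay enter the measurement model at negligible extra cost.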