While numerous deep approaches to the problem of vision-aided localization have recently been proposed, systems operating in the real world will inevitably encounter sensory states unseen even under the most prodigious training regimens. We address the localization problem with online error correction (OEC) modules that are trained to correct a vision-aided localization network's mistakes. We demonstrate the generalizability of the OEC modules and describe our unsupervised deep neural network approach to the fusion of RGB-D imagery with inertial measurements for absolute trajectory estimation. Our network, dubbed the Visual-Inertial-Odometry Learner (VIOLearner), learns to perform visual-inertial odometry (VIO) without inertial measurement unit (IMU) intrinsic parameters or the extrinsic calibration between an IMU and a camera. The network learns to integrate IMU measurements and generate hypothesis trajectories, which are then corrected online according to the Jacobians of scaled image projection errors with respect to spatial grids of pixel coordinates. We evaluate our network against state-of-the-art (SoA) VIO, visual odometry (VO), and visual simultaneous localization and mapping (VSLAM) approaches on the KITTI Odometry dataset as well as a micro aerial vehicle (MAV) dataset that we collected in the AirSim simulation environment. On our evaluation sequences, we demonstrate translational localization performance exceeding that of comparable SoA approaches.
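As a rough illustration of the correction signal described above, the following sketch computes the Jacobian of a photometric image-projection error with respect to a spatial grid of pixel coordinates using PyTorch autograd. This is a minimal sketch, not the authors' implementation: the tensor shapes, the identity warp standing in for a pose-and-depth projection, and the gradient-step "correction" at the end are illustrative assumptions.

```python
# Minimal sketch (assumed shapes and names, not the authors' code) of a Jacobian of
# a reprojection error with respect to a spatial grid of pixel coordinates.
import torch
import torch.nn.functional as F

B, C, H, W = 1, 3, 128, 416               # batch, channels, image height/width (assumed)
source = torch.rand(B, C, H, W)           # source RGB frame
target = torch.rand(B, C, H, W)           # target RGB frame

# Base sampling grid in normalized [-1, 1] coordinates, as expected by grid_sample.
ys, xs = torch.meshgrid(
    torch.linspace(-1.0, 1.0, H),
    torch.linspace(-1.0, 1.0, W),
    indexing="ij",
)
grid = torch.stack((xs, ys), dim=-1).unsqueeze(0)   # shape (B, H, W, 2)

# In VIOLearner this grid would come from projecting depth through a hypothesis
# camera pose integrated from IMU measurements; here it is an identity warp with
# gradient tracking so the error can be differentiated with respect to it.
grid = grid.clone().requires_grad_(True)

# Warp the source image with the hypothesized grid and measure photometric error.
warped = F.grid_sample(source, grid, align_corners=True)
error = (warped - target).abs().mean()

# Jacobian of the scalar projection error with respect to every grid coordinate:
# a (B, H, W, 2) field describing how shifting each sampled pixel changes the error.
(d_error_d_grid,) = torch.autograd.grad(error, grid)

# An online error-correction module could consume this Jacobian field as an input
# when refining the hypothesis trajectory; a trivial stand-in correction is a small
# gradient step on the grid itself.
corrected_grid = grid - 0.1 * d_error_d_grid
```

In the approach summarized above, such Jacobians are used as inputs to learned OEC modules rather than applied as raw gradient steps; the sketch only shows how the correction signal itself can be obtained.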