Tracking the pose of a robot is increasingly important in robotics, for example as the basis for robot navigation. In recent years, monocular visual-inertial odometry (VIO) has been widely used for pose estimation because of its good performance and low cost. However, VIO cannot estimate scale or orientation accurately when a robot moves along straight lines or circular arcs on the ground. To address this problem, we incorporate the wheel encoder, which provides stable translation information, although it is subject to small accumulated errors and momentary slippage errors. By jointly considering the kinematic constraints and the planar motion of the robot, we propose an odometry algorithm that tightly couples a monocular camera, an IMU, and a wheel encoder to achieve robust and accurate pose sensing for mobile robots. The algorithm consists of three main steps. First, we present the wheel encoder preintegration theory and noise propagation formula based on the kinematic model of the mobile robot, which form the basis of accurate estimation in the backend optimization. Second, we adopt a robust initialization method that makes full use of the camera, IMU, and wheel encoder measurements to obtain good initial values of the gyroscope bias and visual scale in practice. Third, we bound the high computational complexity with a marginalization strategy that conditionally eliminates unnecessary measurements in the sliding window. We implement a prototype and conduct extensive experiments, which show that our system achieves robust and accurate pose estimation in terms of scale, orientation, and location compared with state-of-the-art methods.
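To make the first step more concrete, the following is a minimal sketch of what wheel encoder preintegration with first-order noise propagation could look like. It is not the formulation derived in this paper: it assumes a planar differential-drive robot, Euler integration, and independent Gaussian noise on the two wheel speeds. The function name `preintegrate_wheel` and the parameters `wheel_base` and `sigma_v` are hypothetical and chosen only for illustration.

```python
import math


def _matmul(a, b):
    """Multiply two small matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def _transpose(a):
    return [list(row) for row in zip(*a)]


def preintegrate_wheel(samples, wheel_base, sigma_v=0.01):
    """Accumulate a relative planar pose between two keyframes.

    samples: iterable of (v_left, v_right, dt) wheel speeds in m/s and
             time steps in s (assumed differential-drive model).
    Returns ((dx, dy, dtheta), P): the pose increment expressed in the
    frame of the first keyframe and its 3x3 covariance.
    """
    dx = dy = dtheta = 0.0
    P = [[0.0] * 3 for _ in range(3)]
    q = sigma_v ** 2  # variance of each wheel speed measurement

    for v_l, v_r, dt in samples:
        v = 0.5 * (v_l + v_r)          # forward speed
        w = (v_r - v_l) / wheel_base   # yaw rate
        c, s = math.cos(dtheta), math.sin(dtheta)

        # Jacobian of the Euler update w.r.t. the state (dx, dy, dtheta).
        F = [[1.0, 0.0, -v * s * dt],
             [0.0, 1.0,  v * c * dt],
             [0.0, 0.0,  1.0]]
        # Jacobian of the update w.r.t. the noise on (v_left, v_right).
        G = [[0.5 * c * dt, 0.5 * c * dt],
             [0.5 * s * dt, 0.5 * s * dt],
             [-dt / wheel_base, dt / wheel_base]]

        # First-order covariance propagation: P <- F P F^T + q * G G^T.
        FPFt = _matmul(_matmul(F, P), _transpose(F))
        GGt = _matmul(G, _transpose(G))
        P = [[FPFt[i][j] + q * GGt[i][j] for j in range(3)]
             for i in range(3)]

        # Euler integration of the planar kinematics.
        dx += v * c * dt
        dy += v * s * dt
        dtheta += w * dt

    return (dx, dy, dtheta), P


if __name__ == "__main__":
    # Straight-line motion at 1 m/s for 1 s.
    delta, cov = preintegrate_wheel([(1.0, 1.0, 0.1)] * 10, wheel_base=0.2)
    print(delta)
    print(cov)
```

In a tightly coupled system, an increment and covariance of this kind would typically enter the sliding-window optimization as a relative-pose residual weighted by the inverse covariance, alongside the visual and IMU terms.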