This paper analyzes the observability properties of visual-inertial structure from motion as the number of inertial sensors is reduced. Specifically, instead of the standard formulation, in which the inertial sensors are three orthogonal accelerometers and three orthogonal gyroscopes, the sensor system considered here consists only of a monocular camera and one or two accelerometers. This analysis has never been provided before. The main result is that the observability properties of visual-inertial structure from motion do not change when all three gyroscopes and one accelerometer are removed. When a further accelerometer is removed, leaving a single accelerometer, the outcome depends on the extrinsic calibration of the camera: if the camera is not extrinsically calibrated, the system loses part of its observability properties, whereas if it is extrinsically calibrated, the system retains the same observability properties as in the standard case. This contribution clearly shows that the information provided by a monocular camera, three accelerometers and three gyroscopes is redundant. Additionally, it offers a new perspective, within neuroscience, on the process of vestibular and visual integration for depth perception and self-motion perception. Finally, to analyze these systems with a reduced number of inertial sensors, the paper introduces a new method to derive the observability properties of a nonlinear system when some of its inputs are unknown. This method is a further original contribution of the paper to control theory.