Gravity is a physical constraint all terrestrial species have adapted to through evolution. Indeed, gravity effects are taken into account in many forms of interaction with the environment, from the seemingly simple task of maintaining balance to the complex motor skills performed by athletes and dancers. Graviceptors, primarily located in the vestibular otolith organs, feed the central nervous system with information related to the gravitational acceleration vector. This information is integrated with signals from semicircular canals, vision, and proprioception in an ensemble of interconnected brain areas, including the vestibular nuclei, cerebellum, thalamus, insula, retroinsula, parietal operculum, and temporo-parietal junction, in the so-called vestibular network. Classical views consider this stage of multisensory integration as instrumental in resolving conflicting and/or ambiguous information from the incoming sensory signals. However, there is compelling evidence that it also contributes to an internal representation of gravity effects based on prior experience with the environment. This a priori knowledge could be engaged by various types of information, including sensory signals, such as visual ones, that lack a direct correspondence with physical gravity. Indeed, the retinal accelerations elicited by gravitational motion in a visual scene are not invariant, but scale inversely with viewing distance. Moreover, the “visual” gravity vector may not be aligned with physical gravity, as when we watch a scene on a tilted monitor or in weightlessness. This review will discuss experimental evidence from behavioral, neuroimaging (connectomics, fMRI, TMS), and patient studies, supporting the idea that the internal model estimating the effects of gravity on visual objects is constructed by transforming the vestibular estimates of physical gravity, which are computed in the brainstem and cerebellum, into internalized estimates of virtual gravity, stored in the vestibular cortex.
The integration of the internal model of gravity with visual and non-visual signals would take place at multiple levels in the cortex and might involve recurrent connections between early visual areas engaged in the analysis of spatio-temporal features of the visual stimuli and higher visual areas in temporo-parietal-insular regions.
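The dependence of retinal acceleration on viewing distance can be illustrated with a simple worked case. This is a minimal sketch under assumptions that are not part of the review itself: a stationary eye, an object released from rest at eye level and falling in a fronto-parallel plane at viewing distance d, and visual angles small enough that the small-angle approximation holds. The symbols y, d, and \theta are introduced here for illustration only.

\[
y(t) = \tfrac{1}{2}\, g\, t^{2}, \qquad
\theta(t) = \arctan\!\left(\frac{y(t)}{d}\right) \approx \frac{y(t)}{d}, \qquad
\ddot{\theta}(t) \approx \frac{g}{d}
\]

Under these assumptions, with g = 9.81 m/s^2, the same physical fall produces an angular acceleration of about 9.81 rad/s^2 at d = 1 m but only about 0.98 rad/s^2 at d = 10 m. The retinal signal alone therefore does not specify g unless viewing distance is also taken into account.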