Reconstruction of 3D space from visual data has always been a significant challenge in the field of computer vision. A popular approach to address this problem can be found in the form of bottom-up reconstruction techniques which try to model complex 3D scenes through a constellation of volumetric primitives. Such techniques are inspired by the current understanding of the human visual system and are, therefore, strongly related to the way humans process visual information, as suggested by recent visual neuroscience literature. While advances have been made in recent years in the area of 3D reconstruction, the problem remains challenging due to the many possible ways of representing 3D data, the ambiguity of determining the shape and general position in 3D space and the difficulty to train efficient models for the prediction of volumetric primitives. In this paper, we address these challenges and present a novel solution for recovering volumetric primitives from depth images. Specifically, we focus on the recovery of superquadrics, a special type of parametric models able to describe a wide array of 3D shapes using only a few parameters. We present a new learning objective that relies on the superquadric (inside-outside) function and develop two learning strategies for training convolutional neural networks (CNN) capable of predicting superquadric parameters. The first uses explicit supervision and penalizes the difference between the predicted and reference superquadric parameters. The second strategy uses implicit supervision and penalizes differences between the input depth images and depth images rendered from the predicted parameters. CNN predictors for superquadric parameters are trained with both strategies and evaluated on a large dataset of synthetic and real-world depth images. Experimental results show that both strategies compare favourably to the existing state-of-the-art and result in high quality 3D reconstructions of the modelled scenes at a much shorter processing time.