Abstract: In this paper we present a comprehensive object categorization and classification system, which is of great importance for mobile manipulation applications in indoor environments. Specifically, we tackle the problem of recognizing everyday objects that a personal robotic assistant needs in order to fulfill its tasks, using a hierarchical multi-modal 3D-2D processing and classification system. The acquired 3D data is used to estimate geometric labels (plane, cylinder, edge, rim, sphere) at each voxel cell using the Radius-based Surface Descriptor (RSD). We then propose a Global RSD feature (GRSD) to categorize point clusters that are geometrically identical into one of the object categories. Once a geometric category and a 3D position are obtained for each object cluster, we extract the region of interest in the camera image and compute a SURF-based feature vector for it. From this appearance information we obtain the exact object instance and its orientation around the object's upright axis. The resulting system provides a hierarchical categorization of objects into basic classes based on their geometry and identifies objects and their poses based on their appearance, with near real-time performance. We validate our approach on an extensive database of objects that we acquired using real sensing devices, and on both unseen views and unseen objects.
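
The two-stage hierarchy summarized above can be illustrated with a minimal sketch. It assumes a plain nearest-neighbor rule at both stages, which is a stand-in and not necessarily the classifier used in this work. The function names (`nearest`, `classify`), the toy databases, and all feature values are hypothetical; in the actual system the first-stage vector would be a GRSD descriptor and the second-stage vector a SURF-based feature.

```python
from math import dist  # Euclidean distance, Python 3.8+

def nearest(query, labeled):
    """Return the label of the (label, vector) pair closest to query."""
    return min(labeled, key=lambda item: dist(query, item[1]))[0]

def classify(geometry_vec, appearance_vec, geometry_db, appearance_db):
    """Hierarchical classification: geometry picks the category,
    appearance picks the instance within that category only."""
    # Stage 1: coarse geometric category from the global 3D descriptor.
    category = nearest(geometry_vec, geometry_db)
    # Stage 2: instance lookup restricted to the chosen category.
    instance = nearest(appearance_vec, appearance_db[category])
    return category, instance

# Toy geometric descriptors: made-up label fractions in the order
# (plane, cylinder, edge, rim, sphere). Not real GRSD values.
GEOMETRY_DB = [
    ("box", [0.8, 0.0, 0.2, 0.0, 0.0]),
    ("mug", [0.1, 0.6, 0.0, 0.3, 0.0]),
]

# Toy appearance vectors per category. Not real SURF-based features.
APPEARANCE_DB = {
    "box": [("cereal_box", [1.0, 0.0]), ("tea_box", [0.0, 1.0])],
    "mug": [("red_mug", [1.0, 0.0]), ("blue_mug", [0.0, 1.0])],
}

result = classify([0.2, 0.5, 0.0, 0.3, 0.0], [0.1, 0.9],
                  GEOMETRY_DB, APPEARANCE_DB)
print(result)
```

Restricting the second stage to the instances of one geometric category keeps the appearance search small, which is one plausible reason a hierarchical design helps toward near real-time performance.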