In this paper, we propose a novel global object descriptor, so called Viewpoint oriented Color-Shape Histogram (VCSH), which combines 3D object's color and shape features. The descriptor is efficiently used in a real-time textured/textureless object recognition and 6D pose estimation system, while also applied for object localization in a coherent semantic map. We build the object model firstly by registering from multi-view color point clouds, and generate partial-view object color point clouds from different synthetic viewpoints. Thereafter, the extracted color and shape features are correlated as a VCSH to represent the corresponding object patch data. For object recognition, the object can be identified and its initial pose is estimated through matching within our built database. Afterwards the object pose can be optimized by utilizing an iterative closest point strategy. Therefore, all the objects in the observed area are finally recognized and their corresponding accurate poses are retrieved. We validate our approach through a large number of experiments, including daily complex scenarios and indoor semantic mapping. Our method is proven to be efficient by guaranteeing high object recognition rate, accurate pose estimation result as well as exhibiting the capability of dealing with environmental illumination changes.