This paper proposes a novel algorithm for detecting road scene objects (e.g., light poles, traffic signposts, and cars) from 3-D mobile-laser-scanning point cloud data for transportation-related applications. To describe local abstract features of point cloud objects, a contextual visual vocabulary is generated by integrating spatial contextual information of feature regions. Objects of interest are then detected by measuring the similarity of bag-of-contextual-visual-words representations between a query object and the segmented semantic objects. Quantitative evaluations on two selected data sets show that the proposed algorithm achieves an average recall, precision, quality, and F-score of 0.949, 0.970, 0.922, and 0.959, respectively, in detecting light poles, traffic signposts, and cars. Comparative studies demonstrate the superior performance of the proposed algorithm over existing methods.
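To make the bag-of-words matching step concrete, the sketch below illustrates the general idea of comparing word-count histograms of a query object against segmented candidate objects. It is a minimal illustration, not the authors' implementation: the histogram construction, the cosine similarity measure, and the detection threshold are assumptions introduced here for clarity.

```python
# Minimal sketch (assumed, not the paper's implementation): detect objects by
# comparing bag-of-contextual-visual-words histograms with cosine similarity.
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two word-count histograms."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0.0 else 0.0


def detect_objects(query_hist, segment_hists, threshold=0.8):
    """Return indices of segmented objects whose histogram is sufficiently
    similar to the query object's histogram (threshold is hypothetical)."""
    return [i for i, h in enumerate(segment_hists)
            if cosine_similarity(query_hist, h) >= threshold]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    query = rng.integers(0, 10, size=64).astype(float)        # query object histogram
    segments = [query + rng.normal(0.0, 1.0, 64),              # similar candidate
                rng.integers(0, 10, size=64).astype(float)]    # unrelated candidate
    print(detect_objects(query, segments))                     # e.g., [0]
```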