Roads are the most heavily affected component of urban infrastructure, given the ever-increasing number of vehicles needed to provide mobility to residents, supply them with goods, and help sustain urban growth. An important indicator of degrading road infrastructure is the so-called bump features of the road surface (BFRS), which affect transportation safety and driving experience. BFRS can be captured from discretely sampled, non-homogeneous multi-sensor stream data. We propose a BFRS detection method based on spectrum modeling and multi-dimensional features. Multi-sensor stream data are recorded in three different urban areas of Nanjing, China, using a smartphone mounted on a vehicle, with the GPS sampled at 1 Hz and the gyroscope and accelerometer sampled at 100 Hz. The recorded stream data capture a geometric feature that models vehicle movement and the respective driving conditions. Derived features also include acceleration, orientation, and speed information. To capture bump features, we develop a deep-learning approach built on so-called spectrum features. BFRS detection experiments using the smartphone multi-sensor stream data are conducted, and 4, 14, and 17 BFRS are correctly detected in the three areas, with precisions of 100%, 70.00%, and 77.27%, respectively. The proposed method is then compared with three other methods; its F-scores are 1.0000, 0.6363, and 0.7555 in the three areas, the highest values among all compared methods. These results show that the proposed method performs well in different geographic areas.
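The reported counts, precisions, and F-scores can be cross-checked with a little arithmetic. The sketch below assumes the standard definitions (precision = correct detections / all detections, and F-score as the harmonic mean of precision and recall, i.e. F1). The abstract states neither the definition of the F-score nor the recall or ground-truth counts, so the recall and ground-truth values the code produces are inferences, not reported results. The function names `implied_counts` and `f1` are illustrative.

```python
def implied_counts(true_positives, precision, f_score):
    """Back out values implied by the reported metrics.

    Assumes precision = TP / detections and F = 2PR / (P + R).
    The recall and ground-truth count returned here are inferred,
    not taken from the source.
    """
    detections = round(true_positives / precision)
    # Solve F = 2PR / (P + R) for R.
    recall = f_score * precision / (2 * precision - f_score)
    ground_truth = round(true_positives / recall)
    return detections, recall, ground_truth


def f1(true_positives, detections, ground_truth):
    """F1 written directly in terms of counts: 2*TP / (detections + ground truth)."""
    return 2 * true_positives / (detections + ground_truth)


# (correct detections, precision, F-score) as reported for the three areas.
reported = {
    "area 1": (4, 1.0, 1.0000),
    "area 2": (14, 0.70, 0.6363),
    "area 3": (17, 0.7727, 0.7555),
}

for name, (tp, p, f) in reported.items():
    det, rec, gt = implied_counts(tp, p, f)
    print(f"{name}: detections={det}, implied recall={rec:.4f}, "
          f"implied ground truth={gt}, recomputed F1={f1(tp, det, gt):.4f}")
```

Under these assumptions, 14 correct detections at 70.00% precision imply 20 detections in total, and 17 at 77.27% imply 22. The recomputed F1 values agree with the reported F-scores to within rounding, which suggests the figures are internally consistent.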