Nowadays, smartphones have become ubiquitous and one of the main communication resources for human beings. Their widespread adoption was due to the huge technological progress and to the development of multiple useful applications. Their characteristics have also experienced a substantial improvement as they now integrate multiple sensors able to convert the smartphone into a flexible and multi-purpose sensing unit. The combined use of multiple smartphones endowed with several types of sensors gives the possibility to monitor a certain area with fine spatial and temporal granularity, a procedure typically known as crowdsensing. In this paper, we propose using smartphones as environmental noise-sensing units. For this purpose, we focus our study on the sound capture and processing procedure, analyzing the impact of different noise calculation algorithms, as well as in determining their accuracy when compared to a professional noise measurement unit. We analyze different candidate algorithms using different types of smartphones, and we study the most adequate time period and sampling strategy to optimize the data-gathering process. In addition, we perform an experimental study comparing our approach with the results obtained using a professional device. Experimental results show that, if the smartphone application is well tuned, it is possible to measure noise levels with a accuracy degree comparable to professional devices for the entire dynamic range typically supported by microphones embedded in smartphones, i.e., 35–95 dB.