In the US, the challenges of an aging infrastructure network, coupled with the requirement to keep this network continuously functional, highlight the need for innovative, multidisciplinary solutions aimed at the timely detection and remediation of defects and deterioration before serious failures occur. Given citizens' constant and widespread interaction with urban infrastructure systems, and the increasing ubiquity of mobile and personal electronic devices equipped with onboard sensing capabilities (e.g., cameras, accelerometers, GPS), crowd-sourcing offers a promising data-driven solution for urban infrastructure monitoring. This approach introduces the vision of the "citizen engineer": citizens are empowered to become "active human sensors" at the source of defect detection and data collection, extending their role from passive infrastructure users to active infrastructure monitors. In the proposed method, volunteers are motivated and instructed to use mobile devices to capture and send geo-tagged images of defects (e.g., cracks, corrosion, trip/slip hazards, potholes) that they observe in an urban infrastructure environment, together with a short description and a severity rating. The collected photos and descriptions are processed using object recognition techniques. Defects are identified, extracted from the photos, and quantified, whereas additional information about the defects and their perceived severity is obtained from the description field. Beyond the local condition measures, the aggregate data provides responsible authorities with a quantitative analysis of the detected defects as well as a measure of their importance and severity (i.e., heat maps), as perceived by citizens, that can be used to inform maintenance decisions.
The challenges of this framework are discussed in detail; nevertheless, the framework is expected to offer a highly promising departure from traditional top-down infrastructure monitoring approaches.
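
To illustrate the aggregation step described above, the following is a minimal sketch of how geo-tagged defect reports with citizen severity ratings might be binned into a grid to produce heat-map data. It is not the implementation used in this work. The names `DefectReport`, `grid_cell`, and `severity_heat_map`, the 1 to 5 severity scale, and the 0.001-degree cell size are all assumptions made for illustration only.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DefectReport:
    """One citizen report (hypothetical structure, not from the paper)."""
    lat: float          # latitude from the image geo-tag, in degrees
    lon: float          # longitude from the image geo-tag, in degrees
    defect_type: str    # e.g. "crack", "pothole"
    severity: int       # assumed scale: 1 (minor) to 5 (severe)


def grid_cell(lat, lon, cell_deg=0.001):
    """Map a coordinate to the integer index of its grid cell."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def severity_heat_map(reports, cell_deg=0.001):
    """Count reports and sum perceived severity per grid cell."""
    cells = defaultdict(lambda: {"count": 0, "severity_sum": 0})
    for report in reports:
        cell = cells[grid_cell(report.lat, report.lon, cell_deg)]
        cell["count"] += 1
        cell["severity_sum"] += report.severity
    return dict(cells)
```

Each cell's report count and summed severity could then be rendered as heat-map intensity. A fixed latitude/longitude grid is only one possible choice; a kernel density estimate or aggregation by street segment would serve the same purpose.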