Population growth and urbanization demand innovative strategies for sustainable city management. This paper focuses on integrating the Internet of Things (IoT) and image processing technologies for environmental monitoring in sustainable urban development. The IoT is an integral part of the Information and Communication Technology (ICT) infrastructure of smart sustainable cities, and it offers a new model for urban design because it enables environmentally sustainable alternatives. Image processing, a method employed in computer vision, provides reliable approaches for extracting significant data from images. The convergence of these technologies has the capacity to enhance the effectiveness and durability of urban environments. The paper discusses the state of the art in both IoT and image processing, highlighting their individual applications, architectures, and challenges. It also explores the integration of these technologies into a harmonized monitoring system that promotes synergies and complementarities. Several case studies demonstrate the successful adoption of the harmonized approach in urban contexts, focusing on environmental monitoring, energy management, transportation, and social wellbeing. The combination of IoT with image processing also raises concerns regarding privacy, standardization, and scalability. The study outlines directions for future research and suggests that more participatory and multi-strategy approaches could help address existing limitations and move toward a more sustainable urban context. It should therefore be viewed as a roadmap for future research on IoT and image processing-based monitoring for today's and future sustainable urban environments.