, 86 pages

Advances in information technologies, and especially in Micro Electro-Mechanical Systems, have made it possible to produce and deploy tiny, battery-powered nodes that communicate over wireless links. Networks composed of such nodes with sensing capability are called Wireless Sensor Networks. Early deployments aimed to use these nodes only passively, for indoor applications. These early nodes could sense scalar data such as temperature, humidity, pressure and the location of surrounding objects. Recently available sensor nodes, however, offer higher computation capability, more storage space and better power solutions than their predecessors. With these developments, multimedia content has become the core focus in addition to scalar data delivery. A wireless sensor network with multimedia capabilities is called a Wireless Multimedia Sensor Network. Because of their resource-constrained nature, these new-generation networks have always faced a trade-off between accuracy and energy efficiency. In this thesis we introduce a new approach to address this trade-off in Wireless Multimedia Sensor Networks. Although a number of previous studies have examined various special topics in Wireless Multimedia Sensor Networks in detail, to the best of our knowledge none presents a fuzzy multi-modal data fusion system that is lightweight and provides a high accuracy ratio. In particular, a multi-modal data fusion system targeting surveillance applications inevitably has to work within a multi-level hierarchical framework. In this thesis, our primary focus is on achieving accuracy and efficiency by utilizing our framework. Along with the fuzzy fusion framework, a new fuzzy clustering algorithm, namely the Multi-Objective Fuzzy Clustering Algorithm (MOFCA), is introduced and evaluated in detail.