This research was conducted within the EU projects e-SENSE and SENSEI.

Copyright © 2010 by Yang Zhang, Enschede, The Netherlands. All rights reserved. No part of this book may be reproduced or transmitted, in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage or retrieval system, without the prior written permission of the author.

Printed by Wöhrmann Print Service.
Abstract

The advent of wireless sensor networks (WSNs) allows people to observe and reason about the physical environment better, more easily, and faster. Wireless sensor nodes, equipped with sensing, processing, wireless communication, and actuation capabilities, can be densely deployed over a wide geographical area and continuously measure various parameters of the physical world. Compared with traditional environmental sensing technologies, such densely deployed WSNs enable the collection of fine-grained data with high spatial and temporal resolution at lower installation, maintenance, and operation costs.

However, raw sensor observations often have low quality and reliability due to both internal and external factors, including the low quality of cheap sensors, the dynamic nature of network conditions, and the harshness of the deployment environment. Using low-quality sensor data in data analysis and decision making not only degrades the analysis results and the decisions made but also wastes a large amount of valuable and limited network resources such as energy, because many incorrect values are transmitted. Low-quality sensor data also prevents WSNs from fulfilling their promise of reliable real-time situation awareness, as it may generate a large number of false alarms.

Motivated by the need to improve the quality of data analysis and decision making, to use WSN resources more efficiently by preventing unnecessary transmission of erroneous sensor observations, and to increase the effectiveness of the monitoring and situation-awareness capabilities of WSNs, this thesis focuses on the online identification of outliers whenever and wherever they occur. Outliers in WSNs are observations that represent erroneous values (errors) or indicate particular phenomenal changes (events).
Our outlier detection techniques, which are based on distributed in-network data processing, identify sensor observations that do not conform to the normal behavior of sensor data without using pre-defined thresholds or triggering conditions.

Our main research objective is to design and implement effective and efficient outlier detection techniques for WSNs that identify outliers in an online and distributed manner and distinguish between errors and events with high accuracy and a low false alarm rate, while keeping communication, computation, and memory complexity low. The main contributions of this thesis can be summarized as:

3. Statistical-Based outlier detection techniques for WSNs. We take two approaches in designing our outlier detection techniques. One approach originates fr...