Smart surveillance systems become more useful when they grow in reliability and robustness while also offering a higher semantic level of understanding. To achieve this higher level of semantic scene understanding, objects and their actions have to be interpreted in the given context, which requires the extraction of contextual information. This chapter explores several techniques for extracting contextual information, such as spatial, motion, depth and co-occurrence cues, depending on the application. The chapter then presents specific case studies that evaluate the usefulness of context information, based on: (1) region labeling of the surroundings of objects, (2) motion analysis of the water for moving ships, (3) traffic sign recognition for safety event evaluation and (4) the use of depth signals for obstacle detection. The chapter shows that these cases can be solved with improved robustness and semantic understanding. The case studies indicate an improvement of up to 6.8% in reliable, correct object understanding, as well as the novel possibility of labeling scene events as safe or unsafe, depending on the object behavior and the detected surrounding context. Overall, the chapter shows that using contextual information improves automated video surveillance analysis: it not only improves the reliability of moving object detection, but also enables scene understanding that goes far beyond object understanding.