Wide-area context awareness is a crucial enabling technology for next-generation smart buildings and surveillance systems. It is not practical to gather this context awareness by covering an entire building with cameras. However, the significant gaps in coverage left by a sparse camera installation can make it very difficult to infer the missing information. As a solution, we advocate a class of hybrid perceptual systems that build a comprehensive model of activity in a large space, such as a building, by merging contextual information from a dense network of ultra-lightweight sensor nodes with video from a sparse network of cameras. In this paper we explore the task of automatically recovering the relative geometry between a pan-tilt-zoom camera and a network of one-bit motion detectors. We present results both for the recovery of geometry alone and for the recovery of geometry jointly with simple activity models. Because we do not believe a metric calibration is necessary, or even entirely useful, for this task, we formulate and pursue a novel goal that we term functional calibration. Functional calibration blends geometry estimation with the discovery of simple behavioral models. Accordingly, we evaluate results by measuring the system's ability to automatically foveate targets in a large, non-convex space, rather than by measuring, for example, pixel reconstruction error.