Networked robotic cameras are becoming popular in remote observation applications such as natural observation, surveillance, and distance learning. Equipped with a high optical zoom lens and agile pan-tilt mechanisms, a networked robotic camera can cover a large region at various resolutions. The optimal selection of camera control parameters for competing observation requests and the on-demand delivery of video content for various spatiotemporal queries are two challenges in the design of such autonomous systems. For camera control, we introduce memoryless and temporal frame selection models that effectively enable collaborative control of the camera based on competing inputs from in-situ sensors and users. For content delivery, we design a patch-based motion panorama representation and coding/decoding algorithms (codec) to allow efficient storage and computation. We present the system architecture, frame selection models, user interface, and codec algorithms. We have implemented the system and extensively tested our design in real-world applications including natural observation, public surveillance, distance learning, and building construction monitoring. Experimental results show that our frame selection models are robust and effective, and that our on-demand content delivery codec can satisfy a variety of spatiotemporal queries efficiently in terms of computation time and communication bandwidth.