This paper presents an interaction framework for AR-enhanced video conferencing. The goal is to provide a cheap and portable system based on a combination of commodity Kinect cameras and regular computer screens. Because such a setup offers no dedicated input devices, it necessitates the use of contact-free interaction methods. The framework is specifically suited to remotely presenting, sharing and annotating visual data such as images, presentation slides and 3D objects. In the proposed system, all data is represented by freely manipulable 3D objects that are augmented into the camera views. These representations are integrated into a differentiated ownership scheme, which allows for operations such as spatially managed data sharing. The suitability of different interaction paradigms with regard to this usage scenario is examined. Furthermore, basic models of the environment are integrated to enable occlusion and collision management between virtual objects and real obstacles.