The growing amount of 3D scene data available online brings new challenges for scene retrieval, understanding, and synthesis. Traditional shape processing methods have difficulty handling 3D scenes because they ignore contextual information, that is, the spatial relationships between objects or groups of objects, which play a significant role in describing scenes. A context-aware representation is therefore needed. In this paper, we propose a method to build scene hierarchies based on contextual information. Given a 3D scene, we first use the interaction bisector surface to measure the affinity between different objects/elements of the scene and then apply the normalized cut method to build a hierarchical structure for the whole scene. The resulting hierarchy captures not only the relationships between individual objects but also the relationships between object groups, providing much richer information about the scene than a flat structure that only describes the contacts or affinities between individual objects. We evaluate our method on several public databases and show that the resulting structures are more consistent with the ground truth. We also show that our method can be applied to point cloud segmentation, where it outperforms previous methods.
KEYWORDS: hierarchical structure, normalized cut, scene analysis, segmentation
INTRODUCTION

Recently, an increasing number of 3D scene databases have become available online, owing to the popularity of depth sensors and user-friendly modeling software. The ability to reuse such databases is critical for many content-hungry applications such as virtual worlds and games. To understand and retrieve existing 3D scene data, or to synthesize new scenes from existing ones, an essential problem is how to encode the scene information in a structural representation that is compact, organized, and easy to use.

Representing 3D scenes effectively is not a trivial task. 3D scenes can come in different formats, such as depth data, point cloud data,1 and artificial 3D models.2 Whatever the format, the representation should encode not only the geometry of each object but also, more importantly, the contextual information of the scene. The spatial relationships between objects or object groups play a significant role in describing the scene context. They describe how one object is located relative to another, for example, A is on top of B, or A hooks onto B, and how one object group is located relative to other groups, for example, group A is beside group B, or groups A and B are quite close while group C is farther away.
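To make the idea of such a structural representation concrete, the following is a minimal sketch of building a binary hierarchy by recursively bipartitioning an object affinity graph with the normalized cut criterion. It assumes a precomputed pairwise affinity matrix W (e.g., derived from the interaction bisector surface); the function names build_hierarchy and normalized_cut_split are illustrative and not taken from the paper, and the recursive bisection strategy is one standard way of obtaining a hierarchy from normalized cuts, not necessarily the exact procedure used here.

```python
import numpy as np
from scipy.linalg import eigh


def normalized_cut_split(W):
    """Bipartition nodes using the second-smallest generalized eigenvector
    of (D - W) y = lambda * D y (the classic normalized cut relaxation)."""
    d = W.sum(axis=1) + 1e-12          # small epsilon keeps D positive definite
    D = np.diag(d)
    _, vecs = eigh(D - W, D)           # eigenvalues returned in ascending order
    fiedler = vecs[:, 1]               # second-smallest eigenvector
    thresh = np.median(fiedler)
    left = np.where(fiedler < thresh)[0]
    right = np.where(fiedler >= thresh)[0]
    return left, right


def build_hierarchy(W, nodes=None, min_size=1):
    """Recursively split the affinity graph into a binary hierarchy:
    leaves are individual objects, internal nodes are object groups."""
    if nodes is None:
        nodes = np.arange(W.shape[0])
    if len(nodes) <= min_size:
        return {"objects": nodes.tolist()}
    sub = W[np.ix_(nodes, nodes)]
    left, right = normalized_cut_split(sub)
    if len(left) == 0 or len(right) == 0:   # degenerate split; stop here
        return {"objects": nodes.tolist()}
    return {
        "objects": nodes.tolist(),
        "children": [
            build_hierarchy(W, nodes[left], min_size),
            build_hierarchy(W, nodes[right], min_size),
        ],
    }
```

In this sketch, each recursion level groups objects whose mutual affinity is high relative to their affinity with the rest of the scene, so nearby or interacting objects (e.g., a chair and the desk it is tucked under) tend to merge into a group before that group is related to more distant ones.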