Semantic simultaneous localization and mapping (SLAM) is a popular technology that enables indoor mobile robots to effectively perceive and interact with their environment. In this paper, we propose an object-aware semantic SLAM system consisting of a quadric initialization method, an object-level data association method, and a multi-constraint optimization factor graph. To overcome the need for many multi-view observations and for dense object point clouds, an efficient quadric initialization method based on object detection and surfel construction is proposed, which initializes quadrics from only a few frames and under small viewing angles. The robust object-level joint data association method and the tightly coupled multi-constraint factor graph for quadric optimization and joint bundle adjustment enable accurate estimation of the constructed quadrics and camera poses. Extensive experiments on public datasets show that the proposed system achieves competitive accuracy and robustness in object quadric estimation and camera localization compared with state-of-the-art methods.
I. INTRODUCTION

Object representation is a key issue in object-level semantic SLAM: an appropriate representation not only promotes the robustness and accuracy of localization but also enriches the semantic map used for human-robot interaction. Many object representation methods exist, including preset models and general models, of which prior object point clouds, cubes, and quadrics are three common representations used in object-level semantic SLAM [1]-[10]. SLAM++ [1], an early example of object-oriented SLAM, presents an object-oriented 3D SLAM paradigm but requires prior CAD models of the objects. CubeSLAM [2] models objects as cubes and jointly optimizes them with camera poses. Compared with cube-based methods, the quadric representation provides a more compact mathematical form for describing an object's position, orientation, and size.
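As a brief illustration of the quadric representation discussed above, the following sketch summarizes the standard dual-quadric formulation used in quadric-based SLAM; it is background material rather than this paper's specific formulation, and the symbols (Q^*, Z, P_k, C_k^*) are introduced here only for exposition.

% A dual quadric Q^* (a symmetric 4x4 matrix) encodes an ellipsoid's
% pose and size; under a camera projection matrix P_k = K [R_k | t_k]
% it projects to a dual conic C_k^* in image k.
\begin{equation}
  Q^{*} = Z \,\mathrm{diag}\!\left(a^{2},\, b^{2},\, c^{2},\, -1\right) Z^{\top},
  \qquad
  C_{k}^{*} \simeq P_{k}\, Q^{*} P_{k}^{\top},
\end{equation}
% where Z \in SE(3) is the homogeneous pose of the quadric, (a, b, c)
% are the semi-axis lengths of the ellipsoid, and \simeq denotes
% equality up to scale. The projected conic C_k^* can be compared with
% a detector's 2D bounding box to form an object observation error
% inside a joint bundle-adjustment factor graph.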