Tracking the 6D pose of objects in video sequences is important for robot manipulation. This work presents se(3)-TrackNet, a data-driven optimization approach for long-term 6D pose tracking. It aims to identify the optimal relative pose given the current RGB-D observation and a synthetic image conditioned on the previous best estimate and the object's model. The key contributions in this context are a novel neural network architecture, which disentangles the feature encoding to help reduce domain shift, and an effective 3D orientation representation via the Lie algebra. Consequently, even when trained solely on synthetic data, the network works effectively on real images. Comprehensive experiments over multiple benchmarks show that se(3)-TrackNet achieves consistently robust estimates and outperforms alternatives, even though those alternatives were trained with real images. The approach runs in real time at 90.9 Hz. Code, data and supplementary video for this project are available at https://github.com/wenbowen123/iros20-6d-pose-tracking. This work was previously accepted to appear at IEEE IROS 2020 [27]. This is a CVPRW'21 paper on 3D Vision and Robotics.
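To make the idea of a relative pose with a Lie-algebra orientation concrete, the following is a minimal sketch, not the authors' implementation. It assumes the network outputs a rotation increment `w` as an axis-angle vector in so(3) and a translation increment `dt`, and that the previous pose is updated by left-multiplying the rotation and adding the translation. The function names `so3_exp` and `update_pose` and this update convention are illustrative assumptions.

```python
import math


def matmul3(A, B):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def so3_exp(w):
    """Map an axis-angle vector w in so(3) to a rotation matrix (Rodrigues formula)."""
    wx, wy, wz = w
    theta = math.sqrt(wx * wx + wy * wy + wz * wz)
    # Skew-symmetric matrix K with K v = w x v
    K = [[0.0, -wz, wy],
         [wz, 0.0, -wx],
         [-wy, wx, 0.0]]
    if theta < 1e-8:
        # Small-angle limits of sin(theta)/theta and (1 - cos(theta))/theta^2
        a, b = 1.0, 0.5
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / (theta * theta)
    K2 = matmul3(K, K)
    # R = I + a*K + b*K^2
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)] for i in range(3)]


def update_pose(R_prev, t_prev, w, dt):
    """Apply a predicted relative motion (w, dt) to the previous pose (assumed convention)."""
    R_new = matmul3(so3_exp(w), R_prev)
    t_new = [t_prev[i] + dt[i] for i in range(3)]
    return R_new, t_new


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    R, t = update_pose(identity, [0.0, 0.0, 0.5], (0.0, 0.0, math.pi / 2), (0.1, 0.0, 0.0))
    print(R, t)
```

Working in the Lie algebra keeps the rotation increment a plain 3-vector with no constraints, which is one reason such a parameterization is convenient as a regression target.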
Trespassing is the leading cause of rail-related deaths and has been on the rise for the past 10 years. Detecting unsafe trespassing on railroad tracks is critical for understanding and preventing fatalities. Observing these events has become possible with the widespread deployment of video surveillance in the railroad industry, which produces large volumes of data. Monitoring this potential source of information in real time, however, requires immense labor. To address this challenge, this paper describes an artificial intelligence (AI) framework for the automatic detection of trespassing events in real time. The framework was implemented on three railroad video live streams in the United States: one grade crossing and two rights-of-way. The AI algorithm automatically detects trespassing events, distinguishes the type of violator (car, motorcycle, truck, pedestrian, etc.) and sends an alert text message to a designated destination with important information, including a video clip of the trespassing event. To date in this study, the AI has analyzed hours of live footage with no false positives or missed detections. This paper and its subsequent studies aim to provide the railroad industry with state-of-the-art AI tools to harness the untapped potential of existing closed-circuit television infrastructure through real-time analysis of its data feeds. The data generated from these studies will potentially help researchers understand human factors in railroad safety research and give them a real-time edge in tackling the critical challenges of trespassing in the railroad industry.
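The detect, classify and alert flow described above could be sketched roughly as follows. This is a hypothetical illustration only: the abstract does not say how trespassing is decided, so the restricted-zone polygon test, the `Detection` record, the `trespass_alerts` function and the message wording are all assumptions, and the object detector that would produce the detections is left out.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical output of an upstream object detector for one video frame."""
    label: str  # e.g. "pedestrian", "car", "truck"
    x: float    # image x coordinate of the object's ground-contact point
    y: float    # image y coordinate of the object's ground-contact point


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon's vertex list."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def trespass_alerts(detections, zone, camera_id):
    """Build one alert message per detection that falls inside the restricted zone."""
    return [
        f"Trespass alert: {d.label} detected at {camera_id}"
        for d in detections
        if point_in_polygon(d.x, d.y, zone)
    ]


if __name__ == "__main__":
    zone = [(0, 0), (10, 0), (10, 10), (0, 10)]  # assumed restricted area in pixels
    frame = [Detection("pedestrian", 5, 5), Detection("car", 15, 5)]
    for message in trespass_alerts(frame, zone, "cam-1"):
        print(message)
```

In a deployed system the messages would be handed to a text-message service together with a video clip; that step is omitted here because the abstract does not specify it.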