With the emergence of onboard vision processing for areas such as the Internet of Things (IoT), edge computing and autonomous robots, there is increasing demand for computationally efficient convolutional neural network (CNN) models to perform real-time object detection on resource-constrained hardware devices. Tiny-YOLO is generally considered one of the faster object detectors for low-end devices and is the basis for our work. Our experiments on this network show that Tiny-YOLO achieves 0.14 frames per second (FPS) on the Raspberry Pi 3 B, which is too slow for soccer-playing autonomous humanoid robots that must detect goal and ball objects. In this paper we propose an adaptation of the YOLO CNN model, named xYOLO, that achieves object detection at 9.66 FPS on the Raspberry Pi 3 B. This is accomplished by trading off an acceptable amount of accuracy, making the network approximately 70 times faster than Tiny-YOLO. Greater inference speed-ups were also achieved on a desktop CPU and GPU. Additionally, we contribute an annotated Darknet dataset for goal and ball detection.
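As a quick sanity check, the roughly 70x figure follows directly from the two reported frame rates; the snippet below only reproduces that arithmetic, and the variable names are our own:

    tiny_yolo_fps = 0.14   # Tiny-YOLO on Raspberry Pi 3 B (reported)
    xyolo_fps = 9.66       # xYOLO on Raspberry Pi 3 B (reported)
    speedup = xyolo_fps / tiny_yolo_fps
    print(f"{speedup:.0f}x")  # ~69x, i.e. approximately 70 times faster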
In this paper, we present a novel pedestrian indoor positioning system that uses sensor fusion between a foot-mounted inertial measurement unit (IMU) and a vision-based fiducial marker tracking system. The goal is to provide an after-action review for first responders during training exercises. The main contribution of this work comes from the observation that different walking types (e.g., forward walking, sideways walking, backward walking) lead to different levels of position and heading error. Our approach takes this into account when accumulating the error, thereby leading to more accurate estimates. Through experimentation, we show how error accumulation varies across walking types and how this knowledge can be used to decide when and how often to activate the camera tracking system, leading to a better balance between accuracy and power consumption overall. The IMU and vision-based systems are loosely coupled using an extended Kalman filter (EKF) to ensure accurate and unobstructed positioning computation. The motion model of the EKF is derived from the foot-mounted IMU data and the measurement model from the vision system. Existing indoor positioning systems for training exercises require extensive active infrastructure installation, which is not viable for exercises taking place in remote areas. With the use of passive infrastructure (i.e., fiducial markers), the positioning system can accurately track user position over long durations and can be easily integrated into the environment. We evaluated our system on an indoor trajectory of 250 m. Results show that even with discrete corrections, near-meter-level accuracy can be achieved. Our proposed system attains a positioning accuracy of 0.55 m for a forward walk, 1.05 m for a backward walk, and 1.68 m for a sideways walk with a 90% confidence level.
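A minimal sketch of how such a loosely coupled filter might look is given below, assuming a planar state [px, py, heading] in which the foot-mounted IMU supplies per-step displacement for the prediction step and a fiducial-marker detection supplies an absolute position fix for the update step. The class, state layout, gait-dependent noise values, and measurement model here are our own illustrative assumptions, not the paper's implementation or tuning.

    import numpy as np

    class LooseEKF:
        def __init__(self, walk_type_noise):
            self.x = np.zeros(3)                    # [px, py, heading]
            self.P = np.eye(3) * 0.01               # state covariance
            self.walk_type_noise = walk_type_noise  # per-gait process noise (assumed)

        def predict(self, step_len, d_heading, walk_type):
            # Motion model driven by the IMU step estimate; process noise is
            # inflated for gaits that accumulate error faster
            # (e.g., sideways > backward > forward, per the abstract's results).
            th = self.x[2] + d_heading
            self.x = self.x + np.array([step_len * np.cos(th),
                                        step_len * np.sin(th),
                                        d_heading])
            F = np.array([[1.0, 0.0, -step_len * np.sin(th)],
                          [0.0, 1.0,  step_len * np.cos(th)],
                          [0.0, 0.0,  1.0]])
            Q = np.eye(3) * self.walk_type_noise[walk_type]
            self.P = F @ self.P @ F.T + Q

        def update(self, marker_xy, r=0.05):
            # Measurement model: a fiducial-marker observation provides an
            # absolute (px, py) fix; heading is corrected only indirectly.
            H = np.array([[1.0, 0.0, 0.0],
                          [0.0, 1.0, 0.0]])
            R = np.eye(2) * r
            y = marker_xy - H @ self.x
            S = H @ self.P @ H.T + R
            K = self.P @ H.T @ np.linalg.inv(S)
            self.x = self.x + K @ y
            self.P = (np.eye(3) - K @ H) @ self.P

    # Illustrative usage with hypothetical noise levels per walking type.
    ekf = LooseEKF({"forward": 1e-4, "backward": 4e-4, "sideways": 1e-3})
    ekf.predict(step_len=0.7, d_heading=0.02, walk_type="forward")
    ekf.update(np.array([0.71, 0.01]))

In this sketch, the walking-type-dependent process noise is the hook for the paper's key idea: gaits that are known to drift more raise the predicted covariance faster, which in turn can be used to trigger the marker-based correction more often for those gaits.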