In this paper, we present a gamesourcing method for automatically and rapidly acquiring labeled images of human poses to obtain ground-truth data as input for human pose estimation from 2D images. Typically, these datasets are constructed manually through a tedious process of clicking on joint locations in images. By using a low-cost RGBD sensor, we capture synchronized, registered images, depth maps, and skeletons of users playing a movement-based game, and automatically filter the data to keep a subset of unique poses. Using a recently developed, learning-based human pose estimation method, we demonstrate that data collected in this manner is as suitable for training as existing, manually constructed datasets.
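The abstract does not specify the filtering criterion, but one plausible sketch of "keep a subset of unique poses" is a greedy de-duplication pass over captured skeletons: retain a pose only if its mean per-joint distance to every already-retained pose exceeds a threshold. The function name, threshold value, and distance metric below are assumptions for illustration, not the paper's actual method.

```python
import numpy as np

def filter_unique_poses(skeletons, min_dist=0.15):
    """Greedily retain skeletons that differ from all previously kept
    poses by at least `min_dist` mean per-joint Euclidean distance
    (in sensor units, e.g. meters).

    skeletons: iterable of (J, 3) arrays of joint positions.
    Returns the list of retained (J, 3) arrays.
    """
    kept = []
    for s in skeletons:
        s = np.asarray(s, dtype=float)
        # Keep only if far enough from every pose already retained.
        if all(np.mean(np.linalg.norm(s - k, axis=1)) >= min_dist
               for k in kept):
            kept.append(s)
    return kept
```

A greedy threshold filter like this runs in a single pass and requires no clustering step, which matches the "automatic and rapid" framing, though the actual paper may use a different uniqueness measure.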
While multi-camera methods for object tracking tend to outperform their single-camera counterparts, the data aggregation schemes can introduce new challenges, such as resource management and algorithm complexity. We present a framework for dynamically choosing the best subset of available cameras for tracking in real time, which reduces aggregate tracking error and resource consumption and can be applied to a variety of existing base tracking models. We demonstrate our framework on challenging video sequences of players in a basketball game. Our method is able to successfully track targets entering and exiting camera views and through occlusions, and to overcome instances of single-view tracking drift.
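The abstract leaves the subset-selection mechanism unspecified. One minimal sketch of "dynamically choosing the best subset of available cameras" is a budgeted selection: score each camera by its current tracking confidence, and pick the subset maximizing total confidence under a resource budget. The scoring scheme, cost model, and exhaustive search below are assumptions for illustration (exhaustive search is only practical for small camera counts; the paper's framework may use a different objective or solver).

```python
from itertools import combinations

def select_cameras(scores, costs, budget):
    """Choose the subset of cameras that maximizes total tracking
    confidence subject to a resource budget.

    scores: {cam_id: confidence score from the base tracker}
    costs:  {cam_id: resource cost of processing that view}
    budget: maximum total cost allowed this frame.
    Returns the selected camera IDs as a set.
    """
    best, best_score = (), float("-inf")
    cams = list(scores)
    # Exhaustive search over non-empty subsets; fine for a handful
    # of cameras, as in a typical sports-arena deployment.
    for r in range(1, len(cams) + 1):
        for subset in combinations(cams, r):
            if sum(costs[c] for c in subset) <= budget:
                total = sum(scores[c] for c in subset)
                if total > best_score:
                    best, best_score = subset, total
    return set(best)
```

Re-running the selection each frame lets the system drop views whose tracker is drifting or occluded and pick up cameras a target has just entered, consistent with the behavior the abstract describes.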