In this work, we address the problem of activity recognition in Cricket telecast videos. We present a supervised approach for recognizing Cricket stroke categories using two variants of the Bag of Visual Words (BoV) model applied to dense optical flow based motion features (grid-based flattened vectors and orientation histograms), 3D ResNet extracted features, and 2D ResNet extracted spatial features. These globally extracted features, in spite of the noise due to camera motion, give good results on the Cricket strokes dataset of 562 trimmed stroke videos. We independently labeled the strokes based on the direction of stroke play and the direction of camera motion, into five and three categories, respectively, and provide experimental analysis of Hard Assignment (HA) and Soft Assignment (SA) based BoV methods. Our contribution lies in showing the effectiveness of the unordered BoV representation for direction based action recognition. The experimental analysis presented in our work provides insight into the BoV HA and SA models applied to global frame-level feature descriptors. Moreover, our experiments suggest that orientation histograms are simple yet effective for direction based recognition tasks. The best accuracy achieved on the Cricket strokes test partition was 85.85% with orientation histograms and 82.08% with grid features, for the 3 category and 5 category labels, respectively.
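To make the distinction between the HA and SA variants concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes frame-level descriptors are magnitude-weighted orientation histograms of a dense optical flow field, that a codebook of visual words is already available (for example from k-means), and that soft assignment uses a softmax over negative squared distances. The function names, the bin count, and the `beta` parameter are all hypothetical choices made for illustration.

```python
import numpy as np


def orientation_histogram(flow, bins=8):
    """Magnitude-weighted histogram of flow directions for one frame.

    `flow` has shape (H, W, 2) holding the x and y flow components.
    The bin count is an assumed value, not taken from the paper.
    """
    fx, fy = flow[..., 0].ravel(), flow[..., 1].ravel()
    angles = np.arctan2(fy, fx)   # direction in [-pi, pi]
    mags = np.hypot(fx, fy)       # flow magnitude used as the weight
    hist, _ = np.histogram(angles, bins=bins,
                           range=(-np.pi, np.pi), weights=mags)
    total = hist.sum()
    return hist / total if total > 0 else hist


def _squared_distances(features, codebook):
    """Squared Euclidean distance from every feature to every word."""
    diff = features[:, None, :] - codebook[None, :, :]
    return np.sum(diff ** 2, axis=2)


def bov_hard(features, codebook):
    """Hard Assignment: each feature votes only for its nearest word."""
    nearest = np.argmin(_squared_distances(features, codebook), axis=1)
    counts = np.bincount(nearest, minlength=codebook.shape[0])
    return counts / counts.sum()


def bov_soft(features, codebook, beta=1.0):
    """Soft Assignment: each feature spreads one unit of weight over
    all words, here via a softmax on negative squared distances
    (an assumed weighting scheme)."""
    logits = -beta * _squared_distances(features, codebook)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    hist = weights.sum(axis=0)
    return hist / hist.sum()
```

Under these assumptions, a stroke video with T frames would yield a (T, bins) feature matrix, and either function maps it to a single fixed-length histogram over the codebook that discards frame order. That is the unordered representation a classifier would then consume. As `beta` grows, the soft histogram approaches the hard one.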