The golf swing is a complex movement requiring considerable full-body coordination to execute proficiently. As such, it is the subject of frequent scrutiny and extensive biomechanical analyses. In this paper, we introduce the notion of golf swing sequencing for detecting key events in the golf swing and facilitating golf swing analysis. To enable consistent evaluation of golf swing sequencing performance, we also introduce the benchmark database GolfDB, consisting of 1400 high-quality golf swing videos, each labeled with event frames, bounding box, player name and sex, club type, and view type. Furthermore, to act as a reference baseline for evaluating golf swing sequencing performance on GolfDB, we propose a lightweight deep neural network called SwingNet, which possesses a hybrid deep convolutional and recurrent neural network architecture. SwingNet correctly detects eight golf swing events at an average rate of 76.1%, and six out of eight events at a rate of 91.8%. In line with the proposed baseline SwingNet, we advocate the use of computationally efficient models in future research to promote in-the-field analysis via deployment on readily-available mobile devices.
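A detection rate like the ones quoted above can be read as the fraction of events whose predicted frame lands close enough to the labeled frame. The sketch below illustrates that idea only; the function name `event_detection_rate` and the `tolerance` parameter are assumptions for illustration, since the abstract does not state how close a prediction must be to count as correct.

```python
def event_detection_rate(predicted, labeled, tolerance=1):
    """Fraction of events whose predicted frame is within `tolerance`
    frames of the labeled frame.

    `predicted` and `labeled` are lists with one entry per swing video;
    each entry is a list of frame indices, one per event.
    `tolerance` is an assumed parameter, not a value from the abstract.
    """
    if len(predicted) != len(labeled):
        raise ValueError("predicted and labeled must cover the same videos")

    hits = 0
    total = 0
    for pred_frames, true_frames in zip(predicted, labeled):
        if len(pred_frames) != len(true_frames):
            raise ValueError("each video needs one prediction per labeled event")
        for p, t in zip(pred_frames, true_frames):
            if abs(p - t) <= tolerance:
                hits += 1
            total += 1

    return hits / total if total else 0.0


# One video with four events; the second prediction is two frames off.
rate = event_detection_rate([[10, 20, 30, 40]], [[10, 22, 31, 40]], tolerance=1)
```

Here three of the four predictions fall within one frame of their labels, so `rate` is 0.75.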
The ImageNet dataset ushered in a flood of academic and industry interest in deep learning for computer vision applications. Despite its significant impact, there has not been a comprehensive investigation into the demographic attributes of images contained within the dataset. Such a study could lead to new insights on inherent biases within ImageNet, which is particularly important given that it is frequently used to pretrain models for a wide variety of computer vision tasks. In this work, we introduce a model-driven framework for the automatic annotation of apparent age and gender attributes in large-scale image datasets. Using this framework, we conduct the first demographic audit of the 2012 ImageNet Large Scale Visual Recognition Challenge (ILSVRC) subset of ImageNet and the 'person' hierarchical category of ImageNet. We find that 41.62% of faces in ILSVRC appear as female, 1.71% appear as individuals above the age of 60, and males aged 15 to 29 account for the largest subgroup with 27.11%. We note that the presented model-driven framework is not fair for all intersectional groups, so annotations are subject to bias. We present this work as the starting point for future development of unbiased annotation models and for the study of downstream effects of imbalances in the demographics of ImageNet.
Prostate cancer is the most commonly diagnosed cancer in North American men; however, prognosis is relatively good given early diagnosis. This motivates the need for fast and reliable prostate cancer sensing. Diffusion weighted imaging (DWI) has gained traction in recent years as a fast non-invasive approach to cancer sensing. The most commonly used DWI sensing modality currently is apparent diffusion coefficient (ADC) imaging, with the recently introduced computed high-b value diffusion weighted imaging (CHB-DWI) showing considerable promise for cancer sensing. In this study, we investigate the efficacy of ADC and CHB-DWI sensing modalities when applied to zone-level prostate cancer sensing by introducing several radiomics driven zone-level prostate cancer sensing strategies geared around hand-engineered radiomic sequences from DWI sensing (which we term Zone-X sensing strategies). Furthermore, we also propose Zone-DR, a discovery radiomics approach based on zone-level deep radiomic sequencer discovery that discovers radiomic sequences directly for radiomics driven sensing. Experimental results using 12,466 pathology-verified zones obtained through the different DWI sensing modalities of 101 patients showed that: (i) the introduced Zone-X and Zone-DR radiomics driven sensing strategies significantly outperformed the traditional clinical heuristics driven strategy in terms of AUC, (ii) the introduced Zone-DR and Zone-SVM strategies achieved the highest sensitivity and specificity, respectively, for ADC amongst the tested radiomics driven strategies, (iii) the introduced Zone-DR and Zone-LR strategies achieved the highest sensitivities for CHB-DWI amongst the tested radiomics driven strategies, and (iv) the introduced Zone-DR, Zone-LR, and Zone-SVM strategies achieved the highest specificities for CHB-DWI amongst the tested radiomics driven strategies.
Furthermore, the results showed that the trade-off between sensitivity and specificity can be optimized for the particular clinical scenario in which radiomics driven DWI prostate cancer sensing strategies are to be employed, such as clinical screening versus surgical planning. Finally, we investigate the critical regions within sensing data that led to a given radiomic sequence generated by a Zone-DR sequencer, using an explainability method to gain a deeper understanding of the biomarkers important for zone-level cancer sensing.
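The trade-off between sensitivity and specificity mentioned above is commonly controlled by the decision threshold applied to a classifier's per-zone score. The sketch below shows that general mechanism under stated assumptions: the function name, the toy scores, and the two thresholds are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Return (sensitivity, specificity) when zones scoring at or above
    `threshold` are called cancerous.

    `scores` are per-zone classifier outputs, `labels` are 1 for cancerous
    zones and 0 for healthy zones. All values here are illustrative.
    """
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if label == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Toy data: three cancerous zones followed by three healthy zones.
scores = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]

# A low threshold favors sensitivity (a screening-style setting).
screening = sensitivity_specificity(scores, labels, threshold=0.3)

# A high threshold favors specificity (a planning-style setting).
planning = sensitivity_specificity(scores, labels, threshold=0.65)
```

On this toy data the low threshold catches every cancerous zone at the cost of one false alarm, while the high threshold produces no false alarms but misses one cancerous zone.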
Modern face recognition systems leverage datasets containing images of hundreds of thousands of specific individuals' faces to train deep convolutional neural networks to learn an embedding space that maps an arbitrary individual's face to a vector representation of their identity. The performance of a face recognition system in face verification (1:1) and face identification (1:N) tasks is directly related to the ability of an embedding space to discriminate between identities. Recently, there has been significant public scrutiny into the source and privacy implications of large-scale face recognition training datasets such as MS-Celeb-1M and MegaFace, as many people are uncomfortable with their face being used to train dual-use technologies that can enable mass surveillance. However, the impact of an individual's inclusion in training data on a derived system's ability to recognize them has not previously been studied. In this work, we audit ArcFace, a state-of-the-art, open source face recognition system, in a large-scale face identification experiment with more than one million distractor images. We find a Rank-1 face identification accuracy of 79.71% for individuals present in the model's training data and an accuracy of 75.73% for those not present. This modest difference in accuracy demonstrates that face recognition systems using deep learning work better for individuals they are trained on, which has serious privacy implications when one considers that all major open source face recognition training datasets do not obtain informed consent from individuals during their collection.
CCS Concepts: • Security and privacy → Social aspects of security and privacy; • Computing methodologies → Visual content-based indexing and retrieval; • Computer systems organization → Neural networks; • Social and professional topics → Surveillance.
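Rank-1 identification accuracy in an embedding space can be illustrated as follows: each probe embedding is compared with every gallery embedding (including distractors), and the probe counts as correct when its single closest match carries the same identity. This is a minimal sketch and not the audit's actual protocol; the use of cosine similarity, the function names, and the two-dimensional toy embeddings are all assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank1_accuracy(probes, gallery):
    """Fraction of probes whose most similar gallery entry shares their identity.

    `probes` and `gallery` are lists of (identity, embedding) pairs.
    The gallery may contain distractors whose identity matches no probe.
    """
    if not probes or not gallery:
        raise ValueError("probes and gallery must both be non-empty")

    correct = 0
    for probe_id, probe_vec in probes:
        best_id, _ = max(
            gallery, key=lambda item: cosine_similarity(probe_vec, item[1])
        )
        if best_id == probe_id:
            correct += 1
    return correct / len(probes)


# Toy gallery with one distractor lying between the two enrolled identities.
gallery = [
    ("alice", [1.0, 0.0]),
    ("bob", [0.0, 1.0]),
    ("distractor", [0.7, 0.7]),
]
probes = [
    ("alice", [0.9, 0.1]),  # closest to alice's gallery embedding
    ("bob", [0.6, 0.8]),    # closest to the distractor, so counted as a miss
]
accuracy = rank1_accuracy(probes, gallery)
```

In this toy case one of the two probes is pulled toward the distractor, giving an accuracy of 0.5, which shows how adding distractors makes identification harder.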
Puck location in ice hockey is essential for hockey analysts for determining the location of play and analyzing game events. However, because of the difficulty involved in obtaining accurate annotations due to the extremely low visibility and commonly occurring occlusions of the puck, the problem is very challenging. The problem becomes even more challenging in broadcast videos with changing camera angles. We introduce a novel methodology for determining puck location from approximate puck location annotations in broadcast video. Our method uniquely leverages the existing puck location information that is publicly available in existing hockey event data and uses the corresponding one-second broadcast video clips as input to the network. The rationale behind using video as input instead of static images is that with video, the temporal information can be utilized to handle puck occlusions. The network outputs a heatmap representing the probability of the puck location using a 3D CNN based architecture. The network is able to regress the puck location from broadcast hockey video clips with varying camera angles. Experimental results demonstrate the capability of the method, achieving 47.07% AUC on the test dataset. The network is also able to estimate the puck location in defensive/offensive zones with an accuracy of greater than 80%.
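One simple way to turn a probability heatmap into a single location estimate is to take the cell with the highest value. The sketch below assumes that reading; the abstract does not say how a point estimate is derived from the heatmap, and the function name and toy grid are hypothetical.

```python
def heatmap_peak(heatmap):
    """Return (row, col) of the largest value in a 2D heatmap.

    `heatmap` is a list of rows, each a list of probabilities.
    Ties are resolved in favor of the first cell encountered.
    """
    best = None  # (value, row, col) of the best cell seen so far
    for r, row in enumerate(heatmap):
        for c, value in enumerate(row):
            if best is None or value > best[0]:
                best = (value, r, c)

    if best is None:
        raise ValueError("heatmap is empty")
    return best[1], best[2]


# Toy 2x3 heatmap with its peak in the second row, second column.
toy_heatmap = [
    [0.0, 0.1, 0.0],
    [0.1, 0.6, 0.2],
]
peak = heatmap_peak(toy_heatmap)
```

For the toy grid, `peak` is `(1, 1)`. Mapping such a cell index to rink coordinates or to a defensive/offensive zone would need calibration details that the abstract does not provide.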