The proposed method of combining deep residual learning, curriculum learning, and transfer learning translates to high nodule classification accuracy. This reveals a promising new direction for effective pulmonary nodule CAD systems that mirrors the success of recent deep learning advances in other image-based application domains.
Automatically determining three-dimensional human pose from monocular RGB image data is a challenging problem. The two-dimensional nature of the input results in intrinsic ambiguities which make inferring depth particularly difficult. Recently, researchers have demonstrated that the flexible statistical modelling capabilities of deep neural networks are sufficient to make such inferences with reasonable accuracy. However, many of these models use coordinate output techniques which are memory-intensive, not differentiable, and/or do not spatially generalise well. We propose improvements to 3D coordinate prediction which avoid the aforementioned undesirable traits by predicting 2D marginal heatmaps under an augmented soft-argmax scheme. Our resulting model, MargiPose, produces visually coherent heatmaps whilst maintaining differentiability. We are also able to achieve state-of-the-art accuracy on publicly available 3D human pose estimation data.
The need to store vast amounts of trajectory data becomes more problematic as GPS-based tracking devices become increasingly prevalent. There are two commonly used approaches for compressing trajectory data. The first is the line generalisation approach which aims to fit the trajectory using a series of line segments. The second is to store the initial data point and then store the remaining data points as a sequence of successive deltas. The line generalisation approach is only effective when given a large error margin, and existing delta compression algorithms do not permit lossy compression. Consequently there is an uncovered gap in which users expect a good compression ratio by giving away only a small error margin. This paper fills this gap by extending the delta compression approach to allow users to trade a small maximum error margin for large improvements to the compression ratio. In addition, alternative techniques are extensively studied for the following two key components of any delta-based approach: predicting the value of the next data point and encoding leading zeros. We propose a new trajectory compression system called Trajic based on the results of the study. Experimental results show that Trajic produces 1.5 times smaller compressed data than a straight-forward delta compression algorithm for lossless compression and produces 9.4 times smaller compressed data than a state-of-the-art line generalisation algorithm when using a small maximum error bound of 1 meter.
Due to recent advances in technology, the recording and analysis of video data has become an increasingly common component of athlete training programmes. Today it is incredibly easy and affordable to set up a fixed camera and record athletes in a wide range of sports, such as diving, gymnastics, golf, tennis, etc. However, the manual analysis of the obtained footage is a time-consuming task which involves isolating actions of interest and categorizing them using domain-specific knowledge. In order to automate this kind of task, three challenging sub-problems are often encountered: 1) temporally cropping events/actions of interest from continuous video; 2) tracking the object of interest; and 3) classifying the events/actions of interest.Most previous work has focused on solving just one of the above sub-problems in isolation. In contrast, this paper provides a complete solution to the overall action monitoring task in the context of a challenging real-world exemplar. Specifically, we address the problem of diving classification. This is a challenging problem since the person (diver) of interest typically occupies fewer than 1% of the pixels in each frame. The model is required to learn the temporal boundaries of a dive, even though other divers and bystanders may be in view. Finally, the model must be sensitive to subtle changes in body pose over a large number of frames to determine the classification code. We provide effective solutions to each of the sub-problems which combine to provide a highly functional solution to the task as a whole. The techniques proposed can be easily generalized to video footage recorded from other sports.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.