Most positive and unlabeled data is subject to selection biases. The labeled examples can, for example, be selected from the positive set because they are easier to obtain or more obviously positive. This paper investigates how learning can be enabled in this setting. We propose and theoretically analyze an empirical-risk-based method for incorporating the labeling mechanism. Additionally, we investigate under which assumptions learning is possible when the labeling mechanism is not fully understood and propose a practical method to enable this. Our empirical analysis supports the theoretical results and shows that taking into account the possibility of a selection bias, even when the labeling mechanism is unknown, improves the trained classifiers.
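The idea of folding the labeling mechanism into an empirical risk can be illustrated with a small sketch. It assumes that the labeling mechanism is summarised by a known per-example propensity score (the probability that a positive example gets labeled) and uses an inverse-propensity weighting of the log loss. The function names, the choice of log loss, and the weighting scheme are illustrative assumptions, not necessarily the estimator analyzed in the paper.

```python
import numpy as np


def log_loss(prob, target):
    """Per-example log loss for predicted positive-class probabilities."""
    prob = np.clip(np.asarray(prob, dtype=float), 1e-12, 1 - 1e-12)
    return -np.log(prob) if target == 1 else -np.log(1 - prob)


def propensity_weighted_risk(pred_prob, labeled, propensity):
    """Empirical risk for positive-unlabeled data (illustrative sketch).

    pred_prob  : predicted probability of the positive class per example
    labeled    : True where the example is a labeled positive
    propensity : assumed probability that the example is labeled, given
                 that it is positive (only used for labeled examples)
    """
    labeled = np.asarray(labeled, dtype=bool)
    inv = 1.0 / np.asarray(propensity, dtype=float)
    loss_pos = log_loss(pred_prob, 1)
    loss_neg = log_loss(pred_prob, 0)
    # A labeled example stands in for 1/e positives; the (1 - 1/e) term
    # corrects for the positives that remain hidden among the unlabeled
    # examples. Unlabeled examples are treated as negatives.
    per_example = np.where(labeled,
                           inv * loss_pos + (1.0 - inv) * loss_neg,
                           loss_neg)
    return float(per_example.mean())
```

With every propensity equal to 1, the labeled set contains all positives and the expression reduces to the ordinary log loss with unlabeled examples as negatives. Smaller propensities put more weight on the labeled examples that were unlikely to be selected.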
Ground reaction forces are often used by sport scientists and clinicians to analyze the mechanical risk factors of running-related injuries or athletic performance during a running analysis. An interesting ground reaction force-derived variable to track is the maximal vertical instantaneous loading rate (VILR). This impact characteristic is traditionally derived from a fixed force platform, but wearable inertial sensors might nowadays approximate its magnitude while running outside the lab. The time-discrete axial peak tibial acceleration (APTA) has been proposed as a good surrogate that can be measured using wearable accelerometers in the field. This paper explores the hypothesis that applying machine learning to time-continuous data (generated from bilateral tri-axial shin-mounted accelerometers) would result in a more accurate estimation of the VILR. Therefore, the purpose of this study was to evaluate the performance of accelerometer-based predictions of the VILR with various machine learning models trained on data of 93 rearfoot runners. A subject-dependent gradient boosted regression trees (XGB) model provided the most accurate estimates (mean absolute error: 5.39 ± 2.04 BW·s⁻¹, mean absolute percentage error: 6.08%). A similar subject-independent model had a mean absolute error of 12.41 ± 7.90 BW·s⁻¹ (mean absolute percentage error: 11.09%). All of our models had a stronger correlation with the VILR than the APTA (p < 0.01), indicating that combining multiple 3D acceleration features in a learning setting predicts the lab-based impact loading more accurately than the APTA alone.
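The two error measures reported for the VILR estimates can be sketched as follows. The sample values are hypothetical and chosen only to make the arithmetic easy to follow; how the study aggregates errors across steps and subjects is not stated here, so the plain averaging below is an assumption.

```python
import numpy as np


def mean_absolute_error(reference, estimate):
    """Average absolute difference, in the unit of the inputs (e.g. BW/s)."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return float(np.mean(np.abs(estimate - reference)))


def mean_absolute_percentage_error(reference, estimate):
    """Average absolute difference relative to the reference, in percent."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return float(100.0 * np.mean(np.abs(estimate - reference) / np.abs(reference)))


# Hypothetical force-platform VILR values and accelerometer-based estimates.
vilr_reference = [80.0, 100.0, 120.0]
vilr_estimate = [84.0, 95.0, 126.0]

mae = mean_absolute_error(vilr_reference, vilr_estimate)
mape = mean_absolute_percentage_error(vilr_reference, vilr_estimate)
print(f"MAE: {mae:.2f} BW/s, MAPE: {mape:.2f}%")
```

In this toy example the absolute errors are 4, 5 and 6 BW/s, each equal to 5% of its reference value, so both measures come out at about 5.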
Background: Gait event detection of the initial contact and toe off is essential for running gait analysis, allowing the derivation of parameters such as stance time. Heuristic-based methods exist to estimate these key gait events from tibial accelerometry. However, these methods are tailored to very specific acceleration profiles, which may cause complications when dealing with larger data sets and inherent biological variability. Research question: Can a structured machine learning approach achieve a more accurate prediction of running gait event timings from tibial accelerometry, compared to the previously utilised heuristic approaches? Methods: Force-based event detection acted as the criterion measure in order to assess the accuracy, repeatability and sensitivity of the predicted gait events. 3D tibial acceleration and ground reaction force data from 93 rearfoot runners were captured. A heuristic method and two structured machine learning methods were employed to derive initial contact, toe off and stance time from tibial acceleration signals. Results: Both a structured perceptron model (median absolute error of stance time estimation: 10.00 ± 8.73 ms) and a structured recurrent neural network model (median absolute error of stance time estimation: 6.50 ± 5.74 ms) significantly outperformed the existing heuristic approach (median absolute error of stance time estimation: 11.25 ± 9.52 ms). Thus, results indicate that a structured recurrent neural network machine learning model offers the most accurate and consistent estimation of the gait events and the derived stance time during level overground running. Significance: The machine learning methods seem less affected by intra- and inter-subject variation within the data, allowing for accurate and efficient automated data output during rearfoot overground running. This also offers possibilities for real-time monitoring and biofeedback during prolonged measurements, even outside the laboratory.
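How stance time follows from the two gait events, and how an estimate can be scored against the force-based criterion, can be sketched as below. The event times are hypothetical, and the helper names are illustrative; the study's actual detection models are not reproduced here.

```python
import numpy as np


def stance_times_ms(initial_contact_ms, toe_off_ms):
    """Stance time per step: toe off minus the preceding initial contact."""
    ic = np.asarray(initial_contact_ms, dtype=float)
    to = np.asarray(toe_off_ms, dtype=float)
    if ic.shape != to.shape:
        raise ValueError("need one toe off per initial contact")
    if np.any(to <= ic):
        raise ValueError("toe off must come after initial contact")
    return to - ic


def median_absolute_error(reference, estimate):
    """Median of the absolute per-step differences."""
    diff = np.asarray(estimate, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.median(np.abs(diff)))


# Hypothetical event times (ms) for three steps.
force_stance = stance_times_ms([0, 700, 1400], [250, 955, 1660])   # 250, 255, 260
accel_stance = stance_times_ms([5, 698, 1404], [249, 960, 1650])   # 244, 262, 246

# Absolute errors are 6, 7 and 14 ms, so the median is 7 ms.
print(median_absolute_error(force_stance, accel_stance))
```

The median is less sensitive than the mean to an occasional badly detected step, such as the 14 ms error in this example.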
We present QLAD, an anomaly detection system that is designed for the high query volume and the specific nature of DNS traffic at a TLD resolver. QLAD integrates three components that implement the complete anomaly detection process, ranging from the ingestion of raw traffic data to the visualisation of detected anomalies. With an initial analysis of query logs from the Belgian ccTLD registry, we showed that QLAD can archive data compactly, has a low computational cost and can detect a wide range of anomalies. We found several anomalies that are of interest to the registry operator, such as domain enumerations and DoS attacks. Other anomalies were caused by benign applications with unique traffic patterns. A user interface helps to distinguish these, but correctly identifying all anomalies remains a difficult and tedious task.
In-game win probabilities for the game between Manchester City and Queens Park Rangers (QPR) on the final day of the 2011/12 Premier League season.