Walter Gerych scite author profile

Positive-Unlabeled (PU) learning methods train a classifier to distinguish between the positive and negative classes given only positive and unlabeled data. While traditional PU methods require the labeled positive samples to be an unbiased sample of the positive distribution, in practice the labeled sample is often a biased draw from the true distribution. Prior work shows that if we know the likelihood that each positive instance will be selected for labeling, referred to as the propensity score, then the biased sample can be used for PU learning. Unfortunately, no prior work has been proposed an inference strategy for which the propensity score is identifiable. In this work, we propose two sets of assumptions under which the propensity score can be uniquely determined: one in which no assumption is made on the functional form of the propensity score (requiring assumptions on the data distribution), and the second which loosens the data assumptions while assuming a functional form for the propensity score. We then propose inference strategies for each case. Our empirical study shows that our approach significantly outperforms the state-of-the-art propensity estimation methods on a rich variety of benchmark datasets.

show abstract

DELFI: Mislabelled Human Context Detection Using Multi-Feature Similarity Linking

Mansoor

Gerych

Buquicchio

et al. 2019

View full text Add to dashboard Cite

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Walter Gerych

DeepContext: Parameterized Compatibility-Based Attention CNN for Human Context Recognition

Smartphone Health Biomarkers: Positive Unlabeled Learning of In-the-Wild Contexts

Classifying Depression in Imbalanced Datasets Using an Autoencoder- Based Anomaly Detection Approach

Recovering the Propensity Score from Biased Positive Unlabeled Data

DELFI: Mislabelled Human Context Detection Using Multi-Feature Similarity Linking

Contact Info

Product

Resources

About