Hayden S. Helm scite author profile

Towards a theory of out-of-distribution learning

What is learning? 20th-century formalizations of learning theory, which precipitated revolutions in artificial intelligence, focus primarily on in-distribution learning, that is, learning under the assumption that the training data are sampled from the same distribution as the evaluation data. This assumption renders these theories inadequate for 21st-century real-world data problems, in which the evaluation distribution typically differs from the training distribution (referred to as out-of-distribution, or OOD, learning). We therefore make a small change to existing formal definitions of learnability by relaxing that assumption. We then introduce learning efficiency (LE) to quantify how much a learner is able to leverage data for a given problem, regardless of whether it is an in- or out-of-distribution problem. We then define and prove the relationship between generalized notions of learnability, and show that this framework is sufficiently general to characterize transfer, multitask, meta, continual, and lifelong learning. We hope this unification helps bridge the gap between empirical practice and theoretical guidance in real-world problems. Finally, because biological learning continues to outperform machine learning algorithms on certain OOD challenges, we discuss the limitations of this framework vis-à-vis its ability to formalize biological learning, suggesting multiple avenues for future research.

Mental State Classification Using Multi-Graph Features

Chen, Helm, Lytvynets, et al. 2022

Front. Hum. Neurosci.

We consider the problem of extracting features from passive, multi-channel electroencephalogram (EEG) devices for downstream inference tasks related to high-level mental states such as stress and cognitive load. Our proposed feature extraction method applies recently developed spectral-based multi-graph tools to the time series of graphs implied by the statistical dependence structure (e.g., correlation) amongst the multiple sensors. We study the features in the context of two datasets, each consisting of at least 30 participants and recorded using multi-channel EEG systems. We compare the classification performance of a classifier trained on the proposed features to that of a classifier trained on the traditional band power-based features in three settings and find that the two feature sets offer complementary predictive information. We conclude by showing that the importance of particular channels and pairs of channels for classification when using the proposed features is neuroscientifically valid.
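As a rough illustration of the "time series of graphs" mentioned in this abstract, the sketch below builds one correlation graph per time window from a multi-channel signal. It covers only the graph-construction step, not the spectral multi-graph embedding the paper applies afterwards. The function name `correlation_graphs`, the window and step sizes, and the synthetic data are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def correlation_graphs(signal, window, step):
    """Split a (channels, samples) array into windows and return one
    absolute-correlation adjacency matrix per window."""
    n_channels, n_samples = signal.shape
    graphs = []
    for start in range(0, n_samples - window + 1, step):
        segment = signal[:, start:start + window]
        adjacency = np.abs(np.corrcoef(segment))  # channel-by-channel correlation
        np.fill_diagonal(adjacency, 0.0)          # drop self-loops
        graphs.append(adjacency)
    return np.stack(graphs)

# Synthetic stand-in for an 8-channel recording (not real EEG data).
rng = np.random.default_rng(0)
eeg = rng.standard_normal((8, 1000))

# Four non-overlapping windows of 250 samples each.
graphs = correlation_graphs(eeg, window=250, step=250)
print(graphs.shape)
```

Each slice `graphs[t]` is a symmetric weighted adjacency matrix over channels; a multi-graph method would then take the whole stack as input.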

Representation Ensembling for Synergistic Lifelong Learning with Quasilinear Complexity

Vogelstein, Dey, Helm, et al. 2020

Preprint

Approximately optimal domain adaptation with Fisher's Linear Discriminant Analysis

Helm, Yang, Silva, et al. 2023

Preprint

We propose a class of models based on Fisher's Linear Discriminant (FLD) in the context of domain adaptation. The class is the convex combination of two hypotheses: i) an average hypothesis representing previously seen source tasks and ii) a hypothesis trained on a new target task. For a particular generative setting we derive the optimal convex combination of the two models under 0-1 loss, propose a computable approximation, and study the effect of various parameter settings on the relative risks between the optimal hypothesis, hypothesis i), and hypothesis ii). We demonstrate the effectiveness of the proposed optimal classifier in the context of EEG- and ECG-based classification settings and argue that the optimal classifier can be computed without access to direct information from any of the individual source tasks. We conclude by discussing further applications, limitations, and possible future directions.
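The convex combination described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's estimator: the weight `alpha` is a hand-picked hypothetical value rather than the optimal weight the authors derive, the helper names (`fld_direction`, `combine`, `sample_task`) are invented here, and the synthetic tasks place the two classes symmetrically about the origin so that a decision threshold of zero is reasonable.

```python
import numpy as np

def fld_direction(X, y):
    """Unit-norm Fisher's Linear Discriminant direction for labels in {0, 1}."""
    X0, X1 = X[y == 0], X[y == 1]
    # Pooled within-class covariance.
    scatter = (len(X0) - 1) * np.cov(X0, rowvar=False) \
        + (len(X1) - 1) * np.cov(X1, rowvar=False)
    pooled = scatter / (len(X) - 2)
    w = np.linalg.solve(pooled, X1.mean(axis=0) - X0.mean(axis=0))
    return w / np.linalg.norm(w)

def combine(w_source_avg, w_target, alpha):
    """Convex combination of two directions; alpha must lie in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * w_source_avg + (1.0 - alpha) * w_target

def sample_task(rng, mean, n):
    """Synthetic binary task: class 1 centered at +mean, class 0 at -mean."""
    y = np.arange(n) % 2  # balanced labels
    X = (2 * y - 1)[:, None] * mean + rng.standard_normal((n, mean.size))
    return X, y

rng = np.random.default_rng(0)

# Hypothesis i): average direction over previously seen source tasks.
source_means = [np.array([2.0, 0.5]), np.array([2.0, -0.5]), np.array([1.5, 0.0])]
w_sources = [fld_direction(*sample_task(rng, m, 200)) for m in source_means]
w_source_avg = np.mean(w_sources, axis=0)

# Hypothesis ii): direction estimated from a small target sample.
X_target, y_target = sample_task(rng, np.array([2.0, 0.2]), 20)
w_target = fld_direction(X_target, y_target)

alpha = 0.5  # hypothetical weight, not the optimal value from the paper
w_combined = combine(w_source_avg, w_target, alpha)

# Classes are symmetric about the origin here, so threshold at zero.
predictions = (X_target @ w_combined > 0).astype(int)
```

Only the averaged source direction enters the combination, which is in the spirit of the abstract's remark that the classifier does not need direct information from any individual source task.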

Leveraging semantically similar queries for ranking via combining representations

Helm, Abdin, Pedigo, et al. 2021

Preprint

In modern ranking problems, different and disparate representations of the items to be ranked are often available. It is sensible, then, to try to combine these representations to improve ranking. Indeed, learning to rank via combining representations is both principled and practical for learning a ranking function for a particular query. In extremely data-scarce settings, however, the small amount of labeled data available for a particular query can lead to a highly variable and ineffective ranking function. One way to mitigate the effect of the small amount of data is to leverage information from semantically similar queries. As we demonstrate in simulation settings and real data examples, when semantically similar queries are available it is possible to gainfully use them when ranking with respect to a particular query. We describe and explore this phenomenon in the context of the bias-variance trade-off and apply it to the data-scarce settings of a Bing navigational graph and the Drosophila larva connectome.


Towards a theory of out-of-distribution learning
