Elan Rosenfeld scite author profile

Elan Rosenfeld

4 Publications

22 Citation Statements Received

109 Citation Statements Given

Publications

Iterative Feature Matching: Toward Provable Domain Generalization with Logarithmic Environments

Rosenfeld, Sellke, Ma, et al. 2021. Preprint.

Domain generalization aims at performing well on unseen test environments with data from a limited number of training environments. Despite a proliferation of proposed algorithms for this task, assessing their performance, both theoretically and empirically, is still very challenging. Moreover, recent approaches such as Invariant Risk Minimization (IRM) require a prohibitively large number of training environments (linear in the dimension of the spurious feature space d_s) even on simple data models like the one proposed by Rosenfeld et al. [2021b]. Under a variant of this model, we show that both ERM and IRM cannot generalize with o(d_s) environments. We then present a new algorithm based on performing iterative feature matching that is guaranteed with high probability to yield a predictor that generalizes after seeing only O(log d_s) environments.

An Online Learning Approach to Interpolation and Extrapolation in Domain Generalization

Rosenfeld, Ravikumar, Risteski. 2021. Preprint.

A popular assumption for out-of-distribution generalization is that the training data comprises subdatasets, each drawn from a distinct distribution; the goal is then to "interpolate" these distributions and "extrapolate" beyond them. This objective is broadly known as domain generalization. A common belief is that ERM can interpolate but not extrapolate and that the latter is considerably more difficult, but these claims are vague and lack formal justification. In this work, we recast generalization over sub-groups as an online game between a player minimizing risk and an adversary presenting new test distributions. Under an existing notion of inter- and extrapolation based on reweighting of sub-group likelihoods, we rigorously demonstrate that extrapolation is computationally much harder than interpolation, though their statistical complexity is not significantly different. Furthermore, we show that ERM (or a noisy variant) is provably minimax-optimal for both tasks. Our framework presents a new avenue for the formal analysis of domain generalization algorithms which may be of independent interest.
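As a rough illustration of what "reweighting of sub-group likelihoods" can mean, the sketch below computes the worst-case reweighted risk over a set of sub-groups. The abstract does not spell out the definition, so this assumes a common convention: interpolation uses convex weights (non-negative, summing to 1), and extrapolation additionally allows each weight to go as low as `-alpha`. The function names and the `alpha` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np

def reweighted_risk(group_risks, weights):
    """Risk under a reweighting of sub-groups: sum_e w_e * R_e (weights sum to 1)."""
    r = np.asarray(group_risks, dtype=float)
    w = np.asarray(weights, dtype=float)
    if not np.isclose(w.sum(), 1.0):
        raise ValueError("weights must sum to 1")
    return float(w @ r)

def worst_case_risk(group_risks, alpha=0.0):
    """Largest reweighted risk over weights with sum 1 and every w_e >= -alpha.

    alpha = 0 is the interpolation case (convex combinations), where the
    worst case is simply the largest group risk. alpha > 0 is an assumed
    form of bounded extrapolation. The objective is linear in the weights,
    so the maximum sits at a vertex: -alpha on every group except the
    worst one, which receives the remaining mass.
    """
    r = np.asarray(group_risks, dtype=float)
    n_groups = r.size
    return float((1.0 + n_groups * alpha) * r.max() - alpha * r.sum())

# Hypothetical per-group risks for a fixed predictor.
risks = [0.1, 0.3, 0.2]
interp = worst_case_risk(risks, alpha=0.0)
extrap = worst_case_risk(risks, alpha=0.5)
```

Under these assumptions the extrapolation worst case is never smaller than the interpolation one, since the convex weights are a subset of the allowed weights.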

Domain-Adjusted Regression or: ERM May Already Learn Features Sufficient for Out-of-Distribution Generalization

Rosenfeld, Ravikumar, Risteski. 2022. Preprint.

A common explanation for the failure of deep networks to generalize out-of-distribution is that they fail to recover the "correct" features. Focusing on the domain generalization setting, we challenge this notion with a simple experiment which suggests that ERM already learns sufficient features and that the current bottleneck is not feature learning, but robust regression. We therefore argue that devising simpler methods for learning predictors on existing features is a promising direction for future research. Towards this end, we introduce Domain-Adjusted Regression (DARE), a convex objective for learning a linear predictor that is provably robust under a new model of distribution shift. Rather than learning one function, DARE performs a domain-specific adjustment to unify the domains in a canonical latent space and learns to predict in this space. Under a natural model, we prove that the DARE solution is the minimax-optimal predictor for a constrained set of test distributions. Further, we provide the first finite-environment convergence guarantee to the minimax risk, improving over existing results which show a "threshold effect". Evaluated on finetuned features, we find that DARE compares favorably to prior methods, consistently achieving equal or better performance.

Analyzing and Improving the Optimization Landscape of Noise-Contrastive Estimation

Liu, Rosenfeld, Ravikumar, et al. 2021. Preprint.

Noise-contrastive estimation (NCE) is a statistically consistent method for learning unnormalized probabilistic models. It has been empirically observed that the choice of the noise distribution is crucial for NCE's performance. However, such observations have never been made formal or quantitative. In fact, it is not even clear whether the difficulties arising from a poorly chosen noise distribution are statistical or algorithmic in nature. In this work, we formally pinpoint reasons for NCE's poor performance when an inappropriate noise distribution is used. Namely, we prove these challenges arise due to an ill-behaved (more precisely, flat) loss landscape. To address this, we introduce a variant of NCE called eNCE, which uses an exponential loss and for which normalized gradient descent provably addresses the landscape issues when the target and noise distributions are in a given exponential family.
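For readers unfamiliar with NCE, the following is a minimal sketch of the standard (logistic) NCE objective, not the eNCE variant from the paper. It assumes a toy setup chosen only for illustration: a 1-D unit-variance Gaussian model with mean `mu` and a learned log-normalizer `c`, and standard normal noise. All function names are hypothetical.

```python
import numpy as np

def log_noise(x):
    """Log-density of the noise distribution, here a standard normal N(0, 1)."""
    return -0.5 * x**2 - 0.5 * np.log(2 * np.pi)

def log_model(x, mu, c):
    """Unnormalized model log-density; c stands in for the log-normalizer."""
    return -0.5 * (x - mu) ** 2 + c

def log_sigmoid(z):
    """Numerically stable log(sigmoid(z))."""
    return -np.logaddexp(0.0, -z)

def nce_loss(mu, c, data, noise):
    """Logistic NCE loss: classify data samples against noise samples.

    The classifier logit is G(x) = log p_model(x) - log p_noise(x).
    """
    g_data = log_model(data, mu, c) - log_noise(data)
    g_noise = log_model(noise, mu, c) - log_noise(noise)
    # log(1 - sigmoid(G)) equals log(sigmoid(-G)).
    return float(-np.mean(log_sigmoid(g_data)) - np.mean(log_sigmoid(-g_noise)))

rng = np.random.default_rng(0)
data = rng.normal(1.0, 1.0, size=5000)   # "true" distribution N(1, 1)
noise = rng.normal(0.0, 1.0, size=5000)  # noise distribution N(0, 1)
c_true = -0.5 * np.log(2 * np.pi)        # exact log-normalizer for unit variance

loss_at_truth = nce_loss(1.0, c_true, data, noise)
loss_if_model_equals_noise = nce_loss(0.0, c_true, data, noise)
```

When the model coincides with the noise distribution the logit is zero everywhere and the loss equals 2 log 2; a model closer to the data distribution gives a lower loss. The flat-landscape phenomenon described in the abstract concerns noise distributions far from the target, which this toy setup does not attempt to reproduce.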
