2019
DOI: 10.1103/PhysRevB.100.134108
Robust cluster expansion of multicomponent systems using structured sparsity

Abstract: Identifying a suitable set of descriptors for modeling physical systems often utilizes either deep physical insights or statistical methods such as compressed sensing. In statistical learning, a class of methods known as structured sparsity regularization seeks to combine both physics- and statistics-based approaches. Used in bioinformatics to identify genes for the diagnosis of diseases, group lasso is a well-known example. Here in physics, we present group lasso as an efficient method for obtaining robust clus…
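As a rough illustration of the group-lasso idea the abstract describes (shrinking whole groups of coefficients to zero at once, rather than individual coefficients as in the ordinary lasso), here is a minimal sketch using proximal gradient descent. The data, group structure, and parameter values are invented for illustration and are not taken from the paper.

```python
import numpy as np

def group_soft_threshold(v, t):
    # Block soft-thresholding: shrink the whole group toward zero,
    # setting it exactly to zero when its Euclidean norm falls below t.
    norm = np.linalg.norm(v)
    if norm <= t:
        return np.zeros_like(v)
    return (1 - t / norm) * v

def group_lasso(X, y, groups, lam=0.1, step=None, n_iter=500):
    """Proximal gradient descent for 0.5*||y - Xw||^2 + lam * sum_g ||w_g||_2."""
    n, p = X.shape
    if step is None:
        # 1 / Lipschitz constant of the gradient of the smooth part
        step = 1.0 / np.linalg.norm(X, 2) ** 2
    w = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)
        z = w - step * grad
        for g in groups:
            w[g] = group_soft_threshold(z[g], step * lam)
    return w

# Hypothetical example: 3 groups of 3 descriptors; only group 0 is active.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 9))
w_true = np.zeros(9)
w_true[:3] = [1.0, -2.0, 0.5]
y = X @ w_true + 0.01 * rng.normal(size=50)
groups = [np.arange(0, 3), np.arange(3, 6), np.arange(6, 9)]
w = group_lasso(X, y, groups, lam=5.0)
```

With a sufficiently large penalty, the two inactive groups are driven exactly to zero as whole blocks, which is the structured-sparsity behavior that distinguishes group lasso from coefficient-wise ℓ1 penalties.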

Cited by 32 publications (32 citation statements) · References 70 publications (104 reference statements)
“…Hence, we do not have to worry about dimensionality reduction, i.e., the discrimination between relevant and irrelevant many-body terms, a topic that is beyond the scope of this study but has been addressed in the dedicated literature. 37,[66][67][68] Of course, these contributions are not known beforehand by the framework, which has access only to a limited number of predictions from the reference model Hamiltonian (with possible Gaussian errors added), just as if the predictions were computed from first-principles calculations.…”
Section: Setup for Proof of Principles (mentioning; confidence: 99%)
“…2 Although conceptually well established and widely used in practice, the CE method, especially regarding the optimal strategy for building CE models, has continued to attract considerable interest in recent decades. [6][7][8][15][16][17][18][19][20][21][22][23][24][25][26][27][28] The main challenge is to build an accurate (unbiased) and robust (low-variance) CE model based on a limited number of training data obtained from…”
(mentioning; confidence: 99%)
“…Columns of Π S are the values of the included correlation functions evaluated for each of the training structures, and are referred to as correlation vectors. The several regression models that have been proposed in the literature [4,[14][15][16] can be separated broadly into those dealing with over-determined linear systems and those dealing with under-determined linear systems.…”
(mentioning; confidence: 99%)
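The over- versus under-determined distinction drawn in the quote above can be sketched numerically: with more training structures than correlation functions, least squares has a unique best-fit solution, while with fewer structures than functions there are infinitely many exact fits and a solver must pick one (NumPy's `lstsq` returns the minimum-norm solution). The matrix sizes and coefficient values below are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical correlation matrix: rows = training structures,
# columns = correlation-function values (names are illustrative only).
w_true = np.array([2.0, -1.0, 0.5, 0.0])

# Over-determined: more structures (20 rows) than correlation functions (4 columns).
Pi_over = rng.normal(size=(20, 4))
E_over = Pi_over @ w_true
w_over, *_ = np.linalg.lstsq(Pi_over, E_over, rcond=None)

# Under-determined: fewer structures (3) than functions (4); lstsq returns
# the minimum-norm solution out of infinitely many exact fits.
Pi_under = rng.normal(size=(3, 4))
E_under = Pi_under @ w_true
w_under, *_ = np.linalg.lstsq(Pi_under, E_under, rcond=None)
```

In the under-determined case the recovered coefficients reproduce the training energies exactly but need not equal the generating coefficients, which is why such fits are usually paired with sparsity-inducing regularization.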
“…The motivation for using an over-determined system comes from several studies showing decreasing cross-validation errors with an increasing number of training structures. [16,17] However, this approach can be prone to over-fitting and lower prediction accuracy, [18] which has usually been addressed with ℓ2 regularization (ridge regression), [14] or by using some form of feature selection, such as step-wise fitting, [17] evolutionary algorithms, [19,20] or some other form of ℓp regularization. [16,21,22] In contrast, for under-determined systems, more correlation functions are included in the model than structures used in fitting.…”
(mentioning; confidence: 99%)
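The ℓ2 (ridge) remedy for over-fitting mentioned in the quote above has a simple closed form, sketched here on invented data: the penalty `alpha` shrinks the coefficient vector toward zero, trading a little bias for lower variance.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical over-determined fit: 30 training structures, 6 correlation
# functions, with small noise on the target energies.
X = rng.normal(size=(30, 6))
w_true = np.array([1.5, -0.5, 0.0, 2.0, 0.0, -1.0])
y = X @ w_true + 0.05 * rng.normal(size=30)

def ridge(X, y, alpha):
    # Closed-form solution of min_w ||y - Xw||^2 + alpha * ||w||^2:
    # w = (X^T X + alpha I)^{-1} X^T y
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(p), X.T @ y)

w_ols = ridge(X, y, 0.0)   # alpha = 0 recovers plain least squares
w_reg = ridge(X, y, 10.0)  # l2-shrunken coefficients
```

In practice `alpha` would be chosen by cross-validation; the ℓp and feature-selection alternatives the quote cites instead drive some coefficients exactly to zero rather than merely shrinking them.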