2018
DOI: 10.1103/physreve.98.032407
Correlation-compressed direct-coupling analysis

Abstract: Learning Ising or Potts models from data has become an important topic in statistical physics and computational biology, with applications to the prediction of structural contacts in proteins and other areas of biological data analysis. The corresponding inference problems are challenging since the normalization constant (partition function) of the Ising/Potts distributions cannot be computed efficiently on large instances. Different ways to address this issue have hence given rise to a substantial methodological…

Cited by 14 publications (25 citation statements); references 57 publications.
“…In contrast to our method, however, Cui et al did not attempt to disentangle the direct interactions from the indirect ones. In a recent hybrid approach, Gao et al proposed filtering the data based on pairwise correlations and then fitting a joint model over the remaining sites in (49). The obvious advantage of a strict pairwise method, such as SpydrPick, is that its computational simplicity allows it to scale up to data sets beyond what is currently achievable by DCA-based methods.…”
Section: Discussion
confidence: 99%
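The filter-then-fit idea described in this excerpt (retain only sites showing strong pairwise correlations, then fit a joint model over them) can be sketched in a few lines. This is an illustrative sketch, not the authors' actual pipeline; the function name `correlation_filter` and the `threshold` value are assumptions.

```python
import numpy as np

def correlation_filter(X, threshold=0.1):
    """Keep only the sites (columns of X) whose maximum absolute pairwise
    correlation with any other site exceeds `threshold`; a joint model
    would then be fitted over the retained sites only."""
    C = np.corrcoef(X, rowvar=False)   # site-by-site correlation matrix
    np.fill_diagonal(C, 0.0)           # ignore trivial self-correlations
    return np.where(np.max(np.abs(C), axis=1) > threshold)[0]

# Toy data: sites 0 and 1 are perfectly coupled, site 2 is pure noise.
rng = np.random.default_rng(0)
a = rng.integers(0, 2, size=500)
X = np.column_stack([a, a, rng.integers(0, 2, size=500)])
print(correlation_filter(X, threshold=0.5))   # → [0 1]
```

The compression comes from the fact that the expensive joint fit only runs on the surviving sites, which is what makes the hybrid approach scale.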
“…set to zero) a subset of the parameters, observing that even though large spurious correlations may arise from sites that are not topologically connected, weak correlations are typically associated with small coupling strengths. As explained in [22], one can first determine a starting topology and then run the learning procedure on it. To this end, adabmDCA provides two distinct strategies.…”
Section: Pruning the Parameters
confidence: 99%
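The topology-selection step above (fix a set of active couplings first, then learn only those) can be sketched with a mutual-information screen over site pairs. The estimator, the `threshold`, and the function names below are illustrative assumptions, not adabmDCA's actual implementation.

```python
import numpy as np

def mutual_information(x, y):
    """Empirical mutual information (in nats) between two discrete columns."""
    joint = np.zeros((x.max() + 1, y.max() + 1))
    np.add.at(joint, (x, y), 1.0)          # joint histogram of (x, y)
    joint /= len(x)
    px = joint.sum(axis=1, keepdims=True)  # marginal of x
    py = joint.sum(axis=0, keepdims=True)  # marginal of y
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

def starting_topology(X, threshold):
    """Boolean mask over site pairs: True means the coupling J_ij is kept
    active during learning; all other couplings stay fixed at zero."""
    n_sites = X.shape[1]
    mask = np.zeros((n_sites, n_sites), dtype=bool)
    for i in range(n_sites):
        for j in range(i + 1, n_sites):
            if mutual_information(X[:, i], X[:, j]) > threshold:
                mask[i, j] = mask[j, i] = True
    return mask

# Sites 0 and 1 are copies (high MI); site 2 is independent noise.
rng = np.random.default_rng(1)
a = rng.integers(0, 2, size=1000)
X = np.column_stack([a, a, rng.integers(0, 2, size=1000)])
print(starting_topology(X, threshold=0.1))
```

Only the (0, 1) coupling survives the screen; the learning procedure would then update just the couplings where the mask is True.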
“…A practical way to control this behavior is to impose a sparsity prior over the coupling matrices: the two most commonly used priors are the so-called ℓ1 and ℓ2 regularizations, which force the inferred couplings to minimize the associated ℓ1 and ℓ2 norms multiplied by a tunable parameter that sets the regularization strength. A complementary approach consists of setting a priori a probable topology suggested by the mutual information between all pairs of residues [22]. Here, as discussed in the following section, we will follow an information-based decimation protocol originally proposed in [11].…”
Section: An Introduction to Boltzmann Learning of Biological Models
confidence: 99%
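The two regularizers mentioned above are simple penalty terms added to the negative (pseudo)likelihood. A generic sketch, with `lam` standing in for the tunable regularization strength (the function name is an assumption):

```python
import numpy as np

def regularization_penalty(J, lam, kind="l2"):
    """Sparsity penalty added to the negative (pseudo)likelihood.

    kind="l1": lam * sum |J_ij|   -- drives many couplings exactly to zero
    kind="l2": lam * sum J_ij**2  -- shrinks all couplings smoothly
    """
    if kind == "l1":
        return lam * np.sum(np.abs(J))
    return lam * np.sum(J ** 2)

J = np.array([[0.0, 1.5], [-0.5, 0.0]])
print(regularization_penalty(J, lam=0.01, kind="l1"))  # 0.01 * 2.0 = 0.02
print(regularization_penalty(J, lam=0.01, kind="l2"))  # 0.01 * 2.5 = 0.025
```

The ℓ1 penalty is what produces exact zeros (a learned sparse topology), whereas ℓ2 only shrinks couplings toward zero without eliminating them.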
“…Moreover, the reduction of computational time obtained thanks to color compression and sparsity is necessary when dealing with larger numbers of sites, e.g. whole-genome inference [27,52]. Finally, let us emphasize that the color compression/decompression procedure introduced here is not restricted to pairwise graphical models.…”
Section: Discussion
confidence: 99%
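The color-compression idea referenced here can be sketched for a single Potts column: states ("colors") observed with empirical frequency below a cutoff are merged into one shared state, shrinking the alphabet the model has to parametrize. The function name and the `fmin` cutoff are illustrative assumptions.

```python
import numpy as np

def compress_colors(column, fmin=0.05):
    """Merge rare Potts states ("colors") in one alignment column.

    States with empirical frequency >= fmin keep their own label; all
    rarer states are lumped into a single shared label. Returns the
    relabelled column and the compressed alphabet size."""
    vals, counts = np.unique(column, return_counts=True)
    freqs = counts / len(column)
    keep = {v: i for i, (v, f) in enumerate(zip(vals, freqs)) if f >= fmin}
    rare_label = len(keep)  # shared bucket for every rare state
    out = np.array([keep.get(v, rare_label) for v in column])
    q = len(keep) + (1 if len(keep) < len(vals) else 0)
    return out, q

# Four original states; only state 0 is frequent, the rest get merged.
col = np.array([0] * 90 + [1] * 4 + [2] * 3 + [3] * 3)
out, q = compress_colors(col, fmin=0.05)
print(q)  # compressed alphabet: 2 states instead of 4
```

Since the number of coupling parameters per site pair scales with the square of the alphabet size, merging rare colors cuts both memory and fitting time, which is what makes whole-genome-scale inference feasible.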