We investigate the learning performance of the pseudolikelihood maximization method for inverse Ising problems. In a teacher–student scenario in which the teacher's couplings are sparse and the student does not know the graphical structure, the learning curve and order parameters are assessed in the typical case using the replica and cavity methods of statistical mechanics. Our formulation is also applicable to a certain class of cost functions that have locality; the standard likelihood does not belong to that class. The derived analytical formulas indicate that perfect inference of the presence or absence of the teacher's couplings is possible in the thermodynamic limit, in which the number of spins N tends to infinity while the dataset size M is kept proportional to N, as long as α = M/N > 2. The formulas also show that the estimates of the couplings that truly exist in the teacher tend to be overestimated in absolute value, which reveals an estimation bias. These results are considered to be exact in the thermodynamic limit on locally tree-like networks, such as random regular or Erdős–Rényi graphs. Numerical simulation results fully support the theoretical predictions. Additional biases in the estimators on loopy graphs are also discussed.
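As a rough illustration of what pseudolikelihood maximization means for a single spin, the following minimal sketch fits the couplings attached to one spin by plain gradient ascent on the conditional log-likelihood of that spin given all the others. The function name `pseudolikelihood_couplings`, the unregularized objective, the absence of external fields, and the optimizer settings (`steps`, `lr`) are assumptions made for the example and are not taken from the paper.

```python
import numpy as np

def pseudolikelihood_couplings(samples, spin, steps=500, lr=0.1):
    """Estimate the couplings J[spin, j] from +/-1 samples (hypothetical helper).

    Maximizes (1/M) * sum_mu [ s_i h - log(2 cosh h) ] with h = sum_j J_j s_j,
    which is the conditional log-likelihood of spin i given the other spins.
    """
    S = np.asarray(samples, dtype=float)   # shape (M, N), entries +/-1
    M, N = S.shape
    target = S[:, spin]
    others = S.copy()
    others[:, spin] = 0.0                  # no self-coupling
    J = np.zeros(N)
    for _ in range(steps):
        field = others @ J                 # local field on the chosen spin
        grad = others.T @ (target - np.tanh(field)) / M
        J += lr * grad                     # gradient ascent step
    return J
```

Running the function once per spin gives a full coupling matrix row by row; deciding which entries count as "present" (for example by thresholding) is a separate step that this sketch does not cover.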
We investigate leave-one-out cross validation (CV) as a means of determining the weight of the penalty term in the least absolute shrinkage and selection operator (LASSO). First, on the basis of the message passing algorithm and a perturbative argument that assumes a sufficiently large number of observations, we provide simple formulas for approximately assessing two types of CV errors, which significantly reduce the necessary computational cost. These formulas also provide a simple connection between the CV errors and the residual sums of squares between the reconstructed and the given measurements. Second, on the basis of this finding, we use the replica method to analytically evaluate the CV errors when the design matrix is a simple random matrix in the large-size limit. Finally, these results are compared with those of numerical simulations on finite-size systems and are confirmed to be correct. We also apply the simple formulas for the first type of CV error to an actual supernova dataset.
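For orientation, the brute-force procedure that such approximate formulas are meant to avoid can be sketched as follows: refit the LASSO M times, each time leaving one observation out, and average the prediction errors on the held-out points. This is a minimal sketch, not the approximation from the abstract; the coordinate-descent solver, the function names, the 0.5·‖y − Ax‖² + λ‖x‖₁ scaling of the cost, and the mean-squared-error normalization are all assumptions made for the example.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator, the proximal map of t * |x|."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(A, y, lam, sweeps=100):
    """Minimize 0.5 * ||y - A x||^2 + lam * ||x||_1 by cyclic coordinate descent."""
    A = np.asarray(A, dtype=float)
    r = np.asarray(y, dtype=float).copy()   # residual y - A x, with x = 0 initially
    N = A.shape[1]
    x = np.zeros(N)
    col_sq = (A ** 2).sum(axis=0)
    for _ in range(sweeps):
        for j in range(N):
            if col_sq[j] == 0.0:
                continue
            r += A[:, j] * x[j]              # remove coordinate j from the fit
            x[j] = soft_threshold(A[:, j] @ r, lam) / col_sq[j]
            r -= A[:, j] * x[j]              # put the updated coordinate back
    return x

def loo_cv_error(A, y, lam):
    """Literal leave-one-out CV error: refit M times, average squared errors."""
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    M = A.shape[0]
    errors = []
    for mu in range(M):
        keep = np.arange(M) != mu
        x = lasso_cd(A[keep], y[keep], lam)
        errors.append((y[mu] - A[mu] @ x) ** 2)
    return float(np.mean(errors))
```

The cost of this loop is M full LASSO fits per value of λ, which is the expense that motivates approximate CV formulas in the first place.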
The phase diagram of the p-spin-interacting spin glass model in a transverse field is investigated in the limit p → ∞ in the presence of ferromagnetic bias. Using the replica method and the static approximation, we show that the phase diagram consists of four phases: quantum paramagnetic, classical paramagnetic, ferromagnetic, and spin-glass phases. We also show that the static approximation is valid in the ferromagnetic phase in the limit p → ∞ by using the large-p expansion. Since the same approximation is already known to be valid in other phases, we conclude that the obtained phase diagram is exact.
We discuss a strategy of sparse approximation that is based on the use of an overcomplete basis, and evaluate its performance when a random matrix is used as this basis. A small combination of basis vectors is chosen from a given overcomplete basis, according to a given compression rate, such that they compactly represent the target data with as small a distortion as possible. As selection methods, we study the ℓ₀- and ℓ₁-based methods, which employ exhaustive search and ℓ₁-norm regularization, respectively. The performance is assessed in terms of the tradeoff between the representation distortion and the compression rate. First, we use methods of statistical mechanics to evaluate the performance analytically in the case that the methods are carried out ideally. The analytical result is then confirmed by performing numerical experiments on finite-size systems and extrapolating the results to the infinite-size limit. Our result shows that the ℓ₀-based method greatly outperforms the ℓ₁-based one. An interesting outcome of our analysis is that any small value of distortion is achievable for any fixed compression rate r in the large-size limit of the overcomplete basis, for both the ℓ₀- and ℓ₁-based methods. The difference between the two methods lies in the size of the overcomplete basis that is required to achieve the desired distortion. As the desired distortion decreases, the required size grows polynomially for the ℓ₀-based method and exponentially for the ℓ₁-based method. Second, we examine the practical performance of two well-known algorithms, orthogonal matching pursuit and approximate message passing, when they are used to execute the ℓ₀- and ℓ₁-based methods, respectively. Our examination shows that orthogonal matching pursuit achieves a much better performance than the exact execution of the ℓ₁-based method, as well as approximate message passing. However, regarding the ℓ₀-based method, there is still room to design more effective greedy algorithms than orthogonal matching pursuit. Finally, we evaluate the performance of the algorithms when they are applied to image data compression.
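To make the greedy approach concrete, here is a minimal sketch of textbook orthogonal matching pursuit: at each step it picks the basis vector most correlated with the current residual and then refits all selected coefficients by least squares. The function name `omp`, the fixed number of selected vectors `k` as the stopping rule, and the absence of column normalization are assumptions made for the example; the abstract does not specify the implementation that was evaluated.

```python
import numpy as np

def omp(A, y, k):
    """Greedy sparse approximation of y using at most k columns of A."""
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    N = A.shape[1]
    chosen = np.zeros(N, dtype=bool)        # mask of selected columns
    coef = np.zeros(0)
    residual = y.copy()
    for _ in range(k):
        scores = np.abs(A.T @ residual)     # correlation with the residual
        scores[chosen] = -np.inf            # never pick a column twice
        chosen[int(np.argmax(scores))] = True
        # Refit all selected coefficients jointly by least squares.
        coef, *_ = np.linalg.lstsq(A[:, chosen], y, rcond=None)
        residual = y - A[:, chosen] @ coef
    x = np.zeros(N)
    x[chosen] = coef
    return x
```

The distortion of the resulting representation can then be measured as the squared norm of `y - A @ omp(A, y, k)`, with `k` playing the role of the allowed number of basis vectors.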
The weight space of the Ising perceptron in which a set of random patterns is stored is examined using the generating function of the partition function, φ(n) = (1/N) log[Z^n], as the dimension of the weight vector N tends to infinity, where Z is the partition function and [···] represents the configurational average. We utilize φ(n) for two purposes, depending on the value of the ratio α = M/N, where M is the number of random patterns. For α < α_s = 0.833..., we employ φ(n), in conjunction with Parisi's one-step replica symmetry breaking scheme in the limit n → 0, to evaluate the complexity that characterizes the number of disjoint clusters of weights that are compatible with a given set of random patterns. The result indicates that, in typical cases, the weight space is equally dominated by a single large cluster of exponentially many weights and exponentially many small clusters of a single weight. For α > α_s, on the other hand, φ(n) is used to assess the rate function of the small probability that a given set of random patterns is atypically separable by the Ising perceptron. We show that the analyticity of the rate function changes at α = α_GD = 1.245..., which implies that the dominant configuration of the atypically separable patterns exhibits a phase transition at this critical ratio. Extensive numerical experiments are conducted to support the theoretical predictions.
Weight space structure and analysis using a finite replica number in the Ising perceptron
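For very small systems, the set of weights compatible with a pattern set can be counted by exhaustive enumeration, which gives a concrete picture of the quantity Z. The sketch below is only an illustration under assumptions: the function name, the strict-inequality storage condition (each label times the weighted sum of its pattern must be positive), and the use of explicit ±1 labels are choices made for the example rather than details taken from the paper, and the loop over 2^N weights is feasible only for small N.

```python
import itertools
import numpy as np

def count_compatible_weights(patterns, labels):
    """Count binary weights w in {-1, +1}^N that classify every pattern correctly."""
    patterns = np.asarray(patterns, dtype=float)   # shape (M, N)
    labels = np.asarray(labels, dtype=float)       # shape (M,)
    N = patterns.shape[1]
    count = 0
    for w in itertools.product((-1.0, 1.0), repeat=N):
        # A weight is compatible if every pattern lands on the labelled side.
        if np.all(labels * (patterns @ np.array(w)) > 0):
            count += 1
    return count
```

Averaging the logarithm of this count over many random pattern sets gives a finite-size estimate of the typical entropy of compatible weights, which is the kind of quantity the replica analysis addresses in the large-N limit.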