The least absolute shrinkage and selection operator (Lasso) has recently been adapted to network-structured datasets. In particular, the resulting network Lasso method allows graph signals to be learned from a small number of noisy signal samples by using the total variation of a graph signal for regularization. While efficient and scalable implementations of the network Lasso are available, little is known about the conditions on the underlying network structure that ensure the network Lasso is accurate. By leveraging concepts from compressed sensing, we address this gap and derive precise conditions on the underlying network topology and sampling set which guarantee that the network Lasso, for a particular loss function, delivers an accurate estimate of the entire underlying graph signal. We also quantify the error incurred by the network Lasso in terms of two constants which reflect the connectivity of the sampled nodes.
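The estimator described above combines an empirical loss on the sampled nodes with the graph total variation as a regularizer. A minimal sketch of that objective (the toy chain graph, the squared-error loss, and all names here are illustrative assumptions, not the paper's exact setup):

```python
import numpy as np

# Toy chain graph given as weighted edges (i, j, w_ij).
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 4, 1.0)]

def graph_tv(x, edges):
    """Total variation of graph signal x: sum of w_ij * |x_i - x_j| over edges."""
    return sum(w * abs(x[i] - x[j]) for i, j, w in edges)

def nlasso_objective(x, y, sampled, edges, lam):
    """Squared-error loss on the sampled nodes plus TV regularization."""
    loss = sum((x[i] - y[i]) ** 2 for i in sampled)
    return loss + lam * graph_tv(x, edges)

x_true = np.array([1.0, 1.0, 1.0, 0.0, 0.0])  # piecewise-constant graph signal
y = x_true.copy()                             # noise-free samples in this sketch
print(graph_tv(x_true, edges))                               # 1.0: one jump, across edge (2, 3)
print(nlasso_objective(x_true, y, [0, 4], edges, lam=0.5))   # 0.0 loss + 0.5 * 1.0 TV = 0.5
```

A signal that is nearly constant within well-connected clusters has small TV, which is why only a few sampled nodes per cluster can suffice for accurate recovery.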
We apply the network Lasso to solve binary classification and clustering problems on network-structured data. In particular, we generalize ordinary logistic regression to non-Euclidean data defined over a complex network structure. The resulting logistic network Lasso classifier amounts to solving a convex optimization problem. A scalable classification algorithm is obtained by applying the alternating direction method of multipliers (ADMM).
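The logistic network Lasso objective pairs the logistic loss on labeled nodes with a TV penalty on node-wise weight vectors. A small sketch of evaluating that objective (the random features, partial labels, and toy graph are illustrative assumptions):

```python
import numpy as np

# Hypothetical toy setup: each node i carries a feature vector and a local
# classifier weight vector W[i]; only some nodes are labeled.
rng = np.random.default_rng(0)
n_nodes, dim = 4, 3
features = rng.normal(size=(n_nodes, dim))
labels = {0: 1.0, 2: -1.0}                 # labels in {-1, +1}, partially observed
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]

def logistic_nlasso_objective(W, lam):
    """Logistic loss on labeled nodes plus TV of the node-wise weight vectors."""
    loss = sum(np.log1p(np.exp(-y * features[i] @ W[i]))
               for i, y in labels.items())
    tv = sum(w * np.linalg.norm(W[i] - W[j]) for i, j, w in edges)
    return loss + lam * tv

W0 = np.zeros((n_nodes, dim))
# At W = 0 each labeled node contributes log(2) and the TV term vanishes.
print(logistic_nlasso_objective(W0, lam=0.5))   # 2 * log(2) ~ 1.386
```

The unsquared Euclidean norm on the edge differences encourages neighboring nodes to share identical weight vectors, which is what makes the classifier piecewise constant over well-connected clusters.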
The network Lasso (nLasso) has been proposed recently as an efficient learning algorithm for massive networked datasets (big data over networks). It extends the well-known least absolute shrinkage and selection operator (Lasso) from learning sparse (generalized) linear models to network models. Efficient implementations of the nLasso have been obtained using convex optimization methods, yielding scalable message passing protocols. In this paper, we analyze the statistical properties of the nLasso when applied to localized linear regression problems involving networked data. Our main result is a sufficient condition on the network structure and available label information such that nLasso accurately learns a localized linear regression model from a few labeled data points. We also provide an implementation of nLasso for localized linear regression by specializing a primal-dual method for solving the convex (non-smooth) nLasso problem.
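A primal-dual method of the kind mentioned above alternates a dual step on the edge variables with a primal step on the node variables. The following is a minimal sketch of one such (Chambolle-Pock-type) iteration for a scalar TV-regularized least-squares problem on a chain graph; the graph, step sizes, and regularization strength are illustrative assumptions, not the paper's exact specialization:

```python
import numpy as np

n = 6
y = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])   # all nodes sampled in this sketch
lam, tau, sigma = 0.1, 0.4, 0.4                # step sizes with tau*sigma*||D||^2 < 1

# Incidence matrix D of the chain: (D @ x)[e] = x_i - x_j for edge e = (i, j).
D = np.zeros((n - 1, n))
for e in range(n - 1):
    D[e, e], D[e, e + 1] = 1.0, -1.0

x = np.zeros(n)        # primal variable: graph signal estimate
u = np.zeros(n - 1)    # dual variable: one entry per edge
x_bar = x.copy()
for _ in range(500):
    # Dual ascent step, then projection onto the ell-infinity ball of radius lam
    # (the proximal map of the conjugate of lam * ||.||_1).
    u = np.clip(u + sigma * (D @ x_bar), -lam, lam)
    # Primal descent step, then the proximal map of the squared-error loss.
    x_new = (x - tau * (D.T @ u) + 2 * tau * y) / (1 + 2 * tau)
    x_bar = 2 * x_new - x                      # over-relaxation (extrapolation)
    x = x_new

print(np.round(x, 2))  # close to y, with the jump slightly shrunk by the TV term
```

Each iteration costs one multiplication by D and one by its transpose, i.e. work proportional to the number of edges, which is what makes the method attractive for large sparse networks.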
GM2AP has a β-cup topology, with numerous X-ray structures showing multiple conformations for some of the surface loops, revealing conformational flexibility that may be related to function, where function is defined as either membrane binding associated with ligand binding and extraction or interaction with other proteins. Here, site-directed spin labeling (SDSL) electron paramagnetic resonance (EPR) spectroscopy and molecular dynamics (MD) simulations are used to characterize the mobility and conformational flexibility of various structural regions of GM2AP. A series of 10 single-cysteine amino acid substitutions was generated, and the constructs were chemically modified with the methanethiosulfonate spin label. Continuous-wave (CW) EPR line shapes were obtained and subsequently simulated using the microscopic order macroscopic disorder (MOMD) program. Line shapes for sites that have multiple conformations in the X-ray structures required two spectral components, whereas spectra of the remaining sites were adequately fit with single-component parameters. For spin-labeled sites L126C and I66C, spectra were acquired as a function of temperature, and simulations enabled determination of thermodynamic parameters associated with the conformational change. Binding of the GM2 ligand did not alter the conformational flexibility of the loops, as evaluated by EPR and NMR spectroscopies. These results confirm that the conformational flexibility observed in the surface loops of GM2AP crystals is present in solution and that the exchange is slow on the EPR time scale (>ns). Furthermore, MD simulation results are presented and agree well with the conformational heterogeneity revealed by SDSL.
We apply the network Lasso to classify partially labeled data points which are characterized by high-dimensional feature vectors. In order to learn an accurate classifier from limited amounts of labeled data, we borrow statistical strength, via an intrinsic network structure, across the dataset. The resulting logistic network Lasso amounts to a regularized empirical risk minimization problem that uses the total variation of a classifier as a regularizer. This minimization problem is a non-smooth convex optimization problem which we solve using a primal-dual splitting method. This method is appealing for big data applications as it can be implemented as a highly scalable message passing algorithm.
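The message-passing character of primal-dual splitting comes from the locality of its updates: each edge refreshes its dual variable from its two endpoint nodes only, and each node refreshes its primal variable from its incident edges only. A sketch of that structure for a scalar toy problem (the chain graph, labels, and step sizes are illustrative assumptions, not the paper's exact protocol):

```python
import numpy as np

edges = [(0, 1), (1, 2), (2, 3), (3, 4)]                 # chain graph
y = {0: 1.0, 1: 1.0, 2: 1.0, 3: 0.0, 4: 0.0}             # labels on sampled nodes
lam, tau, sigma, n = 0.1, 0.4, 0.4, 5

x = np.zeros(n)
u = np.zeros(len(edges))
x_bar = x.copy()
for _ in range(300):
    # Edge-local dual updates: each edge reads only its two endpoints.
    for e, (i, j) in enumerate(edges):
        u[e] = np.clip(u[e] + sigma * (x_bar[i] - x_bar[j]), -lam, lam)
    # Node-local primal updates: each node accumulates only incident-edge messages.
    grad = np.zeros(n)
    for e, (i, j) in enumerate(edges):
        grad[i] += u[e]
        grad[j] -= u[e]
    x_new = x - tau * grad
    for i, yi in y.items():                              # prox of the sampled loss
        x_new[i] = (x_new[i] + 2 * tau * yi) / (1 + 2 * tau)
    x_bar = 2 * x_new - x
    x = x_new

print(np.round(x, 2))   # close to the labels y, slightly smoothed by TV
```

Because every update touches only a node and its neighborhood, the iterations can be distributed across the network, which is the scalability property the abstract refers to.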