A Fast Trust-Region Newton Method for Softmax Logistic Regression

Zaidi, Nayyar Abbas; Webb, Geoffrey I.

doi:10.1137/1.9781611974973.79

Cited by 4 publications

(4 citation statements)

References 9 publications


“…In our experiments, we applied Lib Linear [7], a widely used and efficient toolkit for logistic regression which uses truncated Newton optimization [9]. In a recent paper, Zaidi et al [27] note that among the many optimization methods that have been evaluated, the truncated Newton method has been shown to converge the fastest, which provides support that Lib Linear is a competitive, state-of-the-art method to apply in our evaluation, as a baseline point of comparison. As with MVP, the appended term on LR denotes the maximum polynomial degree of the regressors.…”

Section: Simulation Studies (mentioning)

confidence: 99%

Scoring Bayesian networks of mixed variables

Andrews

Ramsey

Cooper

2018

Int J Data Sci Anal


In this paper we outline two novel scoring methods for learning Bayesian networks in the presence of both continuous and discrete variables, that is, mixed variables. While much work has been done in the domain of automated Bayesian network learning, few studies have investigated this task in the presence of both continuous and discrete variables while focusing on scalability. Our goal is to provide two novel and scalable scoring functions capable of handling mixed variables. The first method, the Conditional Gaussian (CG) score, provides a highly efficient option. The second method, the Mixed Variable Polynomial (MVP) score, allows for a wider range of modeled relationships, including non-linearity, but it is slower than CG. Both methods calculate log likelihood and degrees of freedom terms, which are incorporated into a Bayesian Information Criterion (BIC) score. Additionally, we introduce a structure prior for efficient learning of large networks and a simplification in scoring the discrete case which performs well empirically. While the core of this work focuses on applications in the search and score paradigm, we also show how the introduced scoring functions may be readily adapted as conditional independence tests for constraint-based Bayesian network learning algorithms. Lastly, we describe ways to simulate networks of mixed variable types and evaluate our proposed methods on such simulations.
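As a rough illustration of how a log likelihood and a degrees-of-freedom term combine into a BIC score, the sketch below uses the standard Schwarz form, k ln(n) - 2 ln(L). The function name `bic`, the lower-is-better sign convention, and the example numbers are assumptions made here; the abstract does not state which sign convention or penalty weighting the CG and MVP scores use.

```python
import math


def bic(log_likelihood, degrees_of_freedom, n_samples):
    """Standard BIC (lower is better): k * ln(n) - 2 * ln(L)."""
    return degrees_of_freedom * math.log(n_samples) - 2.0 * log_likelihood


# Hypothetical numbers: a simple model versus a more flexible one that
# fits slightly better but spends more degrees of freedom.
simple = bic(log_likelihood=-520.0, degrees_of_freedom=3, n_samples=1000)
flexible = bic(log_likelihood=-515.0, degrees_of_freedom=12, n_samples=1000)

# With these numbers the complexity penalty outweighs the likelihood gain.
print(simple < flexible)
```

Under this convention a search-and-score procedure would prefer the candidate structure with the smaller value; a score defined with the opposite sign would be maximized instead.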


“…z_lc is the corresponding class label of the l-th subject for the c-th task, and […] is the data vector of the l-th subject for the c-th task. This objective is usually called the negative log-likelihood and is convex (Zaidi and Webb, 2017).…”

Section: Methods (mentioning)

confidence: 99%

“…According to Krishnapuram et al (2005) and Lee et al (2006), we first obtain the first-order derivative of the logistic objective with respect to each v_jc […] where […] is the class posterior probability (Zaidi and Webb, 2017). Here we use […] denote the l-th row and j-th column element of matrix […].…”

Section: Methods (mentioning)

confidence: 99%

Identifying diagnosis-specific genotype–phenotype associations via joint multitask sparse canonical correlation analysis and classification

Liu

et al. 2020

Bioinformatics


Motivation: Brain imaging genetics studies the complex associations between genotypic data such as single nucleotide polymorphisms (SNPs) and imaging quantitative traits (QTs). Neurodegenerative disorders usually exhibit diversity and heterogeneity, so different diagnostic groups might carry distinct imaging QTs, SNPs and interactions between them. Sparse canonical correlation analysis (SCCA) is widely used to identify bi-multivariate genotype–phenotype associations. However, most existing SCCA methods are unsupervised and therefore cannot identify diagnosis-specific genotype–phenotype associations. Results: In this article, we propose a new joint multitask learning method, named MT–SCCALR, which combines the merits of SCCA and logistic regression. MT–SCCALR learns genotype–phenotype associations of multiple tasks jointly, with each task focusing on identifying one diagnosis-specific genotype–phenotype pattern. Moreover, MT–SCCALR can not only select relevant SNPs and imaging QTs for each diagnostic group alone, but also select those shared by multiple diagnostic groups. We derive an efficient optimization algorithm whose convergence to a local optimum is guaranteed. Compared with two state-of-the-art methods, MT–SCCALR yields better or similar canonical correlation coefficients and classification performance. In addition, it produces much more discriminative canonical weight patterns than its competitors. This demonstrates the power and capability of MT–SCCALR in identifying diagnostically heterogeneous genotype–phenotype patterns, which would be helpful for understanding the pathophysiology of brain disorders. Availability and implementation: The software is publicly available at https://github.com/dulei323/MTSCCALR. Supplementary information: Supplementary data are available at Bioinformatics online.
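The two citation statements above refer to the softmax negative log-likelihood and to its first-order derivative expressed through the class posterior probability. The sketch below shows the textbook form of both for a plain multiclass model, assuming one weight vector per class and no regularization. The names `softmax` and `nll_and_grad` and the data layout are hypothetical, and the notation is not that of the citing paper, whose equations were lost in extraction.

```python
import math


def softmax(scores):
    """Class posteriors from raw scores, shifted by the max for stability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def nll_and_grad(W, X, y):
    """Negative log-likelihood and its gradient for softmax regression.

    W: list of C weight vectors, each of length d (assumed layout).
    X: list of n feature vectors of length d.
    y: list of n integer labels in 0..C-1.
    """
    n_classes, dim = len(W), len(W[0])
    loss = 0.0
    grad = [[0.0] * dim for _ in range(n_classes)]
    for x, label in zip(X, y):
        scores = [sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in W]
        p = softmax(scores)
        loss -= math.log(p[label])
        for c in range(n_classes):
            # Gradient term: (posterior - indicator of the true class) * x
            coeff = p[c] - (1.0 if c == label else 0.0)
            for j in range(dim):
                grad[c][j] += coeff * x[j]
    return loss, grad


# Tiny hypothetical dataset: 2 features, 3 classes, all-zero weights.
W = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
X = [[1.0, 2.0], [0.5, -1.0]]
y = [0, 2]
loss, grad = nll_and_grad(W, X, y)
print(loss, grad)
```

With all-zero weights every class posterior equals 1/C, so the loss reduces to n ln(C). A Newton-type method would additionally need Hessian information, which this sketch does not compute.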


“…Several variants that are popular in this category for optimizing LR related models is that of 'Truncated Newton method' [26] -also known as TRON, conjugate gradient, etc. [27].…”

Section: Second Order Methods (mentioning)

confidence: 99%

On the Effectiveness of Discretizing Quantitative Attributes in Linear Classifiers

Zaidi

Webb

2020

IEEE Access

Self Cite


Linear models in machine learning are extremely computationally efficient, but they have high representation bias due to the non-linear nature of many real-world datasets. In this paper, we show that this representation bias can be greatly reduced by discretization. Discretization is a common procedure in machine learning that is used to convert a quantitative attribute into a qualitative one. It is often motivated by the fact that some learners can handle only qualitative data. Since discretization loses information (as fewer distinctions among instances are possible using discretized data relative to undiscretized data), it might appear desirable to avoid it where it is not essential, and typically it is avoided. However, it has been shown in the past that discretization can lead to superior performance on generative linear models, e.g., naive Bayes. This motivates a systematic study of the effects of discretizing quantitative attributes for discriminative linear models as well. In this paper, we demonstrate that, contrary to prevalent belief, discretization of quantitative attributes is a beneficial pre-processing step for discriminative linear models, as it leads to far superior classification performance, especially on bigger datasets, and, surprisingly, to much better convergence, which leads to better training time. We substantiate our claims with an empirical study on 52 benchmark datasets, using three linear models optimizing different objective functions.
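To make the idea of converting a quantitative attribute into a qualitative one concrete, here is a minimal sketch of equal-frequency discretization followed by one-hot encoding, which is one common way to feed binned values to a linear model. The abstract does not say which discretization scheme the study uses, so the scheme, the function names and the sample values are all assumptions.

```python
from bisect import bisect_right


def equal_frequency_cut_points(values, n_bins):
    """Cut points that split the sorted values into roughly equal-sized bins."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]


def discretize(value, cut_points):
    """Index of the bin that `value` falls into (0 .. len(cut_points))."""
    return bisect_right(cut_points, value)


def one_hot(bin_index, n_bins):
    """Indicator vector for a bin, so a linear model gets one weight per bin."""
    return [1 if i == bin_index else 0 for i in range(n_bins)]


# Hypothetical quantitative attribute with eight observed values.
ages = [34, 21, 58, 45, 29, 62, 40, 51]
cuts = equal_frequency_cut_points(ages, n_bins=4)
encoded = [one_hot(discretize(a, cuts), 4) for a in ages]
print(cuts)
print(encoded[0])
```

Because each bin receives its own weight, a linear model over the encoded attribute can represent a piecewise-constant, non-linear response to the original value, which is the kind of reduction in representation bias the abstract describes.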
