2015
DOI: 10.1016/j.neunet.2015.03.005

A biological mechanism for Bayesian feature selection: Weight decay and raising the LASSO

Cited by 16 publications (12 citation statements)
References 36 publications
“…For the sake of simplicity, we used a raw pixel as the basic feature which could be selected as an informative subset of the input data space. It has been shown that specific forms of weight decay or regularization provide a mechanism for biologically plausible Bayesian feature selection [ 53 55 ]. In our ensemble system, selective projections from the input layer to the ensemble WTAs effectively implemented pixel/feature selection in this regard.…”
Section: Results (mentioning)
confidence: 99%
“…Compared to OLS, whose predicted coefficient is an unbiased estimator of [the true coefficient], both ridge regression and LASSO sacrifice a little bias in order to reduce the variance of the predicted values and improve the overall prediction accuracy. In this past decade, LASSO has been widely applied in many different ways and variants (Tibshirani et al 2005; Colombani et al 2013; Yamada et al 2014; Toiviainen et al 2014; Connor et al 2015).…”
Section: Methods (mentioning)
confidence: 99%
“…If a total number of filters are K then a number of biases are K because each filter has a single bias [1], [2], [4], [8], [10], [11], [16]. Additionally, a stride and a pad are given along with a learning rate and weight decay [11], [17], [18].…”
Section: An Open Problem Of Setting Hyperparameters (mentioning)
confidence: 99%
“…This leads to ultimately overlapping receptive fields between the depth columns, producing ultimately large output volumes. The higher stride makes the receptive fields overlap less and the resulting output volume will have smaller dimensions [10], [11], [15], [17]. Zero-padding is used to exactly preserve the spatial size of output volumes.…”
Section: An Open Problem Of Setting Hyperparameters (mentioning)
confidence: 99%