Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence 2018
DOI: 10.24963/ijcai.2018/369

Online Deep Learning: Learning Deep Neural Networks on the Fly

Abstract: Deep Neural Networks (DNNs) are typically trained by backpropagation in a batch learning setting, which requires the entire training data to be made available prior to the learning task. This is not scalable for many real-world scenarios where new data arrives sequentially in a stream form. We aim to address an open challenge of "Online Deep Learning" (ODL) for learning DNNs on the fly in an online setting. Unlike traditional online learning that often optimizes some convex objective function with respect to a…
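For readers less familiar with the setting, here is a minimal sketch of the online protocol the abstract contrasts with batch training: the model sees each example exactly once, predicts, and immediately takes one backpropagation step. It is written in PyTorch under invented names (SimpleNet, online_train) and illustrative hyperparameters, not the paper's setup:

```python
import torch
import torch.nn as nn

class SimpleNet(nn.Module):
    """Small MLP classifier; an illustrative stand-in, not the paper's model."""
    def __init__(self, in_dim, n_classes, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_classes))

    def forward(self, x):
        return self.net(x)

def online_train(model, data_stream, lr=0.01):
    """One-pass training: no stored dataset, one SGD step per arriving example."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for x, y in data_stream:            # (feature tensor, integer label)
        logits = model(x.unsqueeze(0))  # predict before updating
        loss = loss_fn(logits, y.unsqueeze(0))
        opt.zero_grad()
        loss.backward()                 # single backpropagation step
        opt.step()
```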

Cited by 221 publications (136 citation statements)
References 6 publications
“…The structural learning scenario is mainly driven by feature similarity and does not fully operate in the one-pass learning mode. [12] puts forward the hedge backpropagation method to answer the research question as to how and when a DNN structure should be adapted. This work, however, assumes that an initial structure of DNN exists and is built upon a fixed-capacity network.…”
Section: Related Work
confidence: 99%
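The excerpt hinges on what hedge backpropagation does, so a compact sketch may help: in the cited method, each hidden layer feeds its own output classifier, the final prediction is an alpha-weighted vote over those classifiers, and the alphas are updated multiplicatively with the Hedge rule (each depth's weight decays by beta raised to its current loss). The PyTorch sketch below is an assumption-laden illustration; the layer widths, beta value, and the class name HedgeNet are invented, not the paper's exact configuration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HedgeNet(nn.Module):
    def __init__(self, in_dim, n_classes, hidden=(64, 64, 64), beta=0.99):
        super().__init__()
        self.beta = beta
        dims = (in_dim,) + tuple(hidden)
        self.layers = nn.ModuleList(
            nn.Linear(dims[i], dims[i + 1]) for i in range(len(hidden)))
        self.heads = nn.ModuleList(nn.Linear(h, n_classes) for h in hidden)
        # One vote weight per depth, initialized uniformly.
        self.alpha = torch.full((len(hidden),), 1.0 / len(hidden))

    def forward(self, x):
        per_depth = []
        h = x
        for layer, head in zip(self.layers, self.heads):
            h = F.relu(layer(h))
            per_depth.append(head(h))      # prediction from this depth
        stacked = torch.stack(per_depth)   # (n_layers, batch, n_classes)
        mixed = (self.alpha.view(-1, 1, 1) * stacked).sum(dim=0)
        return mixed, stacked

    def hedge_update(self, stacked, y):
        # Hedge rule: decay each depth's weight by beta ** its current loss,
        # then renormalize so the alphas stay a probability distribution.
        with torch.no_grad():
            losses = torch.stack([F.cross_entropy(p, y) for p in stacked])
            self.alpha = self.alpha * self.beta ** losses
            self.alpha = self.alpha / self.alpha.sum()
```

A full training step would additionally backpropagate the per-head losses to update the layer weights; the Hedge update alone is what lets the effective depth adapt over the stream, which is the "how and when" question the excerpt refers to.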
“…The key difference lies in the calculation of the mean and variance, which are obtained directly from the bias itself rather than from the binomial distribution, because the hidden-unit growing strategy analyzes a real variable (the bias) instead of the accuracy score. The high-bias condition triggering the introduction of a new hidden unit is formulated as follows:

$$\mu^{t}_{bias} + \sigma^{t}_{bias} \geq \mu^{min}_{bias} + \pi\sigma^{min}_{bias} \tag{4}$$

where $\pi = 1.25\exp(-Bias^2) + 0.75$ controls the confidence degree of the sigma rule. It is observed that $\pi$ is set adaptively as a function of the bias and revolves around $[1, 2]$.…”
Section: Adaptive Learning Strategy of Network Width
confidence: 99%
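Read as a rule, criterion (4) is a simple threshold test on running statistics of the bias. A minimal Python sketch, assuming those statistics are tracked elsewhere; the function and argument names (should_grow, mu_bias_t, and so on) are hypothetical:

```python
import math

def should_grow(mu_bias_t, sigma_bias_t, mu_bias_min, sigma_bias_min, bias):
    # pi from Eq. (4): confidence factor of the sigma rule, adapted to the bias.
    pi = 1.25 * math.exp(-bias ** 2) + 0.75
    # Grow a hidden unit when the current bias statistic drifts above its
    # recorded minimum by a pi-sigma margin.
    return mu_bias_t + sigma_bias_t >= mu_bias_min + pi * sigma_bias_min
```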
“…$$\mu^{t}_{var} + \sigma^{t}_{var} \geq \mu^{min}_{var} + 2\chi\sigma^{min}_{var} \tag{5}$$

Compared to (4), the factor 2 is introduced to avoid pruning a unit directly after one has been added, since the addition of a new hidden unit temporarily increases the network variance, which then gradually decreases as subsequent observations arrive. $\chi$ is set analogously to $\pi$ in (4), as $\chi = 1.25\exp(-Variance^2) + 0.75$, which yields a $k$-sigma rule in the range $[1, 4]$. This strategy spans confidence levels between 68.2% and 99.9%.…”
Section: Adaptive Learning Strategy of Network Width
confidence: 99%
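The pruning test (5) mirrors the growing test, with the extra factor 2 damping pruning right after a unit has been grown. A matching sketch under the same assumptions, with hypothetical names (should_prune and its arguments):

```python
import math

def should_prune(mu_var_t, sigma_var_t, mu_var_min, sigma_var_min, variance):
    # chi mirrors pi in Eq. (4), computed from the variance instead of the bias.
    chi = 1.25 * math.exp(-variance ** 2) + 0.75
    # The factor 2 widens the margin so a freshly grown unit, which temporarily
    # inflates the network variance, is not pruned immediately.
    return mu_var_t + sigma_var_t >= mu_var_min + 2.0 * chi * sigma_var_min
```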