2021
DOI: 10.1002/cjs.11667

Multivariate online regression analysis with heterogeneous streaming data

Abstract: New data collection and storage technologies have given rise to a new field of streaming data analytics, called real-time statistical methodology for online data analyses. Most existing online learning methods are based on homogeneity assumptions, which require the samples in a sequence to be independent and identically distributed. However, inter-data batch correlation and dynamically evolving batch-specific effects are among the key defining features of real-world streaming data such as electronic health rec…

Cited by 13 publications (8 citation statements)
References 34 publications
“…However, the underlying model structures for different data batches are the same, that is, a common generalized linear model. A similar strategy has been used in Luo and Song [18], although they only focus on linear models and impose stricter restrictions on the patterns of how coefficients change.…”
Section: The Model Setup; citation type: mentioning
confidence: 99%
“…The first one is that loading the whole dataset $O_b^*$ is infeasible since the previous data $O_{b-1}^*$ is not available any more in an online setting; that is, we only have access to the current data batch $O_b$ and a set of historical summary statistics, denoted by $H_{b-1}$. Another one is that at each accumulation point $b$, it is possible that we have $\theta_b \neq \theta_{b-1}$, meaning that $\theta_b - \theta_{b-1}$ has at least one nonzero component [18]. Due to the potential dynamic coefficients of the current data batch $O_b$, using the undesirable historical information $H_{b-1}$ directly might lead to incorrectly estimated regression coefficients for those we are interested in at the latest moment.…”
Section: The Model Setup; citation type: mentioning
confidence: 99%
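
The online constraint described in the statement above (only the current batch and a set of historical summary statistics are available) can be illustrated with a minimal sketch. It assumes a plain linear model whose coefficients are the same in every batch, which is the simplest homogeneous case and not the heterogeneous method of Luo and Song. The class name `OnlineLeastSquares` and the choice of `X.T @ X` and `X.T @ y` as the summary statistics are illustrative assumptions, not taken from any of the cited papers.

```python
import numpy as np


class OnlineLeastSquares:
    """Hypothetical helper: least squares from running summary statistics.

    Assumes the same coefficient vector for every batch (homogeneous case).
    """

    def __init__(self, n_features: int) -> None:
        # Summary statistics kept between batches; raw data is discarded.
        self.xtx = np.zeros((n_features, n_features))
        self.xty = np.zeros(n_features)

    def update(self, X: np.ndarray, y: np.ndarray) -> np.ndarray:
        """Fold the current batch into the summaries and return the estimate."""
        self.xtx += X.T @ X
        self.xty += X.T @ y
        # lstsq tolerates a rank-deficient X'X in early batches.
        return np.linalg.lstsq(self.xtx, self.xty, rcond=None)[0]


# Simulated stream: five batches drawn from one fixed coefficient vector.
rng = np.random.default_rng(0)
theta_true = np.array([1.0, -2.0, 0.5])
model = OnlineLeastSquares(n_features=3)
batches = []
for _ in range(5):
    X = rng.normal(size=(50, 3))
    y = X @ theta_true + 0.1 * rng.normal(size=50)
    batches.append((X, y))  # stored here only to build the reference fit
    theta_online = model.update(X, y)

# Reference fit on the full data, which a true online setting could not store.
X_all = np.vstack([X for X, _ in batches])
y_all = np.concatenate([y for _, y in batches])
theta_full = np.linalg.lstsq(X_all, y_all, rcond=None)[0]
```

Under the homogeneity assumption, the estimate from the accumulated summaries agrees with the full-data fit up to numerical precision. When coefficients drift between batches, as the quoted passage warns, the pooled summaries would mix incompatible information, which is the problem the heterogeneous approaches are meant to address.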
“…Wu et al (2021) proposed an online updating method of survival analysis under the Cox proportional hazards model. Luo and Song (2021) studied a multivariate online regression analysis with heterogeneous streaming data. Lin et al (2021) studied a homogenization strategy for heterogeneous streaming data.…”
Section: Introduction; citation type: mentioning
confidence: 99%
“…Similarly, transfer learning for high-dimensional regression in generalized linear models aims to improve predictions and generally does not provide valid inference (Li et al., 2020, 2022; Tian and Feng, 2022). Another viewpoint casts this problem in an online inference framework (Schifano et al., 2016; Toulis and Airoldi, 2017; Luo and Song, 2021) that assumes the true value of the parameter of interest in two sequentially considered datasets is the same.…”
Section: Introduction; citation type: mentioning
confidence: 99%