A Deterministic Analysis of an Online Convex Mixture of Experts Algorithm

Özkan, Hüseyin; Donmez, Mehmet A.; Tunç, Sait; Kozat, Süleyman S.

doi:10.1109/tnnls.2014.2346832

Cited by 21 publications

(17 citation statements)

References 22 publications


“…Over the past years, the global optimization problem has gathered significant attention with various algorithms being proposed in distinct fields of research. It has been studied especially in the fields of non-convex optimization [6]-[8], Bayesian optimization [9], convex optimization [10]-[12], bandit optimization [13], stochastic optimization [14], [15]; because of its practical applications in distribution estimation [16]-[19], multi-armed bandits [20]-[22], control theory [23], signal processing [24], game theory [25], prediction [26], [27], decision theory [28] and anomaly detection [29]-[31].…”

Section: A Motivation (mentioning)

confidence: 99%

Low Regret Binary Sampling Method for Efficient Global Optimization of Univariate Functions

Gokcesu, Gokcesu

2022

Preprint


In this work, we propose a computationally efficient algorithm for the problem of global optimization in univariate loss functions. For the performance evaluation, we study the cumulative regret of the algorithm instead of the simple regret between our best query and the optimal value of the objective function. Although our approach has regret results similar to those of traditional lower-bounding algorithms such as the Piyavskii-Shubert method for Lipschitz continuous or Lipschitz smooth functions, it has a major computational cost advantage. In the Piyavskii-Shubert method, for certain types of functions, the query points may be hard to determine (as they are solutions to additional optimization problems). However, this issue is circumvented in our binary sampling approach, where the sampling set is predetermined irrespective of the function characteristics. For a search space of [0, 1], our approach has at most L log(3T) and 2.25H regret for L-Lipschitz continuous and H-Lipschitz smooth functions, respectively. We also analytically extend our results to a broader class of functions that covers more complex regularity conditions.


“…In the problems of learning, recognition, estimation or prediction [1]-[3]; decisions are often produced to minimize certain loss functions using features of the observations, which are generally noisy, random or even missing. There are numerous applications in a number of varying fields such as decision theory [4], control theory [5], game theory [6], [7], optimization [8], [9], density estimation and anomaly detection [10]-[15], scheduling [16], signal processing [17], [18], forecasting [19], [20] and bandits [21]-[23]. These decisions are acquired from specific learning models, where the goal is to distinguish certain data patterns and provide accurate estimations for practical use.…”

Section: A Calibration (mentioning)

confidence: 99%

Efficient, Anytime Algorithms for Calibration with Isotonic Regression under Strictly Convex Losses

Gokcesu, Gokcesu

2021

Preprint


We investigate the calibration of estimations to increase performance with an optimal monotone transform on the estimator outputs. We start by studying the traditional square error setting with its weighted variant and show that the optimal monotone transform is in the form of a unique staircase function. We further show that this staircase behavior is preserved for general strictly convex loss functions. Their optimal monotone transforms are also unique, i.e., there exists a single staircase transform that achieves the minimum loss. We propose a linear time and space algorithm that can find such optimal transforms for specific loss settings. Our algorithm has an online implementation where the optimal transform for the samples observed so far is found in linear space and amortized time when the samples arrive in an ordered fashion. We also extend our results to cases where the functions are not trivial to individually optimize and propose an anytime algorithm, which has linear space and pseudo-linearithmic time complexity.
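The staircase shape mentioned in this abstract can be illustrated for the weighted square error case with the classical pool-adjacent-violators procedure. The following is a minimal sketch under that assumption, not the authors' algorithm; the function name `isotonic_fit` and its arguments are hypothetical, and the samples are assumed to arrive already ordered by estimator output.

```python
def isotonic_fit(y, w=None):
    """Weighted least-squares monotone (non-decreasing) fit.

    Illustrative sketch of the classical pool-adjacent-violators idea;
    the name and signature are hypothetical, not taken from the paper.
    y: targets, already ordered by the estimator output.
    w: optional positive weights (defaults to all ones).
    """
    if w is None:
        w = [1.0] * len(y)
    # Each block holds [weighted mean, total weight, sample count].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Merge the last two blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    # Expand blocks into one fitted value per sample: a staircase.
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit


if __name__ == "__main__":
    print(isotonic_fit([1, 3, 2, 4]))  # → [1.0, 2.5, 2.5, 4.0]
```

Each merged block becomes one flat step of the staircase, and every sample is pushed once and merged at most once, which is why this kind of procedure runs in linear time for ordered input.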


“…Then, the subtask models are learnt in parallel on the data from different working conditions by the same or diverse learning algorithms. The prediction of a query sample (only with input variables) by an OMM is the output from the subtask model that the query sample belongs to [29]-[31].…”

Section: The Online Mixture Model of Mach Number, A The Mixture Learning for the Imbalanced Data (mentioning)

confidence: 99%

The Regression Learning of the Imbalanced and Big Data by the Online Mixture Model for the Mach Number Forecasting

et al. 2019


Extracting valuable information from imbalanced and big data to enhance the performance of forecasting models requires the scalable implementation of advanced statistical learning methods. This paper proposes the online mixture model (OMM) and applies it to Mach number forecasting. Treating the forecasting of the key variable (e.g., Mach number) under all working conditions as an entire task, and viewing that of each individual working condition as a subtask, the OMM separates the dense samples from the sparse ones on the basis of subtasks. The subtask models are independently learnt on the samples with reduced volume, and updated for the new working conditions without retaining samples from the old working conditions. Moreover, the tree-structure ensemble (TSE)-feature subsets ensembles (FSEs) algorithm is presented to fit the nonlinear function of a subtask model, where the FSE local models with low-dimensional input features are established on the non-overlapping sample subsets constructed by the TSE method. The TSE-FSEs not only reduce the volume of data but also perform distributed computing with a parallel structure, and thus have an advantage in learning from big data. Experiments carried out on wind tunnel measurement data indicate that the OMM with the TSE-FSEs outperforms other learning algorithms for Mach number forecasting, and meets the precision and forecasting speed requirements in engineering.

Index Terms: Big data, imbalanced data, mixture model, ensemble, online, regression, wind tunnel.
