Ilija Ilievski scite author profile

Ilija Ilievski

5 Publications

87 Citation Statements Received

15 Citation Statements Given

How they've been cited

193

How they cite others

Affiliations

National University of Singapore, Saints Cyril and Methodius University of Skopje

Publications

SWIM: A Simple Word Interaction Model for Implicit Discourse Relation Recognition

Lei, Wang, Liu, et al. 2017

Capturing the semantic interaction of pairs of words across arguments and proper argument representation are both crucial issues in implicit discourse relation recognition. The current state-of-the-art represents arguments as distributional vectors that are computed via bi-directional Long Short-Term Memory networks (BiLSTMs), known to have significant model complexity. In contrast, we demonstrate that word-weighted averaging can encode argument representation which can be incorporated with word pair information efficiently. By saving an order of magnitude in parameters and eschewing the recurrent structure, our proposed model achieves equivalent performance, but trains seven times faster.
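As a rough illustration of what "word-weighted averaging" of an argument could look like, here is a minimal Python sketch. The abstract does not say how the per-word weights are obtained or how the result is combined with word-pair features, so the `embeddings` and `weights` dictionaries and the function name `weighted_average` are hypothetical placeholders, not the authors' model.

```python
from typing import Dict, List, Sequence


def weighted_average(
    tokens: Sequence[str],
    embeddings: Dict[str, List[float]],
    weights: Dict[str, float],
    dim: int,
) -> List[float]:
    """Encode an argument as the weight-normalised average of its word vectors.

    `embeddings` and `weights` are assumed inputs; words missing from
    `weights` default to a weight of 1.0.
    """
    total = [0.0] * dim
    weight_sum = 0.0
    for tok in tokens:
        if tok not in embeddings:
            continue  # skip out-of-vocabulary tokens
        w = weights.get(tok, 1.0)
        vec = embeddings[tok]
        for i in range(dim):
            total[i] += w * vec[i]
        weight_sum += w
    if weight_sum == 0.0:
        return total  # no known tokens: return the zero vector
    return [x / weight_sum for x in total]


# Toy usage with made-up two-dimensional vectors.
toy_embeddings = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
toy_weights = {"a": 3.0, "b": 1.0}
argument_vector = weighted_average(["a", "b"], toy_embeddings, toy_weights, dim=2)
```

Unlike a BiLSTM encoder, an average of this kind has no recurrent parameters, which is consistent with the parameter and training-time savings the abstract reports.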

Efficient Hyperparameter Optimization for Deep Learning Algorithms Using Deterministic RBF Surrogates

Ilievski, Akhtar, Feng, et al. 2017

AAAI

Automatically searching for optimal hyperparameter configurations is of crucial importance for applying deep learning algorithms in practice. Recently, Bayesian optimization has been proposed for optimizing hyperparameters of various machine learning algorithms. Those methods adopt probabilistic surrogate models like Gaussian processes to approximate and minimize the validation error function of hyperparameter values. However, probabilistic surrogates require accurate estimates of sufficient statistics (e.g., covariance) of the error distribution and thus need many function evaluations with a sizeable number of hyperparameters. This makes them inefficient for optimizing hyperparameters of deep learning algorithms, which are highly expensive to evaluate. In this work, we propose a new deterministic and efficient hyperparameter optimization method that employs radial basis functions as error surrogates. The proposed mixed integer algorithm, called HORD, searches the surrogate for the most promising hyperparameter values through dynamic coordinate search and requires many fewer function evaluations. HORD does well in low dimensions but it is exceptionally better in higher dimensions. Extensive evaluations on MNIST and CIFAR-10 for four deep neural networks demonstrate HORD significantly outperforms the well-established Bayesian optimization methods such as GP, SMAC, and TPE. For instance, on average, HORD is more than 6 times faster than GP-EI in obtaining the best configuration of 19 hyperparameters.
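The general idea of searching a deterministic RBF surrogate with coordinate perturbations can be sketched as follows. This is a simplified illustration under several assumptions, not the HORD algorithm itself: it handles only continuous variables scaled to the unit cube (no integer hyperparameters), uses a Gaussian RBF with a small ridge term, lets the perturbation probability decay linearly, and greedily picks the candidate with the lowest surrogate value. The names `fit_rbf` and `rbf_search` and all default parameter values are hypothetical.

```python
import numpy as np


def fit_rbf(X, y, eps=0.5, ridge=1e-8):
    """Fit a Gaussian RBF interpolant to (X, y) and return it as a callable."""
    def kernel(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-d2 / eps**2)

    # The small ridge keeps the linear system well-posed for near-duplicate points.
    coef = np.linalg.solve(kernel(X, X) + ridge * np.eye(len(X)), y)
    return lambda Z: kernel(Z, X) @ coef


def rbf_search(f, dim, budget=30, n_init=6, n_cand=200, sigma=0.2, seed=0):
    """Minimise f over [0, 1]^dim using `budget` evaluations of f."""
    rng = np.random.default_rng(seed)
    X = rng.random((n_init, dim))            # random initial design
    y = np.array([f(x) for x in X])
    while len(y) < budget:
        surrogate = fit_rbf(X, y)
        best = X[np.argmin(y)]
        # Probability of perturbing each coordinate shrinks as the budget is used.
        p = max(1.0 / dim, 1.0 - (len(y) - n_init) / (budget - n_init))
        mask = rng.random((n_cand, dim)) < p
        rows = np.flatnonzero(~mask.any(axis=1))
        mask[rows, rng.integers(dim, size=rows.size)] = True  # perturb at least one
        cand = best + mask * rng.normal(0.0, sigma, size=(n_cand, dim))
        cand = np.clip(cand, 0.0, 1.0)
        x_new = cand[np.argmin(surrogate(cand))]  # greedy choice on the surrogate
        X = np.vstack([X, x_new])
        y = np.append(y, f(x_new))               # one expensive evaluation per step
    i = int(np.argmin(y))
    return X[i], float(y[i]), y


# Toy usage: a cheap quadratic stands in for an expensive validation error.
def toy_error(x):
    return float(((x - 0.3) ** 2).sum())


x_best, y_best, history = rbf_search(toy_error, dim=2, budget=20)
```

In a real hyperparameter search, `toy_error` would be replaced by a function that trains a network and returns its validation error, which is why the number of calls to `f` is the budget being conserved.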

Generative Attention Model with Adversarial Self-learning for Visual Question Answering

Ilievski, Feng 2017

Personalized news recommendation based on implicit feedback

Ilievski, Roy 2013

Efficient Hyperparameter Optimization of Deep Learning Algorithms Using Deterministic RBF Surrogates

Ilievski, Akhtar, Feng, et al. 2016

Preprint

Automatically searching for optimal hyperparameter configurations is of crucial importance for applying deep learning algorithms in practice. Recently, Bayesian optimization has been proposed for optimizing hyperparameters of various machine learning algorithms. Those methods adopt probabilistic surrogate models like Gaussian processes to approximate and minimize the validation error function of hyperparameter values. However, probabilistic surrogates require accurate estimates of sufficient statistics (e.g., covariance) of the error distribution and thus need many function evaluations with a sizeable number of hyperparameters. This makes them inefficient for optimizing hyperparameters of deep learning algorithms, which are highly expensive to evaluate. In this work, we propose a new deterministic and efficient hyperparameter optimization method that employs radial basis functions as error surrogates. The proposed mixed integer algorithm, called HORD, searches the surrogate for the most promising hyperparameter values through dynamic coordinate search and requires many fewer function evaluations. HORD does well in low dimensions but it is exceptionally better in higher dimensions. Extensive evaluations on MNIST and CIFAR-10 for four deep neural networks demonstrate HORD significantly outperforms the well-established Bayesian optimization methods such as GP, SMAC and TPE. For instance, on average, HORD is more than 6 times faster than GP-EI in obtaining the best configuration of 19 hyperparameters.
