In many applications, data appear with a huge number of instances as well as features. Linear Support Vector Machines (SVM) is one of the most popular tools to deal with such large-scale sparse data. This paper presents a novel dual coordinate descent method for linear SVM with L1-and L2-loss functions. The proposed method is simple and reaches an -accurate solution in O(log(1/ )) iterations. Experiments indicate that our method is much faster than state of the art solvers such as Pegasos, TRON, SVM perf , and a recent primal coordinate descent implementation.
The effects of the variation of chemical mechanical planarization (CMP) process parameters on slurry hydrodynamics and removal rate are studied using physically based models. The two models which are developed to describe and fundamentally understand the CMP process are (i) the lubrication model for slurry flow and (ii) the mass transport model for material removal. The mass transport model is developed for copper CMP. Conditions for stable operation and reduced wafer scratching are identified from the lubrication model. The mass transport model takes into account the chemical reaction at the wafer surface, the slurry flow hydrodynamics, and the presence of abrasive particles. The polish rates predicted by the model agree well with those measured experimentally.
Gaussian processes are powerful regression models specified by parameterized mean and covariance functions. Standard approaches to choose these parameters (known by the name hyperparameters) are maximum likelihood and maximum a posteriori. In this article, we propose and investigate predictive approaches based on Geisser's predictive sample reuse (PSR) methodology and the related Stone's cross-validation (CV) methodology. More specifically, we derive results for Geisser's surrogate predictive probability (GPP), Geisser's predictive mean square error (GPE), and the standard CV error and make a comparative study. Within an approximation we arrive at the generalized cross-validation (GCV) and establish its relationship with the GPP and GPE approaches. These approaches are tested on a number of problems. Experimental results show that these approaches are strongly competitive with the existing approaches.
Efficient training of direct multi-class formulations of linear Support Vector Machines is very useful in applications such as text classification with a huge number examples as well as features. This paper presents a fast dual method for this training. The main idea is to sequentially traverse through the training set and optimize the dual variables associated with one example at a time. The speed of training is enhanced further by shrinking and cooling heuristics. Experiments indicate that our method is much faster than state of the art solvers such as bundle, cutting plane and exponentiated gradient methods.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.