The main ideas behind the classic multivariate logistic regression model make sense when translated to the functional setting, where the explanatory variable X is a function and the response Y is binary. However, some important technical issues appear (or are aggravated with respect to those of the multivariate case) due to the functional nature of the explanatory variable. First, the mere definition of the model can be questioned: while most approaches proposed so far rely on the $L^2$-based model, we explore an alternative (in some sense, more general) approach, based on the theory of reproducing kernel Hilbert spaces (RKHS). The validity conditions of such an RKHS-based model, and their relation with the $L^2$-based one, are investigated and made explicit in two formal results. Some relevant particular cases are considered as well. Second, we show that, under very general conditions, the maximum likelihood estimator of the logistic model parameters fails to exist in the functional case, although some restricted versions can be considered. Third, we check (in the framework of binary classification) the practical performance of some RKHS-based procedures, well-suited to our model: they are compared to several competing methods via Monte Carlo experiments and the analysis of real data sets.
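As a rough illustration of the setting (not the authors' procedure), the sketch below fits a logistic model to a few point evaluations $X(t_1), \dots, X(t_p)$ of discretized curves. Several choices here are assumptions that do not come from the abstract: the function name `fit_point_logistic`, the use of point evaluations as regressors, and the small ridge penalty `ridge`. The penalty is only there to keep the Newton iterations well defined when the unpenalized maximum likelihood estimate may fail to exist.

```python
import numpy as np

def fit_point_logistic(curves, labels, idx, ridge=1e-3, n_iter=50):
    """Fit logit P(Y=1 | X) = b0 + sum_j b_j X(t_j) by penalized Newton steps.

    curves : (n, m) array, each row a curve observed on a common grid
    labels : (n,) array of 0/1 responses
    idx    : grid indices t_1, ..., t_p used as regressors (hypothetical choice)
    ridge  : small L2 penalty (an assumption, not part of the abstract)
    """
    curves = np.asarray(curves, dtype=float)
    labels = np.asarray(labels, dtype=float)
    # Design matrix: intercept column plus the selected point evaluations.
    Z = np.column_stack([np.ones(curves.shape[0]), curves[:, idx]])
    beta = np.zeros(Z.shape[1])
    penalty = ridge * np.eye(Z.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Z @ beta))                  # fitted probabilities
        grad = Z.T @ (labels - p) - penalty @ beta          # penalized score
        hess = Z.T @ (Z * (p * (1.0 - p))[:, None]) + penalty
        beta = beta + np.linalg.solve(hess, grad)           # Newton update
    return beta
```

The returned vector holds the intercept followed by one coefficient per selected grid point; how the points `idx` should be chosen is left open here.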
A model for the prediction of functional time series is introduced, where observations are assumed to be continuous random functions. We model the dependence of the data with a nonstandard autoregressive structure, motivated in terms of the Reproducing Kernel Hilbert Space (RKHS) generated by the auto-covariance function of the data. The new approach helps to find relevant points of the curves in terms of prediction accuracy. This dimension reduction technique is particularly useful for applications, since the results are usually directly interpretable in terms of the original curves. An empirical study involving real and simulated data is included, which yields competitive results. Supplementary material includes R code, tables and mathematical comments.
Mahalanobis distance is a classical tool in multivariate analysis. We suggest here an extension of this concept to the case of functional data. More precisely, the proposed definition concerns those statistical problems where the sample data are real functions defined on a compact interval of the real line. The obvious difficulty for such a functional extension is the non-invertibility of the covariance operator in infinite-dimensional cases. Unlike other recent proposals, our definition is suggested and motivated in terms of the Reproducing Kernel Hilbert Space (RKHS) associated with the stochastic process that generates the data. The proposed distance is a true metric; it depends on a single real smoothing parameter which is fully motivated in RKHS terms. Moreover, it shares some properties of its finite-dimensional counterpart: it is invariant under isometries, it can be consistently estimated from the data, and its sampling distribution is known under Gaussian models. An empirical study for two statistical applications, outlier detection and binary classification, is included. The obtained results are quite competitive when compared to other recent proposals in the literature.
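To make the role of the smoothing parameter concrete, here is a minimal sketch of one regularized Mahalanobis-type distance for curves observed on a common equispaced grid. The abstract does not state the definition, so the formula used, $d_\alpha(x, y)^2 = \sum_j \frac{\lambda_j}{(\lambda_j + \alpha)^2} \langle x - y, e_j \rangle^2$ with $(\lambda_j, e_j)$ the eigenpairs of the sample covariance operator, is an assumption, as are the function name `functional_mahalanobis` and the default `alpha`. For $\alpha > 0$ every weight is finite, which sidesteps the non-invertibility of the covariance operator.

```python
import numpy as np

def functional_mahalanobis(x, y, curves, grid, alpha=0.1):
    """Regularized Mahalanobis-type distance between two discretized curves.

    x, y   : (m,) arrays, the two curves to compare
    curves : (n, m) array, sample used to estimate the covariance operator
    grid   : (m,) equispaced evaluation points (equal spacing is assumed)
    alpha  : positive smoothing parameter (hypothetical default)
    """
    x, y, curves, grid = (np.asarray(a, dtype=float) for a in (x, y, curves, grid))
    h = grid[1] - grid[0]                        # quadrature weight of the grid
    centered = curves - curves.mean(axis=0)
    cov = centered.T @ centered / (curves.shape[0] - 1)
    # Eigenpairs of the discretized covariance operator f -> h * cov @ f.
    eigvals, eigvecs = np.linalg.eigh(cov * h)
    eigvals = np.clip(eigvals, 0.0, None)        # remove tiny negative round-off
    # L2 inner products <x - y, e_j>, with e_j = eigvecs[:, j] / sqrt(h).
    scores = np.sqrt(h) * ((x - y) @ eigvecs)
    weights = eigvals / (eigvals + alpha) ** 2
    return float(np.sqrt(np.sum(weights * scores ** 2)))
```

Because the value is a weighted Euclidean norm of the difference `x - y`, this discretized version is symmetric and satisfies the triangle inequality; directions with zero sample variance receive zero weight.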