The mlt package implements maximum likelihood estimation in the class of conditional transformation models. Based on a suitable explicit parameterization of the unconditional or conditional transformation function using infrastructure from package basefun, we show how one can define, estimate, and compare a cascade of increasingly complex transformation models in the maximum likelihood framework. Models for the unconditional or conditional distribution function of any univariate response variable are set up and estimated in the same computational framework simply by choosing an appropriate transformation function and a parameterization thereof. Because it is computationally cheap to evaluate the distribution function, models can be estimated by maximization of the exact likelihood, especially in the presence of random censoring or truncation. The relatively dense high-level implementation in the R system for statistical computing allows generalization of many established implementations of linear transformation models, such as the Cox model or other parametric models for the analysis of survival or ordered categorical data, to the more complex situations illustrated in this paper.
Journal of Statistical Software

transformation model P(Y ≤ y) = Φ(h(y)) = Φ(a(y)^⊤ ϑ) by maximization of the exact likelihood as follows. After loading package mlt, we specify the duration variable we are interested in:

R> library("mlt")
R> var_d <- numeric_var("duration", support = c(1.0, 5.0),
+    add = c(-1, 1), bounds = c(0, Inf))

This abstract representation refers to a positive and conceptually continuous variable duration. We then set up a basis function a for this variable in the interval [1, 5] (which can be evaluated in the interval [0, 6] as defined by the add argument), in our case a monotone increasing Bernstein polynomial of order eight (details can be found in Section 2.1):

R> B_d <- Bernstein_basis(var = var_d, order = 8, ui = "increasing")

The (in our case unconditional) transformation model is now fully defined by the parameterization h(y) = a(y)^⊤ ϑ and F_Z = Φ, which is specified using the ctm() function as

R> ctm_d <- ctm(response = B_d, todistr = "Normal")

Because, in this simple case, the transformation function transforms Y ∼ F_Y to Z ∼ F_Z = Φ, the latter distribution is specified using the todistr argument. An equidistant grid of 200 duration times in the interval support + add = [0, 6] is generated by

R> str(nd_d <- mkgrid(ctm_d, 200))
List of 1
 $ duration: num [1:200] 0 0.0302 0.0603 0.0905 0.1206 ...
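
To make the parameterization h(y) = a(y)^⊤ ϑ more concrete, the following base R sketch evaluates a Bernstein basis of order eight on the support [1, 5] and plugs it into Φ. It does not use mlt or basefun and is not their implementation: the helper bernstein_design() and the coefficient vector theta are hypothetical and chosen only for illustration, with nondecreasing coefficients standing in for the monotonicity constraint requested via ui = "increasing". The sketch relies on the standard identity that Bernstein basis polynomials are rescaled beta densities.

```r
# Hypothetical helper (not part of mlt or basefun): design matrix of a
# Bernstein basis of order M on [lower, upper]. The k-th basis polynomial
# equals dbeta(t, k + 1, M - k + 1) / (M + 1) for t in [0, 1].
bernstein_design <- function(y, M, lower, upper) {
  t <- (y - lower) / (upper - lower)   # map [lower, upper] to [0, 1]
  sapply(0:M, function(k) dbeta(t, k + 1, M - k + 1) / (M + 1))
}

# Grid on the support only; this simple helper does not extrapolate
# beyond [lower, upper].
y <- seq(1, 5, length.out = 200)
A <- bernstein_design(y, M = 8, lower = 1, upper = 5)  # 200 x 9 matrix

# Illustrative (made-up) coefficients; nondecreasing values give a
# monotone nondecreasing transformation function.
theta <- c(-3, -2.5, -1, -0.5, 0, 0.2, 1, 2.5, 3)

h  <- drop(A %*% theta)   # transformation function h(y) = a(y)' theta
Fy <- pnorm(h)            # implied distribution function P(Y <= y)
```

Each row of A sums to one, and h takes the values theta[1] and theta[9] at the two ends of the support, which is why ordering the coefficients is enough to make h, and therefore Φ(h(y)), monotone. In mlt, the coefficients would instead be estimated by maximizing the likelihood.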