
cgam (version 1.4)

wps: Warped-Plane Spline-Based Isotonic Regression

Description

The wps routine fits an isotonic regression surface in two dimensions using warped-plane splines, without an additivity assumption. Linear covariate effects can also be incorporated in the model. The surface and covariate effects are estimated simultaneously with a single cone projection. A penalized spline fit is also provided, which allows for more knots and greater flexibility.

Usage

wps(formula, family = gaussian(), data = NULL, weights = NULL, pnt = FALSE,
    pen = 0, cpar = 1.5)

Arguments

formula
A formula object which gives a symbolic description of the model to be fitted. It has the form "response ~ predictor". The response is a vector of length $n$. The user indicates the relationship between the mean value $E(y)$ and the two isotonically modelled predictors with a symbolic function, e.g. dd(x1, x2) for a surface that is decreasing in both predictors (see the Examples section).

Value

  • k1: Knots used for $x_1$.
  • k2: Knots used for $x_2$.
  • muhat: The estimated constrained mean value.
  • muhatu: The estimated unconstrained mean value.
  • SSE1: The sum of squared residuals for the full model.
  • SSE0: The sum of squared residuals for the linear part.
  • edf: The constrained effective degrees of freedom.
  • edfu: The unconstrained effective degrees of freedom.
  • delta: A matrix whose columns are the edges corresponding to the isotonically modelled predictors $x_1$ and $x_2$.
  • zmat: A matrix whose columns represent the parametrically modelled covariate. The user can choose whether to include a constant vector in it. It must have full column rank.
  • xmat: A matrix whose columns are $x_1$ and $x_2$.
  • coefs: The estimated constrained coefficients for the basis spanning the null space of the constraint set and the edges corresponding to the isotonically modelled predictors $x_1$ and $x_2$.
  • coefsu: The estimated unconstrained coefficients for the same basis and edges.
  • zcoefs: The estimated coefficients for the parametrically modelled covariate.
  • pvals.beta: The approximate p-values for the estimation of the vector $\beta$. A t-distribution is used as the approximate distribution.
  • se.beta: The standard errors for the estimation of the vector $\beta$.
  • gcv: The generalized cross-validation (GCV) value for the constrained fit.
  • gcvu: The generalized cross-validation (GCV) value for the unconstrained fit.
  • xnms: A vector storing the names of $x_1$ and $x_2$.
  • znms: A vector storing the names of the parametrically modelled covariate.
  • zid: A vector keeping track of the positions of the parametrically modelled covariate.
  • vals: A vector storing the levels of each variable used as a factor.
  • zid1: A vector keeping track of the beginning position of the levels of each variable used as a factor.
  • zid2: A vector keeping track of the end position of the levels of each variable used as a factor.
  • ynm: The name of the response variable.
  • decrs: A vector of two logical values indicating the monotonicity of the isotonically constrained surface with respect to $x_1$ and $x_2$.
  • tms: The terms object extracted by the generic function terms from a wps fit. See the official help page (http://stat.ethz.ch/R-manual/R-patched/library/stats/html/terms.html) of the terms function for more details.
  • is_param: A logical scalar showing whether a variable is a parametrically modelled covariate, which could be a linear term or a factor.
  • is_fac: A logical scalar showing whether a variable is a factor.
  • call: The matched call.

Arguments (continued)

family
A parameter indicating the error distribution and link function to be used in the model, as in the glm routine. The default is family = gaussian().

data
An optional data frame, list or environment containing the variables in the model. The default is data = NULL.

weights
An optional non-negative vector of "replicate weights" with the same length as the response vector. The default is weights = NULL.

pnt
Logical flag indicating whether a penalized fit is used. The default is pnt = FALSE.

pen
User-defined penalty parameter; it must be a non-negative real number. The default is pen = 0.

cpar
A multiplier used in the estimate of the model variance, which is defined as $\hat{\sigma}^2 = SSE / (n - cpar \cdot edf)$, where $SSE$ is the sum of squared residuals of the fit and $edf$ is the constrained effective degrees of freedom. The default is cpar = 1.5.
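As a sketch of how this variance estimate combines the returned components with cpar: the fields SSE1 and edf are the ones listed under Value, while the numeric inputs below are made-up illustrative values, not output from a real fit.

```r
# Variance estimate used by wps: sigma^2 = SSE / (n - cpar * edf).
# Illustrative sketch with made-up values; with a real fit `ans` you
# would use ans$SSE1 and ans$edf instead.
sse  <- 94.3   # sum of squared residuals (ans$SSE1)
edf  <- 7.2    # constrained effective degrees of freedom (ans$edf)
n    <- 30     # number of observations
cpar <- 1.5    # default multiplier in wps()

sigma2 <- sse / (n - cpar * edf)
round(sigma2, 4)   # 94.3 / 19.2
```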

Details

We consider the regression model $y_i = f(t_{1i}, t_{2i}) + z_i'\beta +\varepsilon_{i}, i = 1,\ldots,n$, where $\beta$ is a $p$ by $1$ parameter vector, and the $\varepsilon_i$'s are mean-zero random errors. We know a priori that $f$ is continuous and isotonic in both dimensions; that is, for fixed $t_{1}$ and $z$ values, if $t_{2a} \leq t_{2b}$, then $f(t_{1}, t_{2a}) \leq f(t_{1}, t_{2b})$, and similarly for fixed $t_{2}$ and $z$ values, $f$ is non-decreasing in $t_{1}$. For splines of degree two or higher, obtaining a finite set of linear inequality constraints that are necessary and sufficient for isotonicity in both dimensions does not seem to be feasible. However, if we use linear spline basis functions, then the necessary and sufficient constraints are straightforward and the fitted surface can be described as a continuous piecewise warped plane, called a "warped-plane spline" (WPS) surface.
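To see why linear spline basis functions make the constraints tractable, note that over one knot rectangle the fitted surface is bilinear, $b_0 + b_1 t_1 + b_2 t_2 + b_3 t_1 t_2$, so its slope in $t_1$, namely $b_1 + b_3 t_2$, is linear in $t_2$; requiring it to be nonnegative at the two $t_2$ edges of the rectangle is enough for monotonicity everywhere inside. The following base-R sketch (made-up coefficients, not cgam code) checks this numerically:

```r
# A single "warped plane" over the knot rectangle [0,1] x [0,1]:
# f(t1, t2) = b0 + b1*t1 + b2*t2 + b3*t1*t2 (bilinear, so the plane warps).
b0 <- 1; b1 <- 0.5; b2 <- 0.3; b3 <- -0.2   # made-up coefficients
f <- function(t1, t2) b0 + b1 * t1 + b2 * t2 + b3 * t1 * t2

# The slope in t1 is b1 + b3*t2, linear in t2, so nonnegativity at the
# two edges t2 = 0 and t2 = 1 implies nonnegativity in between:
edge_slopes_t1 <- c(b1 + b3 * 0, b1 + b3 * 1)   # 0.5 and 0.3
all(edge_slopes_t1 >= 0)                         # TRUE

# Numerical check: f is non-decreasing in t1 along every grid column
g <- seq(0, 1, by = 0.05)
surf <- outer(g, g, f)    # surf[i, j] = f(g[i], g[j]); rows index t1
all(diff(surf) >= 0)      # differences along t1 are all >= 0: TRUE
```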

The surface and covariate effects are estimated simultaneously with a single cone projection (no back-fitting). See the references cited in this section and the official manual (https://cran.r-project.org/package=coneproj) of the R package coneproj for more details.

A penalized spline version is also provided. Over each knot rectangle, the regression surface is a warped plane, and the slopes can change abruptly from one rectangle to the next. To obtain smoother fits, and to side-step the problem of knot choices, we can use a large number of knots for both predictors and penalize these changes in slopes. The size of the penalty parameter will control the effective degrees of freedom of the fit. In practice, the penalty term can be chosen through generalized cross-validation, similar to the method in Meyer (2012).
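In practice one can compare the gcv component returned by wps across a grid of penalty values and keep the fit with the smallest score. The helper below implements the standard GCV criterion $n \cdot SSE / (n - edf)^2$ on made-up (SSE, edf) pairs to show the selection logic; the exact formula used inside cgam may differ, and with real data you would read ans$gcv from each wps fit instead of computing it by hand:

```r
# Generic GCV criterion: n * SSE / (n - edf)^2. Larger penalties give
# smoother fits (smaller edf) but larger SSE; GCV trades these off.
gcv_score <- function(sse, edf, n) n * sse / (n - edf)^2

n    <- 30
pens <- c(0.1, 1, 10)          # candidate penalty values
# Made-up (SSE, edf) pairs, as if from three penalized wps fits:
sse  <- c(80.1, 85.4, 120.3)   # SSE grows as the penalty grows
edf  <- c(12.0, 7.2, 4.1)      # edf shrinks as the penalty grows

scores <- gcv_score(sse, edf, n)
best   <- pens[which.min(scores)]
best                            # penalty with the smallest GCV score: 1
```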

References

Meyer, M. C. and M. Woodroofe (2000) On the degrees of freedom in shape-restricted regression. Annals of Statistics 28, 1083--1104.

Meyer, M. C. (2012) Constrained penalized splines. Canadian Journal of Statistics 40(1), 190--206.

Meyer, M. C. (2016) Estimation and inference for isotonic regression in two dimensions, using warped-plane splines.

Examples

library(cgam)
library(MASS)
data(Rubber)

# regress loss on hard and tens under the shape restriction "doubly-decreasing",
# with a penalty term equal to 1 and 13 knots for each predictor
ans <- wps(loss ~ dd(hard, tens, numknots = c(13, 13)), data = Rubber, pen = 1)

# make a 3D plot of the constrained surface
plotpersp(ans, hard, tens, data = Rubber)
