Learn R Programming

simest (version 0.4-1-1)

smooth.pen.reg: Penalized Smooth/Smoothing Spline Regression

Description

Estimate the non-parameteric regression function using smoothing splines.

Usage

smooth.pen.reg(x, y, lambda, w = NULL, agcv = FALSE, agcv.iter = 100, ...)

# S3 method for smooth.pen.reg plot(x, diagnostics = TRUE, ylab = quote(y ~ "and" ~ hat(y) ~ " values"), pch = "*", cex = 1, lwd = 2, col2 = "red", ablty = 4, ...) # S3 method for smooth.pen.reg print(x, digits = getOption("digits"), ...) # S3 method for smooth.pen.reg predict(object, newdata = NULL, deriv = 0, ...)

Value

An object of class smooth.pen.reg, basically a list including the elements

x.values

sorted x values provided as input.

y.values

corresponding y values in input.

fit.values

corresponding fit values of same length as that of x.values.

deriv

corresponding values of the derivative of same length as that of x.values.

iter

Always set to 1.

residuals

residuals obtained from the fit.

minvalue

minimum value of the objective function attained.

convergence

Always set to 0.

agcv.score

Asymptotic GCV approximation. Proposed in Silverman (1982) as a computationally fast approximation to GCV.

splinefun

An object of class smooth.spline needed for predict.

Arguments

x

a numeric vector giving the values of the predictor variable. For plot and print methods, x is an object of class smooth.pen.reg.

y

a numeric vector giving the values of the response variable.

lambda

a numeric value giving the penalty value.

w

an optional numeric vector of the same length as x; Defaults to all 1.

agcv

a logical denoting if an estimate of generalized cross-validation is needed.

agcv.iter

a numeric denoting the number of random vectors used to estimate the GCV. See ‘Details’.

...

additional arguments, passed further, e.g., to matplot or plot.default for the plot() method.

diagnostics

for the plot() method; if true, as by default, produce diagnostics, notably residual plots additionally.

ylab, pch, cex, lwd, col2, ablty

further optional argument to the plot() method; the last two for the color and line type of some plot components.

digits

the number of significant digits, for numbers in the print() method.

object

the result of smooth.pen.reg(), of class smooth.pen.reg.

newdata

a matrix of new data points for the predict method.

deriv

either 0 or 1, the order of the derivative to evaluate.

Author

Arun Kumar Kuchibhotla

Details

The function minimizes $$\sum_{i=1}^n w_i(y_i - f(x_i))^2 + \lambda\int\{f''(x)\}^2\; dx$$ without any constraint on \(f\).

This function implements in R the algorithm noted in Green and Silverman(1994). The function smooth.spline in R is not suitable for single index model estimation as it chooses \(\lambda\) using GCV by default.

plot function provides the scatterplot along with fitted curve; it also includes some diagnostic plots for residuals. Predict function now allows computation of the first derivative. Calculation of generalized cross-validation requires the computation of diagonal elements of the hat matrix involved which is cumbersone and is computationally expensive (and also is unstable).

smooth.Pspline() in the pspline package provides the GCV criterion value which matches the usual GCV when all the weights are equal to 1 but is not clear what it is for weights unequal. We use an estimate of GCV (formula of which is given in Green and Silverman (1994)) proposed by Girard which is very stable and computationally cheap. For more details about this randomized GCV, see Girard (1989).

References

Green, P. J. and Silverman, B. W. (1994) Non-parametric Regression and Generalized Linear Models: A Roughness Penalty Approach. Chapman and Hall.

Girard, D. A. (1989) A Fast Monte-Carlo Cross-Validation Procedure for Large Least Squares Problems with Noisy Data. Numerische Mathematik 56, 1--23.

Examples

Run this code
args(smooth.pen.reg)
x <- runif(50,-1,1)
y <- x^2 + 0.3 * rnorm(50)
smP <- smooth.pen.reg(x, y, lambda = 0.01, agcv = TRUE)
smP # -> print() method
plot(smP)
predict(smP, newdata = rnorm(10, 0,0.1))

Run the code above in your browser using DataLab