
Last chance! 50% off unlimited learning
Sale ends in
Tps(x, Y, m = NULL, p = NULL, scale.type = "range", lon.lat = FALSE, miles = TRUE, ...)fastTps(x, Y, m = NULL, p = NULL, theta, lon.lat=FALSE, ...)
Krig
and mKrig
functions. See the help on each of these
for more information on the calling arguments and what is returned. A thin plate spline is result of minimizing the residual sum of squares subject to a constraint that the function have a certain level of smoothness (or roughness penalty). Roughness is quantified by the integral of squared m-th order derivatives. For one dimension and m=2 the roughness penalty is the integrated square of the second derivative of the function. For two dimensions the roughness penalty is the integral of
(Dxx(f))**22 + 2(Dxy(f))**2 + (Dyy(f))**22
(where Duv denotes the second partial derivative with respect to u and v.) Besides controlling the order of the derivatives, the value of m also determines the base polynomial that is fit to the data. The degree of this polynomial is (m-1).
The smoothing parameter controls the amount that the data is smoothed. In the usual form this is denoted by lambda, the Lagrange multiplier of the minimization problem. Although this is an awkward scale, lambda =0 corresponds to no smoothness constraints and the data is interpolated. lambda=infinity corresponds to just fitting the polynomial base model by ordinary least squares.
This estimator is implemented by passing the right generalized covariance function based on radial basis functions to the more general function Krig. One advantage of this implementation is that once a Tps/Krig object is created the estimator can be found rapidly for other data and smoothing parameters provided the locations remain unchanged. This makes simulation within R efficient (see example below). Tps does not currently support the knots argument where one can use a reduced set of basis functions. This is mainly to simplify and a good alternative using knots would be to use a valid covariance from the Matern family and a large range parameter.
See also the mKrig function for handling larger data sets and also for an example of combining Tps and mKrig for evaluation on a larger grid.
The function fastTps
is really a convenient wrapper function that
calls mKrig
with the Wendland covariance function. This is
experimental and some care needs to exercised in specifying the taper
range and power ( p
) which describes the polynomial behavior of
the Wendland at the origin. Note that unlike Tps the locations are not
scaled to unit range and this can cause havoc in smoothing problems with
variables in very different units. So rescaling the locations x<- scale(x)
is a good idea for putting the variables on a common scale for smoothing.
This function does have the potential to approximate estimates of Tps
for very large spatial data sets. See wendland.cov
and help on
the SPAM package for more background.
surface.Krig
,
sreg
#2-d example
fit<- Tps(ozone$x, ozone$y) # fits a surface to ozone measurements.
set.panel(2,2)
plot(fit) # four diagnostic plots of fit and residuals.
set.panel()
# summary of fit and estiamtes of lambda the smoothing parameter
summary(fit)
surface( fit) # Quick image/contour plot of GCV surface.
# NOTE: the predict function is quite flexible:
look<- predict( fit, lambda=2.0)
# evaluates the estimate at lambda =2.0 _not_ the GCV estimate
# it does so very efficiently from the Krig fit object.
look<- predict( fit, df=7.5)
# evaluates the estimate at the lambda values such that
# the effective degrees of freedom is 7.5
# compare this to fitting a thin plate spline with
# lambda chosen so that there are 7.5 effective
# degrees of freedom in estimate
# Note that the GCV function is still computed and minimized
# but the lambda values used correpsonds to 7.5 df.
fit1<- Tps(ozone$x, ozone$y,df=7.5)
set.panel(2,2)
plot(fit1) # four diagnostic plots of fit and residuals.
# GCV function (lower left) has vertical line at 7.5 df.
set.panel()
# The basic matrix decompositions are the same for
# both fit and fit1 objects.
# predict( fit1) is the same as predict( fit, df=7.5)
# predict( fit1, lambda= fit$lambda) is the same as predict(fit)
# predict onto a grid that matches the ranges of the data.
out.p<-predict.surface( fit)
image( out.p)
# the surface function (e.g. surface( fit)) essentially combines
# the two steps above
# predict at different effective
# number of parameters
out.p<-predict.surface( fit,df=10)
#A 1-d example with confidence intervals
out<-Tps( rat.diet$t, rat.diet$trt) # lambda found by GCV
out
plot( out$x, out$y)
xgrid<- seq( min( out$x), max( out$x),,100)
fhat<- predict( out,xgrid)
lines( xgrid, fhat,)
SE<- predict.se( out, xgrid)
lines( xgrid,fhat + 1.96* SE, col="red", lty=2)
lines(xgrid, fhat - 1.96*SE, col="red", lty=2)
#
# compare to the ( much faster) B spline algorithm
# sreg(rat.diet$t, rat.diet$trt)
# Here is a 1-d example with 95 percent CIs where sreg would not
# work:
# sreg would give the right estimate here but not the right CI's
x<- seq( 0,1,,8)
y<- sin(3*x)
out<-Tps( x, y) # lambda found by GCV
plot( out$x, out$y)
xgrid<- seq( min( out$x), max( out$x),,100)
fhat<- predict( out,xgrid)
lines( xgrid, fhat, lwd=2)
SE<- predict.se( out, xgrid)
lines( xgrid,fhat + 1.96* SE, col="red", lty=2)
lines(xgrid, fhat - 1.96*SE, col="red", lty=2)
# Adding a covariate to the fixed part of model
# Note: this is a fairly big problem numerically (850+ locations)
set.panel( 1,2)
# without elevation covariate
Tps( RMprecip$x,RMprecip$y)-> out0
surface( out0)
US( add=TRUE, col="grey")
# with elevation covariate
Tps( RMprecip$x,RMprecip$y, Z=RMprecip$elev)-> out
# NOTE: out$d[4] is the estimated elevation coeficient
out.p<-predict.surface( out, drop.Z=TRUE)
surface( out.p)
US( add=TRUE, col="grey")
# decomposing into the different pieces:
fit<- predict( out)
fit0<- predict( out, drop.Z=TRUE, just.fixed=TRUE)
fit.elev<- predict( out, just.fixed=TRUE) - fit0
fit1<- predict(out, drop.Z=TRUE) - fit0
set.panel( 2,2)
quilt.plot( out$x, fit0)
title("spatial drift")
quilt.plot( out$x, fit.elev)
title("elevation")
quilt.plot( out$x, fit1)
title("spatial nonparametric")
quilt.plot( out$x, fit)
US( add=TRUE)
title("full prediction")
set.panel()
###
### fast Tps
# m=2 p= 2m-d= 2
#
# Note: theta =3 degrees is a very generous taper range.
# Use some trial theta value with rdist.nearest to determine a
# a useful taper. Some empirical studies suggest that in the
# interpolation case in 2 d the taper should be large enough to
# about 20 non zero nearest neighbors for every location.
set.panel( 1,2)
fastTps( RMprecip$x,RMprecip$y,m=2,lambda= 1e-2, theta=3.0) -> out2
# note that fastTps produces an mKrig object so one can use all the
# the overloaded functions that are defined for the mKrig class.
# summary of what happened note estimate of effective degrees of
# freedom
print( out2)
surface( out2)
#
# now use great circle distance for this smooth
# note the different "theta" for the taper support ( there are
# about 70 miles in one degree of latitude).
#
fastTps( RMprecip$x,RMprecip$y,m=2,lambda= 1e-2,lon.lat=TRUE, theta=210) -> out3
print( out3) # note the effective degrees of freedom is different.
surface(out3)
set.panel()
#
# simulation reusing Tps/Krig object
#
fit<- Tps( rat.diet$t, rat.diet$trt)
true<- fit$fitted.values
N<- length( fit$y)
temp<- matrix( NA, ncol=50, nrow=N)
sigma<- fit$shat.GCV
for ( k in 1:50){
ysim<- true + sigma* rnorm(N)
temp[,k]<- predict(fit, y= ysim)
}
matplot( fit$x, temp, type="l")
#
#4-d example
fit<- Tps(BD[,1:4],BD$lnya,scale.type="range")
# plots fitted surface and contours
# default is to hold 3rd and 4th fixed at median values
surface(fit)
Run the code above in your browser using DataLab