aws: Local polynomial Adaptive Weights Smoothing for regression with additive errors

Description

This function implements a local polynomial Adaptive Weights Smoothing procedure for regression problems with additive errors as described in Polzehl & Spokoiny (2003). This function superseedes functions awsuni, awsbi and awstri.

Usage

aws(y, x = NULL, p = 0, sigma2 = NULL, qlambda = NULL, eta = 0.5, 
    tau = NULL, lkern = "Triangle", hinit = NULL, hincr = NULL, 
    hmax = 10, NN = FALSE, u = NULL, graph = FALSE, demo = FALSE, 
    symmetric = NULL, wghts= NULL)

Arguments

y contains the observed values (regression function plus errors). In case of x=NULL (second parameter) y is assumed to be observed on a one, two or three-dimensional grid. The dimension of

x is either NULL, in this case y is assumed to be observed on a grid, or is a matrix, with rows corresponding to variables, containing the design points where y is observed.

p is the degree of the polynomial model to use. For univariate regression p can be an nonnegative integer less or equal than 5, for bivariate regression on a grid p can be either 0,

sigma2

sigma2 can be used to provide an estimate for the error variance. If is.null(sigma2) a variance estimate is generated from the data.

qlambda

qlambda determines the scale parameter qlambda for the stochastic penalty. The scaling parameter in the stochastic penalty lambda is choosen as the qlambda-quantile of a

eta

eta is a memory parameter used to stabilize the procedure. eta has to be between 0 and 1, with eta=.5 being the default.

tau

tau is used in case of a the scale parameter polynomial degree p!=0 only. It is the scale parameter in the extention penalty used to prevent from leverage problems. The default value for tau

lkern

lkern determines the location kernel to be used. Options are "Uniform", "Triangle", "Quadratic", "Cubic" and "Exponential". Default is "Triangle"

hinit

hinit Initial bandwidth for the location penalty. Appropriate value is choosen in case of hinit=NULL

hincr

hincr hincr^(1/d), with d the dimensionality of the design, is used as a factor to increase the bandwidth between iterations. Defauts to hincr=1.25

hmax

hmax Maximal bandwidth to be used. Determines the number of iterations and is used as the stopping rule.

If NN=TRUE use nearest neighbor-rules instead of distances in the location term.

u used to supply values of the true regression function for test purposes to calculate Mean Squared Error (MSE) and Mean Absolute Error (MAE)

graph

graph if TRUE results are displayed after each iteration step.

demo

demo if TRUE after each iteration step results are displayed and the process waits for user interaction.

symmetric

If symmetric==TRUE the stochastic penalty is symmetrized, i.e. (sij + sji)/lambda is used instead of sij/lambda. See references for details. symmetric==FALSE is forced if

wghts

Specifies wghts for distance evaluation on a bi- or trivariate grid. Allows for anisotropic local neighborhoods. If wghts=NULL isotropic neighborhoods are used.

Value

thetaParameter estimates, first dimension corresponds to parameter components
yvalues provided in y
xvalues provided in x
callactual function call

Details

This function implements an adaptive weights smoothing (AWS) procedure for local polynomial models with additive errors. The approach generalizes the original AWS procedure from Polzehl and Spokoiny (2000).

Adaptive weights smoothing is an iterative data adaptive smoothing technique that is designed for smoothing in regression problems with discontinuous regression function. The basic assumption is that the regression function can be approximated by a simple local, e.g. local constant or local polynomial, model. The estimate of the regression function, i.e. the conditional expectation of y given x is computed as a weighted maximum likelihood estimate, with weights choosen in a completely data adaptive way. The procedure is edge preserving. If the assumed local model is globally valid, almost all weights used will be 1, i.e. the resulting estimate almost is the global estimate.

Currently implemented are the following models (specified by parameter p and attributes of x and y) are implemented:

[object Object],[object Object],[object Object],[object Object],[object Object]

The essential parameter in the procedure is qlambda. This parameter has an interpretation as a significance level of a test for equivalence of two local parameter estimates. Optimal values mainly depend on the choosen p and the value of symmetric (determines the use of an asymmetric or a symmetrized test). The optimal values only slightly depend on the model parameters, i.e. the default parameters should work in most situations. Larger values of qlambda may lead to oversmoothing, small values of qlambda lead to a random segmentation of homogeneous regions. A good value of qlambda can be obtained by the propagation condition, requiring that in case of global validity of the local model the estimate for large hmax should be equal to the global estimate.

The numerical complexity of the procedure is mainly determined by hmax. The number of iterations is d*log(hmax/hinit)/log(hincr) with d being the dimension of y. Comlexity in each iteration step is Const*hakt*n with hakt being the actual bandwith in the iteration step and n the number of design points. hmax determines the maximal possible variance reduction.

References

{ Polzehl, J. and Spokoiny, V. (2003). Varying coefficient regression modeling by adaptive weights smoothing, WIAS-Preprint 818. } { Polzehl, J. and Spokoiny, V. (2000). Adaptive Weights Smoothing with applications to image restoration, J.R.Statist.Soc. B, 62, Part 2, pp.335-354 }

Examples

Run this code

#######################################################
#                                                     
#                  univariate examples                
#                                                     
####################################################### 
#  
#    Blocks data (Example 6 from Fan & Gijbels (1996)
#
     mofx6 <- function(x){
     xj <- c(10,13,15,23,25,40,44,65,76,78,81)/100
     hj <- c(40,-50,30,-40,50,-42,21,43,-31,21,-42)*.37
     Kern <- function(x) (1-sign(x))/2
     apply(Kern(outer(xj,x,"-"))*hj,2,sum)
     }
#
#    sigma==1   step by step with graphics
#
     fx6 <- mofx6(seq(0,1,1/2047))
     y <- rnorm(fx6,fx6,1)
     tmp <- aws(y,p=0,hmax=100,graph=TRUE)
     par(mfrow=c(1,1),mar=c(3,3,2.5,.5),mgp=c(2,1,0))
     plot(seq(0,1,1/2047),y)
     lines(seq(0,1,1/2047),tmp$theta,col=3,lwd=2)
     lines(seq(0,1,1/2047),fx6,col=2)
#
#    sigma==3   without graphics (much faster)
#
     y <- rnorm(fx6,fx6,3)
     tmp <- aws(y,hmax=500)
     par(mfrow=c(1,1),mar=c(3,3,2.5,.5),mgp=c(2,1,0))
     plot(seq(0,1,1/2047),y)
     lines(seq(0,1,1/2047),tmp$theta,col=3,lwd=2)
     lines(seq(0,1,1/2047),fx6,col=2)
     rm(mofx6,fx6,y,tmp)
#  
#    second example from Polzehl and Spokoiny (2002)
#
     f2 <- function(x) sin(2*pi*1.2/(x+.2))
     n <- 1000
     x <- seq(0,1,length=n)
     fx2 <- f2(x)
     set.seed(1)
     sigma <- .25
     y <- rnorm(x,fx2,sigma)
#   increase hmax to 2 for good results
     ex1p0 <- aws(y,x,p=0,hmax=.1)$theta[1,]
     ex1p1 <- aws(y,x,p=1,hmax=.1)$theta[1,]
     ex1p2 <- aws(y,x,p=2,hmax=.1)$theta[1,]
     ex1p3 <- aws(y,x,p=3,hmax=.1)$theta[1,]
     par(mfrow=c(2,2),mar=c(2.5,2.5,2.5,.5),mgp=c(1.5,.5,0))
     plot(x,y)
     lines(x,fx2,col=2)
     lines(x,ex1p0,col=3,lwd=2)
     title("local constant AWS")
     plot(x,y)
     lines(x,fx2,col=2)
     lines(x,ex1p1,col=3,lwd=2)
     title("local linear AWS")
     plot(x,y)
     lines(x,fx2,col=2)
     lines(x,ex1p2,col=3,lwd=2)
     title("local quadratic AWS")
     plot(x,y)
     lines(x,fx2,col=2)
     title("local cubic AWS")
     lines(x,ex1p3,col=3,lwd=2)
     rm(f2,fx2,x,sigma,y,ex1p0,ex1p1,ex1p2,ex1p3)
#######################################################
#                                                     
#                  bivariate examples                 
#                                                     
####################################################### 
#  
#  Local constant image from Polzehl and Spokoiny (2000)
#
     xy <- rbind(rep(0:255,256),rep(0:255,rep(256,256)))
     indw <- c(1:12,29:48,73:100,133:168,209:256)
     w0 <- matrix(rep(FALSE,256*256),ncol=256)
     w0[indw,] <- TRUE
     w0[,indw] <- !w0[,indw]
     w0 <- w0-.5
     
     w0[((xy[1,]-129)^2+(xy[2,]-129)^2)<=10000&((xy[1,]-129)^2+(xy[2,]-129)^2)>=4900] <- 0
     w0[abs(xy[1,]-xy[2,])<=20&((xy[1,]-129)^2+(xy[2,]-129)^2)<4900] <- 0
     w0[((xy[1,]-225)^2+2*(xy[2,]-30)^2)-(xy[1,]-225)*(xy[2,]-30)<=625] <- 0
     
     w0[((xy[1,]-225)^2+2*(xy[2,]-30)^2)-(xy[1,]-225)*(xy[2,]-30)<=625&xy[2,]>27&xy[2,]<31] <- -.5
     
     w0[((xy[1,]-225)^2+2*(xy[2,]-30)^2)-(xy[1,]-225)*(xy[2,]-30)<=625&xy[1,]>223&xy[1,]<227] <- .5
     w0[((xy[2,]-225)^2+2*(xy[1,]-30)^2)+(xy[2,]-225)*(xy[1,]-30)<=625] <- 0
     
     w0[((xy[2,]-225)^2+2*(xy[1,]-30)^2)+(xy[2,]-225)*(xy[1,]-30)<=625&xy[1,]>27&xy[1,]<31] <- -.5
     
     w0[((xy[2,]-225)^2+2*(xy[1,]-30)^2)+(xy[2,]-225)*(xy[1,]-30)<=625&xy[2,]>223&xy[2,]<227] <- .5
     w0[((xy[2,]-225)^2+(xy[1,]-225)^2)+1*(xy[2,]-225)*(xy[1,]-225)<=400] <- 0
     w0[((xy[2,]-30)^2+(xy[1,]-30)^2)<=256] <- 0
     rm(xy,indw)
     set.seed(1)
     sigma <- .5
     y <- w0+rnorm(w0,0,sigma)
#    run some steps interactively  (increase hmax)
     tmp <- aws(y,graph=TRUE,hmax=2,demo=TRUE)
#    now without graphics and larger hmax
#   increase hmax for better results
     tmp <- aws(y,hmax=5)
     par(mfrow=c(1,3))
     image(y,col=gray((0:255)/255))
     image(tmp$theta,zlim=range(y),col=gray((0:255)/255))
     image(w0,zlim=range(y),col=gray((0:255)/255))
     rm(y,w0,tmp,sigma)
#
#   piecewise smooth image from Polzehl and Spokoiny (2003)
#
     x <- (0:99)/99
     fbi <- function(x,y) (x^2+y^3)*sign(x^2-1*x*y-.75*y^3)
     z0 <- outer(2*(x-.5),2*(x-.25),FUN=fbi)
     z <- z0+rnorm(z0,0,.5)
     par(mfrow=c(1,3),mar=c(3,3,3,.5),mgp=c(2,1,0))
     set.seed(1)
     persp(x,x,z0,phi=15,theta=150,d=5,r=2,expand=1.5,col="white")
     title("True function")
     persp(x,x,z,phi=15,theta=150,d=5,r=2,expand=1.5,col="white")
     title("Observed values")
     image(x,x,z,col=gray((0:255)/255))
     title("Observed values")
#
#   local constant
#
     ex1p0 <- aws(z,hmax=3,symmetric=FALSE)$theta # use hmax=5 or larger
#
#   local linear
#
     ex1p1 <- aws(z,p=1,hmax=3)$theta[1,,]  # use hmax=12 or larger
#
#   local quadratic
#
     ex1p2 <- aws(z,p=2,hmax=3)$theta[1,,]  # use hmax=20 or larger
#
#   display results
#
     par(mfrow=c(2,2),mar=c(0.25,.5,.5,.5),mgp=c(.5,.5,0))
     persp(x,x,z,phi=15,theta=150,d=5,r=2,expand=1,xlab="",ylab="",zlab="",box=FALSE)
     title("original data",line=-3)
     persp(x,x,ex1p0,phi=15,theta=150,d=5,r=2,expand=1,xlab="",ylab="",zlab="",box=FALSE)
     title("local constant AWS",line=-3)
     persp(x,x,ex1p1,phi=15,theta=150,d=5,r=2,expand=1,xlab="",ylab="",zlab="",box=FALSE)
     title("local linear AWS (small hmax)",line=-3)
     persp(x,x,ex1p2,phi=15,theta=150,d=5,r=2,expand=1,xlab="",ylab="",zlab="",box=FALSE)
     title("local quadratic AWS (small hmax)",line=-3)
     rm(x,fbi,z0,ex1p0,ex1p1,ex1p2)

#######################################################
#
#                  three-variate examples
#
#######################################################
#
#  Local constant image from Polzehl and Spokoiny (2000)
#
     xy <- rbind(rep(0:30,31),rep(0:30,rep(31,31)))
     w3 <- array(0,c(31,31,31))
     w3[4:28,4:28,4:28] <- 1
     dim(w3) <- c(961,31)
     w3[((xy[1,]-15)^2+(xy[2,]-15)^2)<=144,16] <- 0
     for(i in 1:12) {
        r2 <- 144-i*i
        w3[((xy[1,]-15)^2+(xy[2,]-15)^2)<=r2,16+c(-i,i)] <- 0
     }
     dim(w3) <- c(31,31,31)
     w3[10:22,10:22,10:22] <- 1
     dim(w3) <- c(961,31)
     w3[((xy[1,]-15)^2+(xy[2,]-15)^2)<=36,16] <- 0
     for(i in 1:6) {
        r2 <- 36-i*i
        w3[((xy[1,]-15)^2+(xy[2,]-15)^2)<=r2,16+c(-i,i)] <- 0
     }
     dim(w3) <- c(31,31,31)
     rm(xy)
     sigma <- .5
     set.seed(1)
     y <- w3+rnorm(w3,0,sigma)
#   increase hmax for reasonable results (hmax >=5)
     tmp <- aws(y,hmax=2)
     par(mfrow=c(1,3))
     for(i in 14:18){
     image(y[,,i],col=gray((0:255)/255))
     image(tmp$theta[,,i],zlim=range(y),col=gray((0:255)/255))
     image(w3[,,i],zlim=range(y),col=gray((0:255)/255))
#     readline()
     }
     rm(w3,y,tmp,sigma)

Run the code above in your browser using DataLab