locpoly: Local polynomial fit.

Description

This function performs a local polynomial fit of up to order 3 to bivariate data. It returns estimated values of the regression function as well as estimated partial derivatives up to order 3.

Usage

locpoly(x, y, z, xo = seq(min(x), max(x), length = nx), yo = seq(min(y),
 max(y), length = ny), nx = 40, ny = 40, input = "points", output = "grid",
 h = 0, kernel = "uniform", solver = "QR", degree = 3, pd = "")

Arguments

vector of \(x\)-coordinates of data points.

Missing values are not accepted.

vector of \(y\)-coordinates of data points.

Missing values are not accepted.

vector of \(z\)-values at data points.

Missing values are not accepted.

x, y, and z must be the same length

If output="grid" (default): sequence of \(x\) locations for rectangular output grid, defaults to nx points between min(x) and max(x).

If output="points": vector of \(x\) locations for output points.

If output="grid" (default): sequence of \(y\) locations for rectangular output grid, defaults to ny points between min(y) and max(y).

If output="points": vector of \(y\) locations for output points. In this case it has to be same length as xo.

input

text, possible values are "grid" (not yet implemented) and "points" (default).

This is used to distinguish between regular and irregular gridded data.

output

text, possible values are "grid" (=default) and "points".

If "grid" is choosen then xo and yo are interpreted as vectors spanning a rectangular grid of points \((xo[i],yo[j])\), \(i=1,...,nx\), \(j=1,...,ny\). This default behaviour matches how akima::interp works.

In the case of "points" xo and yo have to be of same lenght and are taken as possibly irregular spaced output points \((xo[i],yo[i])\), \(i=1,...,no\) with no=length(xo). nx and ny are ignored in this case.

dimension of output grid in x direction

dimension of output grid in y direction

bandwidth parameter, between 0 and 1. If a scalar is given it is interpreted as ratio applied to the dataset size to determine a local search neighbourhood, if set to 0 a minimum useful search neighbourhood is choosen (e.g. 10 points for a cubic trend function to determine all 10 parameters).

If a vector of length 2 is given both components are interpreted as ratio of the \(x\)- and \(y\)-range and taken as global bandwidth.

kernel

Text value, implemented kernels are uniform (default), epanechnikov and gaussian.

solver

Text value, determines used solver in fastLM algorithm used by this code

Possible values are LLt, QR (default), SVD, Eigen and CPivQR (compare fastLm).

degree

Integer value, degree of polynomial trend, maximum allowed value is 3.

Text value, determines which partial derivative should be returned, possible values are "" (default, the polynomial itself), "x", "y", "xx", "xy", "yy", "xxx", "xxy", "xyy", "yyy" or "all".

Value

If pd="all":

\(x\) coordinates

\(y\) coordinates

estimates of \(z\)

estimates of \(dz/dx\)

estimates of \(dz/dy\)

zxx

estimates of \(d^2z/dx^2\)

zxy

estimates of \(d^2z/dxdy\)

zyy

estimates of \(d^2z/dy^2\)

zxxx

estimates of \(d^3z/dx^3\)

zxxy

estimates of \(d^3z/dx^2dy\)

zxyy

estimates of \(d^3z/dxdy^2\)

zyyy

estimates of \(d^3z/dy^3\)

If pd!="all" only the elements x, y and the desired derivative will be returned, e.g. zxy for pd="xy".

References

Douglas Bates, Dirk Eddelbuettel (2013). Fast and Elegant Numerical Linear Algebra Using the RcppEigen Package. Journal of Statistical Software, 52(5), 1-24. URL http://www.jstatsoft.org/v52/i05/.

Examples

Run this code

# NOT RUN {
## choose a kernel
knl <- "gaussian"

## choose global and local bandwidth 
bwg <- 0.25 # *100% of x- y-range
bwl <- 0.1  # *100% of data set

## a bivariate polynomial of degree 5:
f <- function(x,y) 0.1+ 0.2*x-0.3*y+0.1*x*y+0.3*x^2*y-0.5*y^2*x+y^3*x^2+0.1*y^5

## degree of model
dg=3 

## part 1:
## regular gridded data:
ng<- 21 # x/y size of a square data grid

## build and fill the grid with the theoretical values:

xg<-seq(0,1,length=ng)
yg<-seq(0,1,length=ng)

# xg and yg as matrix matching fg
nx <- length(xg)
ny <- length(yg)
xx <- t(matrix(rep(xg,ny),nx,ny))
yy <- matrix(rep(yg,nx),ny,nx)

fg   <- outer(xg,yg,f)

## local polynomial estimate
## global bw:
ttg <- system.time(pdg <- locpoly(xg,yg,fg,
  input="grid", pd="all", h=c(bwg,bwg), solver="QR", degree=dg, kernel=knl))
## time used:
ttg

## local bw:
ttl <- system.time(pdl <- locpoly(xg,yg,fg,
  input="grid", pd="all", h=bwl, solver="QR", degree=dg, kernel=knl))
## time used:
ttl

image(pdg$x,pdg$y,pdg$z)
contour(pdl$x,pdl$y,pdl$zx,add=TRUE,lty="dotted")
contour(pdl$x,pdl$y,pdl$zy,add=TRUE,lty="dashed")
points(xx,yy,pch=".")


## part 2:
## irregular data,
## results will not be as good as with the regular 21*21=231 points.

nd<- 41 # size of data set

## random irregular data
oldseed <- set.seed(42)
x<-runif(ng)
y<-runif(ng)
set.seed(oldseed)

z <- f(x,y)

## global bw:
ttg <- system.time(pdg <- interp::locpoly(x,y,z, xg,yg, pd="all",
  h=c(bwg,bwg), solver="QR", degree=dg,kernel=knl))

ttg

## local bw:
ttl <- system.time(pdl <- interp::locpoly(x,y,z, xg,yg, pd="all",
  h=bwl, solver="QR", degree=dg,kernel=knl))

ttl

image(pdg$x,pdg$y,pdg$z)
contour(pdl$x,pdl$y,pdl$zx,add=TRUE,lty="dotted")
contour(pdl$x,pdl$y,pdl$zy,add=TRUE,lty="dashed")
points(x,y,pch=".")

# }

Run the code above in your browser using DataLab

Description

Usage

Arguments

Value

References

See Also

Examples