np.est: Nonparametric estimate of the regression function

Description

This routine computes estimates of $m(newt_j)$ ($j=1,\ldots,J$) from a sample $\{(Y_i, t_i): i=1,\ldots,n\}$, where $$Y_i = m(t_i) + \epsilon_i.$$ The regression function $m$ is smooth but unknown, and the random errors $\{\epsilon_i\}$ are allowed to form a time series (that is, they may be dependent). The estimates are obtained by kernel smoothing.
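To make the idea of kernel smoothing concrete, here is a minimal base R sketch of a Nadaraya-Watson estimate with the Epanechnikov kernel. The helper names `epan` and `nw_sketch` are hypothetical and are not part of the package; the package's own implementation may differ in details such as bandwidth handling.

```r
# Epanechnikov ("quadratic") kernel: 0.75 * (1 - u^2) on [-1, 1], 0 elsewhere.
epan <- function(u) ifelse(abs(u) <= 1, 0.75 * (1 - u^2), 0)

# Hypothetical helper (not part of the package): Nadaraya-Watson estimate
# of m at each point of newt, for a single bandwidth h.
nw_sketch <- function(y, t, newt = t, h) {
  sapply(newt, function(t0) {
    w <- epan((t - t0) / h)   # kernel weights centred at t0
    sum(w * y) / sum(w)       # kernel-weighted average of the responses
  })
}

# Simulated data on a regular design, independent errors
set.seed(1)
n <- 100
t <- ((1:n) - 0.5) / n
y <- 0.25 * t * (1 - t) + rnorm(n, 0, 0.01)

m_hat <- nw_sketch(y, t, h = 0.1)
```

Each estimate is simply a weighted average of the responses whose design points lie within `h` of the evaluation point.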

Usage

np.est(data = data, h.seq = NULL, newt = NULL,
estimator = "NW", kernel = "quadratic")

Arguments

data

data[, 1] contains the values of the response variable, $Y$; data[, 2] contains the values of the explanatory variable, $t$.

h.seq

the bandwidths to be considered. If NULL (the default), a single bandwidth, selected by cross-validation, is used.

newt

values of the explanatory variable at which the estimates are computed. If NULL (the default), the values in data[, 2] are used.

estimator

selects the estimator: NW (Nadaraya-Watson) or LLP (Local Linear Polynomial). The default is NW.

kernel

selects the kernel: gaussian, quadratic (Epanechnikov kernel), triweight or uniform. The default is quadratic.
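For reference, the four kernel names above correspond to well-known kernel functions. The sketch below writes out their textbook forms in base R and checks that each integrates to one. The list names `kernels` and `support` are illustrative, and the scaling used internally by the package is an assumption, not something stated on this page.

```r
# Textbook forms of the four kernels (illustrative, not the package's code)
kernels <- list(
  gaussian  = function(u) dnorm(u),
  quadratic = function(u) ifelse(abs(u) <= 1, 0.75 * (1 - u^2), 0),
  triweight = function(u) ifelse(abs(u) <= 1, (35 / 32) * (1 - u^2)^3, 0),
  uniform   = function(u) ifelse(abs(u) <= 1, 0.5, 0)
)

# Support of each kernel, used as integration limits
support <- list(
  gaussian  = c(-Inf, Inf),
  quadratic = c(-1, 1),
  triweight = c(-1, 1),
  uniform   = c(-1, 1)
)

# Each kernel is a density, so its integral over its support should be 1
areas <- sapply(names(kernels), function(k) {
  integrate(kernels[[k]], support[[k]][1], support[[k]][2])$value
})
```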

Value

YHAT: a length(newt) x length(h.seq) matrix containing the estimates for $m(newt_j)$ ($j=1,...,$length(newt)) using the different bandwidths in h.seq.
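The layout of the returned matrix can be illustrated without the package. The sketch below builds a matrix with one row per evaluation point and one column per bandwidth, using a hypothetical Nadaraya-Watson helper `nw_at`; it only mirrors the shape described above and is not the package's code.

```r
# Epanechnikov kernel and a hypothetical single-point Nadaraya-Watson helper
epan  <- function(u) ifelse(abs(u) <= 1, 0.75 * (1 - u^2), 0)
nw_at <- function(y, t, t0, h) {
  w <- epan((t - t0) / h)
  sum(w * y) / sum(w)
}

set.seed(1)
n <- 100
t <- ((1:n) - 0.5) / n
y <- 0.25 * t * (1 - t) + rnorm(n, 0, 0.01)

newt  <- c(0.1, 0.3, 0.5, 0.7, 0.9)   # 5 evaluation points
h.seq <- c(0.05, 0.1, 0.2)            # 3 bandwidths

# Inner sapply runs over newt (rows), outer sapply over h.seq (columns)
yhat <- sapply(h.seq, function(h) {
  sapply(newt, function(t0) nw_at(y, t, t0, h))
})
dim(yhat)   # length(newt) rows, length(h.seq) columns
```

Column `k` of `yhat` holds the estimates obtained with `h.seq[k]`, so comparing columns shows how the fit changes with the amount of smoothing.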

Details

See Fan and Gijbels (1996) and Francisco-Fernandez and Vilar-Fernandez (2001).
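As a rough illustration of the local linear idea treated in those references, the estimate at a point $t_0$ is the intercept of a kernel-weighted least squares line fitted around $t_0$. The base R sketch below uses the closed form of that intercept; `ll_sketch` is a hypothetical helper, not the package's implementation.

```r
# Epanechnikov kernel
epan <- function(u) ifelse(abs(u) <= 1, 0.75 * (1 - u^2), 0)

# Hypothetical helper: local linear estimate of m at each point of newt
ll_sketch <- function(y, t, newt = t, h) {
  sapply(newt, function(t0) {
    d  <- t - t0                 # design points centred at t0
    w  <- epan(d / h)            # kernel weights
    s1 <- sum(w * d)
    s2 <- sum(w * d^2)
    b  <- w * (s2 - d * s1)      # equivalent weights of the local linear fit
    sum(b * y) / sum(b)          # intercept of the weighted line at t0
  })
}

# A local linear fit reproduces an exactly linear function,
# including at the boundary of the design
n <- 100
t <- ((1:n) - 0.5) / n
y_lin <- 2 + 3 * t
fit <- ll_sketch(y_lin, t, h = 0.1)
max(abs(fit - y_lin))   # numerically zero
```

A Nadaraya-Watson fit, being a local constant average, does not share this property near the ends of the design interval, which is one common reason to prefer the local linear option there.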

References

Fan, J. and Gijbels, I. (1996) Local Polynomial Modelling and its Applications. Chapman and Hall, London.

Francisco-Fernandez, M. and Vilar-Fernandez, J. M. (2001) Local polynomial regression estimation with correlated errors. Comm. Statist. Theory Methods 30, 1271-1293.

Examples


# EXAMPLE 1: REAL DATA
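# Placeholder matrix: column 1 for the response, column 2 for time
data <- matrix(10,120,2)
data(barnacles1)
barnacles1 <- as.matrix(barnacles1)
data[,1] <- barnacles1[,1]
# Difference the series at lag 12
data <- diff(data, 12)
# Use the observation index as the explanatory variable t
data[,2] <- 1:nrow(data)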

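# Select the bandwidth with np.gcv
aux <- np.gcv(data)
h <- aux$h.opt
# Estimate m at the observed values of t with that bandwidth
ajuste <- np.est(data=data, h.seq=h)
plot(data[,2], ajuste, type="l", xlab="t", ylab="m(t)")
plot(data[,1], ajuste, xlab="y", ylab="y.hat", main="y.hat vs y")
abline(0,1)
# Residual mean square relative to the variance of the response
residuos <- data[,1] - ajuste
mean(residuos^2)/var(data[,1])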



# EXAMPLE 2: SIMULATED DATA
## Example 2a: independent data

set.seed(1234)
# We generate the data
n <- 100
t <- ((1:n)-0.5)/n
m <- function(t) {0.25*t*(1-t)}
f <- m(t)

epsilon <- rnorm(n, 0, 0.01)
y <-  f + epsilon
data_ind <- matrix(c(y,t),nrow=n)

# We estimate the regression function m
# (CV bandwidth)
est <- np.est(data_ind)
plot(t, est, type="l", lty=2, ylab="")
points(t, 0.25*t*(1-t), type="l")
legend(x="topleft", legend = c("m", "m hat"), col=c("black", "black"), lty=c(1,2))
