npscoef: Smooth Coefficient Kernel Regression

Description

npscoef computes a kernel regression estimate of a one (1) dimensional dependent variable on $p$-variate explanatory data, using the model $Y_i = W_{i}^{\prime} \gamma (Z_i) + u_i$ where $W_i'=(1,X_i')$, given a set of evaluation points, training points (consisting of explanatory data and dependent data), and a bandwidth specification. A bandwidth specification can be a scbandwidth object, or a bandwidth vector, bandwidth type and kernel type.
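For intuition, a standard local-constant form of a smooth coefficient estimator is $\hat\gamma(z) = [\sum_j W_j W_j^{\prime} K((Z_j - z)/h)]^{-1} \sum_j W_j Y_j K((Z_j - z)/h)$. The base R sketch below evaluates it at a single point. It assumes a scalar $Z$, a Gaussian kernel, and a fixed bandwidth h chosen by hand; the helper scoef_at is hypothetical and not part of np, and the sketch is not necessarily how npscoef computes its estimates internally.

```r
# Hypothetical helper (not part of np): local-constant smooth coefficient
# estimate at a single point z0, for scalar z, a Gaussian kernel, and a
# hand-picked fixed bandwidth h.
scoef_at <- function(z0, x, y, z, h) {
  W <- cbind(1, x)              # rows are W_i' = (1, X_i')
  k <- dnorm((z - z0) / h)      # kernel weights K((Z_i - z0) / h)
  A <- crossprod(W, W * k)      # sum_i K_i W_i W_i'
  b <- crossprod(W, y * k)      # sum_i K_i W_i Y_i
  drop(solve(A, b))             # gamma-hat(z0): intercept, then slope(s)
}

# Same data-generating process as in the Examples section
set.seed(42)
n <- 250
x <- runif(n)
z <- runif(n, min = -2, max = 2)
y <- x * exp(z) * (1.0 + rnorm(n, sd = 0.2))

gamma_hat <- scoef_at(z0 = 0, x = x, y = y, z = z, h = 0.3)

# Fitted value W' gamma-hat(z0) at x = 0.5
fit_at_x <- sum(c(1, 0.5) * gamma_hat)
```

In practice the bandwidth would come from npscoefbw rather than being fixed by hand, and npscoef would evaluate the estimate at every evaluation point.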

Usage

npscoef(bws, ...)
## S3 method for class 'formula':
npscoef(bws, data = NULL, newdata = NULL, ...)
## S3 method for class 'call':
npscoef(bws, ...)
## S3 method for class 'default':
npscoef(bws, txdat, tydat, tzdat, ...)
## S3 method for class 'scbandwidth':
npscoef(bws,
        txdat = stop("training data 'txdat' missing"),
        tydat = stop("training data 'tydat' missing"),
        tzdat = NULL,
        exdat,
        eydat,
        ezdat,
        residuals = FALSE,
        errors = TRUE,
        iterate = TRUE,
        maxiter = 100,
        tol = .Machine$double.eps,
        leave.one.out = FALSE,
        betas = FALSE,
        ...)

Arguments

bws

a bandwidth specification. This can be set as a scbandwidth object returned from an invocation of npscoefbw, or as a vector of bandwidths, with each element $i$ corresponding to the

...

additional arguments supplied to specify the regression type, bandwidth type, kernel types, selection methods, and so on. To do this, you may specify any of bwscaling, bwtype, ckertype, ckerorder

data

an optional data frame, list or environment (or object coercible to a data frame by as.data.frame) containing the variables in the model. If not found in data, the variables are taken fro

newdata

An optional data frame in which to look for evaluation data. If omitted, the training data are used.

txdat

a $p$-variate data frame of explanatory data (training data), which, by default, populates the columns $2$ through $p+1$ of $W$ in the model equation, and in the absence of zdat, will also correspond to $Z$ from the model eq

tydat

a one (1) dimensional numeric or integer vector of dependent data, each element $i$ corresponding to each observation (row) $i$ of txdat. Defaults to the training data used to compute the bandwidth object.

tzdat

an optionally specified $q$-variate data frame of explanatory data (training data), which corresponds to $Z$ in the model equation. Defaults to the training data used to compute the bandwidth object.

exdat

a $p$-variate data frame of points on which the regression will be estimated (evaluation data). By default, evaluation takes place on the data provided by txdat.

eydat

a one (1) dimensional numeric or integer vector of the true values of the dependent variable. Optional, and used only to calculate the true errors.

ezdat

an optionally specified $q$-variate data frame of points on which the regression will be estimated (evaluation data), which corresponds to $Z$ in the model equation. Defaults to be the same as txdat.

errors

a logical value indicating whether or not asymptotic standard errors should be computed and returned in the resulting smoothcoefficient object. Defaults to TRUE.

residuals

a logical value indicating that you want residuals computed and returned in the resulting smoothcoefficient object. Defaults to FALSE.

iterate

a logical value indicating whether or not backfitted estimates should be iterated for self-consistency. Defaults to TRUE.

maxiter

integer specifying the maximum number of iterations allowed when attempting to make the backfitted estimates converge to the desired tolerance. Defaults to 100.

tol

desired tolerance on the relative convergence of backfit estimates. Defaults to .Machine$double.eps.

leave.one.out

a logical value specifying whether or not to compute the leave-one-out estimates. This will not work if any of exdat, eydat, or ezdat is specified. Defaults to FALSE.

betas

a logical value indicating whether or not estimates of the components of $\gamma$ should be returned in the smoothcoefficient object along with the regression estimates. Defaults to FALSE.

Value

npscoef returns a smoothcoefficient object. The generic functions fitted, residuals, coef, se, and predict extract (or generate) estimated values, residuals, coefficients, bootstrapped standard errors on estimates, and predictions, respectively, from the returned object. Furthermore, the functions summary and plot support objects of this type. The returned object has the following components:

eval

evaluation points

mean

estimation of the regression function (conditional mean) at the evaluation points

merr

if errors = TRUE, standard errors of the regression estimates

beta

if betas = TRUE, estimates of the coefficients $\gamma$ at the evaluation points

resid

if residuals = TRUE, in-sample or out-of-sample residuals where appropriate (or possible)

R2

coefficient of determination

MSE

mean squared error

MAE

mean absolute error

MAPE

mean absolute percentage error

CORR

absolute value of Pearson's correlation coefficient

SIGN

fraction of observations where fitted and observed values agree in sign
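As a rough guide to how several of the summary components above relate to observed and fitted values, the base R sketch below computes them from their conventional textbook definitions. The helper fit_measures is hypothetical and not part of np, and the exact conventions used by npscoef (for example, the scaling of MAPE) are assumed here rather than confirmed.

```r
# Hypothetical helper (not part of np): conventional definitions of the
# summary measures, given observed values y and fitted values yhat.
fit_measures <- function(y, yhat) {
  err <- y - yhat
  c(MSE  = mean(err^2),                    # mean squared error
    MAE  = mean(abs(err)),                 # mean absolute error
    MAPE = mean(abs(err / y)),             # mean absolute percentage error (as a fraction)
    CORR = abs(cor(y, yhat)),              # |Pearson correlation|
    SIGN = mean(sign(y) == sign(yhat)))    # share of sign agreements
}

# Small made-up illustration
y    <- c(1, -2, 3, 4)
yhat <- c(1.5, -1, 2, -4)
m <- fit_measures(y, yhat)
```

With a fitted model, the same function could be applied to the dependent data and fitted(model).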

Usage Issues

If you are using data of mixed types, then it is advisable to use the data.frame function to construct your input data and not cbind, since cbind will typically not work as intended on mixed data types and will coerce the data to the same type.

Support for backfitted bandwidths is experimental and is limited in functionality. The code does not support asymptotic standard errors or out of sample estimates with backfitting.

References

Aitchison, J. and C.G.G. Aitken (1976), Multivariate binary discrimination by the kernel method, Biometrika, 63, 413-420.

Cai, Z. (2007), Trending time-varying coefficient time series models with serially correlated errors, Journal of Econometrics, 136, 163-188.

Hastie, T. and R. Tibshirani (1993), Varying-coefficient models, Journal of the Royal Statistical Society, B 55, 757-796.

Li, Q. and J.S. Racine (2007), Nonparametric Econometrics: Theory and Practice, Princeton University Press.

Li, Q. and J.S. Racine (2010), Smooth varying-coefficient estimation and inference for qualitative and quantitative data, Econometric Theory, 26, 1-31.

Pagan, A. and A. Ullah (1999), Nonparametric Econometrics, Cambridge University Press.

Racine, J.S. and D. Ouyang and Q. Li (2010), Semiparametric Hierarchical Models, manuscript.

Wang, M.C. and J. van Ryzin (1981), A class of smooth estimators for discrete distributions, Biometrika, 68, 301-309.

Examples

# EXAMPLE 1 (INTERFACE=FORMULA): 

library(np)  # provides npscoefbw and npscoef

n <- 250
x <- runif(n)
z <- runif(n, min=-2, max=2)
y <- x*exp(z)*(1.0+rnorm(n,sd = 0.2))
bw <- npscoefbw(y~x|z)
model <- npscoef(bw)
plot(model)

# EXAMPLE 1 (INTERFACE=DATA FRAME): 

n <- 250
x <- runif(n)
z <- runif(n, min=-2, max=2)
y <- x*exp(z)*(1.0+rnorm(n,sd = 0.2))
bw <- npscoefbw(xdat=x, ydat=y, zdat=z)
model <- npscoef(bw)
plot(model)