locfit.raw: Local Regression, Likelihood and Density Estimation.

Description

locfit.raw is an interface to Locfit using numeric vectors (for a model-formula based interface, use locfit). Although this function has a large number of arguments, most users are likely to need only a small subset.

The first set of arguments (x, y, weights, cens, and base) specify the regression variables and associated quantities.

Another set (scale, alpha, deg, kern, kt, acri and basis) controls the amount of smoothing: bandwidth, smoothing weights and the local model. Most of these arguments are deprecated: they still work for now, but should be supplied through the lp() model term instead.

deriv and dc relate to derivative (or local slope) estimation.

family and link specify the likelihood family.

xlim and renorm may be used in density estimation.

ev specifies the evaluation structure or set of evaluation points.

maxk, itype, mint, maxit and debug control the Locfit algorithms, and are rarely needed.

geth and sty are used by other functions calling locfit.raw, and should not be used directly.

Usage

locfit.raw(x, y, weights=1, cens=0, base=0,
  scale=FALSE, alpha=0.7, deg=2, kern="tricube", kt="sph",
    acri="none", basis=list(NULL),
  deriv=numeric(0), dc=FALSE,
  family, link="default",
  xlim, renorm=FALSE,
  ev=rbox(),
  maxk=100, itype="default", mint=20, maxit=20, debug=0,
  geth=FALSE, sty="none")
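
As a minimal sketch of the two calling styles described above, the following fits a local regression to simulated data, first with the defaults and then with the smoothing parameters passed through lp(). The data, the seed and the chosen values of nn and deg are assumptions made for illustration only.

```r
library(locfit)

# Simulated data (hypothetical, for illustration only)
set.seed(1)
x <- seq(0, 2 * pi, length.out = 200)
y <- sin(x) + rnorm(200, sd = 0.3)

# Fit with the default smoothing arguments shown in Usage
fit_default <- locfit.raw(x, y)

# Same data, with smoothing controlled through lp() rather than
# the deprecated alpha and deg arguments
fit_lp <- locfit.raw(lp(x, nn = 0.5, deg = 1), y)

# Evaluate both fits at a few new points
newx <- c(1, 2, 3)
pred_default <- predict(fit_default, newx)
pred_lp <- predict(fit_lp, newx)
```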

Arguments

x

Vector (or matrix) of the independent variable(s). Can be constructed using the lp() function.

y

Response variable for regression models. For density families, y can be omitted.

weights

Prior weights for observations (reciprocal of variance, or sample size).

cens

Censoring indicators for hazard rate or censored regression. The coding is 1 (or TRUE) for a censored observation, and 0 (or FALSE) for uncensored observations.

base

Baseline parameter estimate. If provided, the local regression model is fitted as \(Y_i = b_i + m(x_i) + \epsilon_i\), with Locfit estimating the \(m(x)\) term. For regression models, this effectively subtracts \(b_i\) from \(Y_i\). The advantage of the base formulation is that it extends to likelihood regression models.

scale

Deprecated - see lp().

alpha

Deprecated - see lp(). A single number (e.g. alpha=0.7) is interpreted as a nearest neighbor fraction. With two components (e.g. alpha=c(0.7,1.2)), the first component is a nearest neighbor fraction, and the second component is a fixed component. A third component is the penalty term in locally adaptive smoothing.
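
The three components of alpha appear to correspond to the nn, h and adpen arguments of lp(). The sketch below shows what the lp() form of alpha=c(0.7,1.2) might look like; the simulated data are an assumption for illustration.

```r
library(locfit)

# Simulated data (hypothetical, for illustration only)
set.seed(2)
x <- runif(200, 0, 10)
y <- cos(x) + rnorm(200, sd = 0.2)

# Deprecated style: locfit.raw(x, y, alpha = c(0.7, 1.2))
# lp() style: nn is the nearest neighbor fraction, h the fixed component
fit <- locfit.raw(lp(x, nn = 0.7, h = 1.2), y)
```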

deg

Degree of local polynomial. Deprecated - see lp().

kern

Weight function, default = "tcub". Other choices are "rect", "trwt", "tria", "epan", "bisq" and "gauss". Choices may be restricted when derivatives are required; e.g. for confidence bands and some bandwidth selectors.

kt

Kernel type: "sph" (default) or "prod". In multivariate problems, "prod" uses a simplified product model which speeds up computations.

acri

Deprecated - see lp().

basis

User-specified basis functions.

deriv

Derivative estimation. If deriv=1, the returned fit estimates the derivative (or, more correctly, the local slope). If deriv=c(1,1), the second order derivative is estimated. deriv=2 is for the partial derivative, with respect to the second variable, in multivariate settings.
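
A short sketch of local slope estimation with deriv=1. The data are simulated from a sine curve (an assumption for illustration), so the estimates can be compared by eye with the true derivative cos(x).

```r
library(locfit)

# Simulated data (hypothetical, for illustration only)
set.seed(3)
x <- seq(0, 2 * pi, length.out = 300)
y <- sin(x) + rnorm(300, sd = 0.1)

# Fit whose evaluations are local slope estimates
fit_slope <- locfit.raw(x, y, deriv = 1)

# Compare the estimated slope with the true derivative of sin(x)
newx <- c(1, 2, 3)
slope <- predict(fit_slope, newx)
cbind(estimate = slope, truth = cos(newx))
```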

dc

Derivative adjustment.

family

Local likelihood family: "gaussian", "binomial", "poisson", "gamma" or "geom". Density and rate estimation families are "dens", "rate" and "hazard" (hazard rate). If the family is preceded by a 'q' (for example, family="qbinomial"), quasi-likelihood variance estimates are used. Otherwise, the residual variance (rv) is fixed at 1. The default family is "qgauss" if a response y is provided; "density" if no response is provided.

link

Link function for local likelihood fitting. Depending on the family, choices may be "ident", "log", "logit", "inverse", "sqrt" and "arcsin".
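
A minimal sketch of a local likelihood fit with an explicit family and link: local logistic regression on a simulated 0/1 response. The data and seed are assumptions for illustration.

```r
library(locfit)

# Simulated binary response (hypothetical, for illustration only)
set.seed(4)
x <- runif(300, -3, 3)
y <- rbinom(300, size = 1, prob = plogis(1.5 * x))

# Local logistic regression: binomial family with the logit link
fit_bin <- locfit.raw(x, y, family = "binomial", link = "logit")

# Evaluate the fit at a few new points
pred_bin <- predict(fit_bin, c(-2, 0, 2))
```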

xlim

For density estimation, Locfit allows the density to be supported on a bounded interval (or rectangle, in more than one dimension). The format should be c(ll,ul) where ll is a vector of the lower bounds and ul the upper bounds. Bounds such as \([0,\infty)\) are not supported, but can be effectively implemented by specifying a very large upper bound.
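
A sketch of density estimation on \([0,\infty)\) using the large upper bound workaround. The response is omitted, so the density default applies. The exponential data and the bound of 100 (chosen only to lie well beyond the simulated data) are assumptions for illustration.

```r
library(locfit)

# Simulated positive data (hypothetical, for illustration only)
set.seed(5)
x <- rexp(200)

# Density supported on [0, 100]; the upper bound stands in for infinity
fit_dens <- locfit.raw(x, xlim = c(0, 100))
```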

renorm

Local likelihood density estimates may not integrate exactly to 1. If renorm=TRUE, the integral will be estimated numerically and the estimate rescaled. Presently this is implemented only in one dimension.

ev

The evaluation structure: rbox() for tree structures, lfgrid() for grids, dat() for data points, none() for none. A vector or matrix of evaluation points can also be provided, although in this case you may prefer to use the smooth.lf() interface to Locfit. Note that arguments flim, mg and cut are now given as arguments to the evaluation structure function, rather than to locfit.raw() directly (change effective 12/2001).
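
The following sketch fits the same simulated data with three of the evaluation structures listed above. The data and the grid size mg=25 are assumptions for illustration.

```r
library(locfit)

# Simulated data (hypothetical, for illustration only)
set.seed(6)
x <- runif(150)
y <- x^2 + rnorm(150, sd = 0.1)

# Default: adaptive tree structure
fit_tree <- locfit.raw(x, y, ev = rbox())

# Regular grid; mg is passed to lfgrid(), not to locfit.raw()
fit_grid <- locfit.raw(x, y, ev = lfgrid(mg = 25))

# Direct evaluation at the data points
fit_data <- locfit.raw(x, y, ev = dat())
```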

maxk

Controls space assignment for evaluation structures. For the adaptive evaluation structures, it is impossible to be sure in advance how many vertices will be generated. If you get warnings about 'Insufficient vertex space', Locfit's default assignment can be increased by increasing maxk. The default is maxk=100.

itype

Integration type for density estimation. Available methods include "prod", "mult" and "mlin"; and "haz" for hazard rate estimation problems. The available integration methods depend on model specification (e.g. dimension, degree of fit). By default, the best available method is used.

mint

Points for numerical integration rules. Default 20.

maxit

Maximum iterations for local likelihood estimation. Default 20.

debug

If greater than 0, prints some debugging information.

geth

Don't use!

sty

Deprecated - see lp().

Value

An object with class "locfit". A standard set of methods for printing, plotting, etc. these objects is provided.

References

Consult the Web page http://www.locfit.info/.