weightedLowess: Lowess fit with weighting

Description

Fit robust lowess curves of degree 1 to weighted covariates and responses.

Usage

weightedLowess(x, y, weights = rep(1, length(y)), delta=NULL, npts = 200, span = 0.3, iterations = 4)

Arguments

x

a numeric vector of covariates

y

a numeric vector of response values

weights

a numeric vector containing frequency weights for each covariate

delta

a numeric scalar specifying the maximum distance between adjacent points

npts

an integer scalar specifying the approximate number of points to use when computing delta

span

a numeric scalar specifying the width of the smoothing window as a proportion of the total weight

iterations

an integer scalar specifying the number of robustifying iterations

Value

A list of numeric vectors for the fitted responses, the residuals, the robustifying weights and the chosen delta.

Details

This function extends the lowess algorithm to handle non-negative prior weights. These weights are used during span calculations such that the span distance for each point must include the specified proportion of all weights. They are also applied during weighted linear regression to compute the fitted value (in addition to the tricube weights determined by span). For integer weights, the prior weights are equivalent to using rep(..., w) on x and y prior to fitting.
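
The role of the prior weights can be illustrated with a minimal sketch of a single local fit. The helper local_fit below is hypothetical and is not the package's implementation: it assumes the bandwidth is the smallest distance at which the cumulative prior weight reaches span times the total weight, and it multiplies the tricube weights by the prior weights before a straight-line weighted regression. Under those assumptions, integer weights should give the same fitted value as expanding the data with rep().

```r
# Hypothetical helper, not part of the package: local linear fit at one point x0.
local_fit <- function(x0, x, y, w, span = 0.3) {
  d <- abs(x - x0)
  o <- order(d)
  # Bandwidth: smallest distance at which the cumulative prior weight
  # reaches the requested proportion of the total weight.
  k <- which(cumsum(w[o]) >= span * sum(w))[1]
  h <- d[o][k]
  # Tricube kernel weights multiplied by the prior weights.
  u <- pmin(d / h, 1)
  kw <- w * (1 - u^3)^3
  # Closed-form weighted least squares for a straight line.
  xm <- sum(kw * x) / sum(kw)
  ym <- sum(kw * y) / sum(kw)
  slope <- sum(kw * (x - xm) * (y - ym)) / sum(kw * (x - xm)^2)
  ym + slope * (x0 - xm)
}

set.seed(1)
x <- runif(20)
y <- rnorm(20)
w <- sample(1:3, 20, replace = TRUE)

# Fit with integer prior weights, then with the data expanded by rep().
weighted <- local_fit(0.5, x, y, w, span = 0.5)
expanded <- local_fit(0.5, rep(x, w), rep(y, w), rep(1, sum(w)), span = 0.5)
all.equal(weighted, expanded)
```

The two results should agree up to floating-point error, because each replicated observation contributes the same kernel weight as its original.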

For large vectors, running time is reduced by performing locally weighted regression only at a subset of chosen points. Fitted values for all points adjacent to the chosen points are computed by linear interpolation between the chosen points. For this purpose, the first and last points are always chosen. Note that the regression itself uses all (neighbouring) points.

Points are defined as adjacent to a chosen point if the distance to the latter is positive and less than delta. The first chosen point is that corresponding to the smallest covariate; the next chosen point is then the next non-adjacent point, and so on. By default, the smallest delta is chosen to obtain a number of chosen points approximately equal to the specified npts. Increasing npts or supplying a small delta will improve the accuracy of the fit (i.e. closer to the full lowess procedure) at the cost of running time.
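
The selection and interpolation steps above can be sketched as follows. The helper choose_points is hypothetical and assumes a sorted covariate vector with no ties; the values at the chosen points are a stand-in for the local regression results, which are not computed here.

```r
# Hypothetical helper: indices of the chosen points for sorted, tie-free x.
choose_points <- function(x_sorted, delta) {
  n <- length(x_sorted)
  chosen <- 1L  # the first point is always chosen
  for (i in seq_len(n)[-1]) {
    # A point at distance >= delta from the last chosen point is not
    # adjacent to it, so it becomes the next chosen point.
    if (x_sorted[i] - x_sorted[chosen[length(chosen)]] >= delta) {
      chosen <- c(chosen, i)
    }
  }
  # The last point is always chosen as well.
  if (chosen[length(chosen)] != n) chosen <- c(chosen, n)
  chosen
}

x <- seq(0, 1, by = 0.1)
idx <- choose_points(x, delta = 0.25)

# Stand-in for the fitted values at the chosen points.
fit_chosen <- 2 * x[idx] + 1

# Fitted values for all other points by linear interpolation.
fitted_all <- approx(x[idx], fit_chosen, xout = x)$y
```

A smaller delta selects more points, so fewer fitted values rely on interpolation and more on an actual local regression.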

Robustification is performed using the magnitude of the residuals. Residuals with a magnitude greater than 6 times the median absolute residual are assigned weights of zero. Otherwise, Tukey's biweight function is applied. The resulting weights are then used for weighted linear regression in the next iteration. Greater values of iterations will provide greater robustness.
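
As a minimal sketch of this step, the hypothetical helper below computes robustness weights from a vector of residuals, following the biweight form in Cleveland (1979). It uses an unweighted median of the absolute residuals and does not handle the degenerate case where that median is zero, both of which are simplifying assumptions rather than a description of the package's code.

```r
# Hypothetical helper: Tukey biweight robustness weights from residuals.
robust_weights <- function(res) {
  cutoff <- 6 * median(abs(res))  # assumes the median is non-zero
  u <- abs(res) / cutoff
  # Biweight inside the cutoff, zero weight at or beyond it.
  ifelse(u < 1, (1 - u^2)^2, 0)
}

res <- c(-1, 0, 1, 2, 100)
rw <- robust_weights(res)
```

Here the median absolute residual is 1, so the residual of 100 lies far beyond the cutoff of 6 and receives zero weight, while the zero residual keeps full weight.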

References

Cleveland, W.S. (1979). Robust Locally Weighted Regression and Smoothing Scatterplots. Journal of the American Statistical Association 74, 829-836.

Examples

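# Simulate heavy-tailed responses, covariates and prior weights
y <- rt(100,df=4)
x <- runif(100)
w <- runif(100)
# Fit with a wide smoothing window
out <- weightedLowess(x, y, w, span=0.7)
# Point size reflects the prior weight
plot(x,y,cex=w)
# Draw the fitted curve in order of increasing x
o <- order(x)
lines(x[o],out$fitted[o],col="red")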
