
rgam is used to obtain an outlier-robust fit for generalized
additive models. It uses the backfitting algorithm with weights
derived from robust quasi-likelihood equations. Currently, only
local regression smoothers are supported. Bandwidth selection using
robust and non-robust cross-validation criteria is currently only
implemented for models with a single covariate.
rgam(x, y, family = c("poisson", "binomial"), ni = NULL, epsilon = 1e-08,
max.it = 50, k = 1.5, trace = FALSE, cv.method = c("rcv", "cv", "dcv",
"rdcv"), alpha = seq(0.1, 0.9, by = 0.1), s.i = NULL)
x: a vector or matrix of covariates.
y: a vector of responses.
family: a character string indicating the assumed distribution of the response (conditional on the covariates). Only ‘poisson’ and ‘binomial’ are implemented. The link function is currently the canonical link for the selected family (log for ‘poisson’ and logit for ‘binomial’).
ni: a vector of the same length as y containing the number of trials of the binomial distribution for each entry of y. Only relevant if the argument family equals ‘binomial’.
epsilon: tolerance for the convergence of the robust local scoring algorithm.
max.it: maximum number of robust local scoring iterations.
k: tuning constant for the robust quasi-likelihood score equations. Larger values of k make the estimators closer to the classical fit (and hence less robust), while smaller values of k produce a more robust fit. Values between 1.5 and 3 generally result in a fit with good robustness properties.
trace: logical flag to turn on debugging output.
cv.method: character string indicating which cross-validation criterion is to be minimized to select the bandwidth from the candidates given in the argument alpha. Accepted values are ‘rcv’ (for a weighted squared loss where the effect of outliers is reduced); ‘cv’ (for the “classical” squared loss); ‘dcv’ (for the classical deviance loss); ‘rdcv’ (for a robustly weighted deviance loss). See the references for more details.
alpha: a scalar between 0 and 1 (for models with a single covariate it can be a vector of such numbers). If length(alpha)==1, its value is used as the bandwidth for the local regression smoother, as described in loess. If alpha is a vector, then the value that minimizes the cross-validation criterion specified in the argument cv.method is used.
s.i: optional matrix of initial values for the additive predictors (including the intercept). If missing, the predictors are initialized at zero and the intercept is taken to be the transformed sample mean of the responses.
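To make the role of the tuning constant k more concrete, here is a minimal sketch in base R. It assumes a Huber-type psi function applied to standardized residuals, which is a common choice in robust quasi-likelihood estimation; the helper names huber_psi and huber_weight are illustrative and are not part of the rgam package, whose internal weights may be computed differently.

```r
# Huber's psi: the identity on [-k, k], clipped to +/- k outside.
# (Illustrative helper, not an rgam function.)
huber_psi <- function(r, k = 1.5) pmax(-k, pmin(k, r))

# Implied observation weights psi(r) / r: 1 for small residuals,
# k / |r| for residuals larger than k in absolute value.
huber_weight <- function(r, k = 1.5) ifelse(abs(r) <= k, 1, k / abs(r))

r <- c(-6, -2, -0.5, 0, 1, 3, 10)   # hypothetical standardized residuals

huber_weight(r, k = 1.5)  # 0.25 0.75 1 1 1 0.5 0.15
huber_weight(r, k = 3)    # 0.5  1    1 1 1 1   0.3
```

With the larger k fewer residuals are downweighted, so the fit moves closer to the classical one; with the smaller k large residuals lose influence more quickly.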
Returns an object of class rgam. It contains the following components:
the additive fit, the sum of the columns of the $smooth component
the fitted mean values, obtained by transforming the component 'additive.predictors' using the inverse link function
the matrix of smooth terms, columns correspond to the smooth predictors in the model
number of robust local scoring iterations used
last relative change of the additive predictors
a logical value indicating whether the algorithm stopped due to the relative change of consecutive additive predictors being less than the tolerance specified in the epsilon argument (TRUE) or because the maximum number of iterations (in the argument max.it) was reached (FALSE)
the candidate bandwidth values that were considered
a character string indicating the cross-validation method used to choose the bandwidth of the smoother
a vector of the cross-validation criteria values obtained with each entry of the argument alpha
the value in the argument alpha that produced the smallest cross-validation criterion. This is the bandwidth used for the reported fit.
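The bandwidth selection described above can be illustrated with a small base-R sketch. It shows only the idea behind the classical ‘cv’ criterion (leave-one-out squared prediction error over a grid of candidate bandwidths) for a plain loess fit on simulated Gaussian data. The function loo_cv and the simulated data are assumptions made for illustration; this is not the code used inside rgam, and the robust criteria reweight the loss in ways not shown here.

```r
set.seed(1)
n <- 80
x <- sort(runif(n))
y <- sin(2 * pi * x) + rnorm(n, sd = 0.3)   # simulated data, illustration only

# Leave-one-out squared-error criterion for one candidate span.
# surface = "direct" lets loess predict at a left-out boundary point.
loo_cv <- function(x, y, span) {
  d <- data.frame(x = x, y = y)
  errs <- vapply(seq_along(x), function(i) {
    fit <- loess(y ~ x, data = d[-i, ], span = span, degree = 1,
                 control = loess.control(surface = "direct"))
    unname(d$y[i] - predict(fit, newdata = d[i, , drop = FALSE]))
  }, numeric(1))
  mean(errs^2)
}

alphas <- seq(0.3, 0.9, by = 0.1)           # candidate bandwidths
cv.values <- vapply(alphas, function(a) loo_cv(x, y, a), numeric(1))
opt.alpha <- alphas[which.min(cv.values)]   # bandwidth with the smallest criterion
```

Each candidate requires one fit per observation, which is why evaluating a long vector of bandwidths can be slow.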
The gam model is fit using the robust local scoring algorithm, which iteratively fits weighted additive models by backfitting. The weights are derived from robust quasi-likelihood estimating equations and thus effectively reduce the potentially damaging effect of outliers.
Currently, this function only implements local regression smoothers (as calculated by loess). The method can be applied to other smoothers as well.
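As a rough illustration of the structure of local scoring with backfitting, the sketch below implements the classical (non-robust) version for a Poisson model with the log link, using loess as the smoother. It is not the rgam implementation: the robust algorithm replaces the working weights and responses with versions derived from the robust quasi-likelihood equations, which are not reproduced here. The function name local_scoring_poisson and the argument n.sweeps are hypothetical.

```r
# Classical local scoring for a Poisson additive model (illustrative sketch).
local_scoring_poisson <- function(X, y, span = 0.5, epsilon = 1e-6,
                                  max.it = 50, n.sweeps = 5) {
  X <- as.matrix(X)
  n <- nrow(X)
  p <- ncol(X)
  smooth <- matrix(0, n, p)        # smooth terms start at zero
  eta <- rep(log(mean(y)), n)      # intercept: transformed sample mean
  converged <- FALSE
  for (it in seq_len(max.it)) {
    mu <- exp(eta)                 # inverse of the log link
    z <- eta + (y - mu) / mu       # working response
    w <- mu                        # working weights
    intercept <- weighted.mean(z, w)
    # Backfitting: smooth the partial residuals against each covariate in turn.
    for (sweep in seq_len(n.sweeps)) {
      for (j in seq_len(p)) {
        d <- data.frame(
          xj = X[, j],
          r = z - intercept - rowSums(smooth[, -j, drop = FALSE]),
          w = w
        )
        s <- fitted(loess(r ~ xj, data = d, weights = w,
                          span = span, degree = 1))
        smooth[, j] <- s - weighted.mean(s, w)   # keep each term centered
      }
    }
    eta.new <- intercept + rowSums(smooth)
    # Relative change of consecutive additive predictors.
    change <- sqrt(sum((eta.new - eta)^2)) / max(sqrt(sum(eta^2)), 1e-10)
    eta <- eta.new
    if (change < epsilon) {
      converged <- TRUE
      break
    }
  }
  list(additive.predictors = eta, fitted.values = exp(eta),
       smooth = smooth, iterations = it, converged = converged)
}

# Usage on simulated data (illustration only).
set.seed(42)
n <- 200
X <- cbind(runif(n), runif(n))
eta.true <- 1 + sin(2 * pi * X[, 1]) + X[, 2]
y <- rpois(n, exp(eta.true))
fit <- local_scoring_poisson(X, y)
```

The outer loop updates the working response and weights, and the inner loops perform the weighted backfitting sweeps; the stopping rule mirrors the roles of the epsilon and max.it arguments.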
Azadeh, A. and Salibian-Barrera, M. (2011). An outlier-robust fit for Generalized Additive Models with applications to disease outbreak detection. To appear in the Journal of the American Statistical Association.
# NOT RUN {
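library(rgam)  # provides rgam() and the ili.visits data set
x <- ili.visits$week
y <- ili.visits$visits
set.seed(123)
# add a small amount of Gaussian noise to the covariate
x <- x + rnorm(length(x), mean=0, sd=.01)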
#
# the following command needs to run over 890 fits
# and takes about 22 mins on an Intel Xeon CPU (3.2GHz)
#
# a <- rgam(x=x, y=y, family='poisson', cv.method='rcv',
# epsilon=1e-5, alpha=12:20/80, max.it=500)
#
# the optimal is found at alpha = 17/80
#
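# fit with a single bandwidth, so no cross-validation search is needed
a <- rgam(x=x, y=y, family='poisson', cv.method='rcv',
          epsilon=1e-7, alpha=17/80, max.it=500)
# fitted mean values on the response scale
pr.rgam.a <- predict(a, type='response')
plot(x, y, xlab='Week', ylab='ILI visits', pch=19, col='grey75')
# overlay the robust fit, with points ordered by x
lines(x[order(x)], pr.rgam.a[order(x)], lwd=3, col='red')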
# }