Calculates different types of residuals, influence measures and leverages for a nonlinear heteroscedastic model.
nlreg.diag(fitted, hoa = TRUE, infl = TRUE, trace = FALSE)
fitted: a nlreg object, that is, the result of a call to nlreg.
hoa: logical value indicating whether higher order asymptotics should be used for calculating the regression diagnostics. Default is TRUE.
infl: logical value indicating whether influence measures should be calculated on the basis of a leave-one-out analysis. Default is TRUE.
trace: logical value. If TRUE, details of the iterations are printed. Default is FALSE.
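As a minimal sketch of a call (it mirrors the example given further below; the calcium data set comes from the boot package and the starting values are purely indicative):
library(nlreg)
library(boot)                      # provides the calcium data frame
data(calcium)
calcium.nl <- nlreg(cal ~ b0 * (1 - exp(-b1 * time)), weights = ~ (1 + time^g)^2,
                    data = calcium, start = c(b0 = 4, b1 = 0.1, g = 1), hoa = TRUE)
## full set of diagnostics: higher order residuals and leave-one-out influence measures
calcium.diag <- nlreg.diag(calcium.nl, hoa = TRUE, infl = TRUE)
## cheaper alternative: first order residuals and approximate Cook's distance only
calcium.diag0 <- nlreg.diag(calcium.nl, hoa = FALSE, infl = FALSE)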
Returns an object of class nlreg.diag with the following components:
the fitted values, that is, the mean function evaluated at each data point.
the response (or standardized) residuals from the fit.
the generalized Pearson residuals from the fit.
the approximate studentized residuals from the fit.
the deletion residuals from the fit; only if hoa = TRUE.
the \(r^*\)-type residuals from the fit; only if hoa = TRUE.
the leverages of the observations.
the approximate leverages of the observations.
an approximation to Cook's distance for the regression coefficients.
the global influence of each observation; only for heteroscedastic errors and if infl = TRUE.
the partial influence of each observation on the estimates of the regression coefficients; only for heteroscedastic errors and if infl = TRUE.
the partial influence of each observation on the estimates of the variance parameters; only for heteroscedastic errors and if infl = TRUE.
the number of regression coefficients.
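Which of these components are actually stored in a given object depends on the hoa and infl arguments; a quick way to check (a sketch assuming the calcium.diag object created in the call sketch above) is to inspect the returned list directly:
names(calcium.diag)     # components present in this particular "nlreg.diag" object
str(calcium.diag)       # their dimensions and values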
If trace = TRUE, the number of the observation currently considered in the mean shift outlier model or omitted in the leave-one-out analysis (see the Details section) is printed; only if hoa = TRUE or infl = TRUE.
This function is based on A. J. Canty's function glm.diag contained in the boot library.
The regression diagnostics implemented in the nlreg.diag routine follow two approaches. The first exploits, where possible, the analogy with linear models, that is, it applies the classical definitions of residuals, leverages and Cook's distance after linearizing the nonlinear model through a Taylor series expansion (Carroll and Ruppert, 1988, Section 2.8). The second approach uses the mean shift outlier model (Cook and Weisberg, 1982, Section 2.2.2), where a dummy variable is included for one observation at a time, the model is refitted and the significance of the corresponding coefficient is assessed.
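The following minimal sketch illustrates the idea of the mean shift outlier model on simulated data, using an ordinary homoscedastic nls() fit instead of nlreg (model, data and values are made up for illustration): a dummy variable is added for a single observation and the significance of its coefficient is assessed.
set.seed(1)
x <- 1:20
y <- 5 * (1 - exp(-0.3 * x)) + rnorm(20, sd = 0.2)
i <- 7                                      # observation under scrutiny
d <- as.numeric(seq_along(y) == i)          # dummy variable for observation i
fit.shift <- nls(y ~ b0 * (1 - exp(-b1 * x)) + delta * d,
                 start = c(b0 = 5, b1 = 0.3, delta = 0))
summary(fit.shift)$coefficients["delta", ]  # t value assesses the mean shift for observation i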
The leverages are defined in analogy to the linear case (Brazzale, 2000, Appendix A.2.2). Two versions are available. In the first case the sub-block of the inverse of the expected information matrix corresponding to the regression coefficients is used in the definition. In the second case, this matrix is replaced by the inverse of \(M'WM\), where \(M\) is the \(n\times p\) matrix whose \(i\)th row is the gradient of the mean function evaluated at the \(i\)th data point and \(W\) is a diagonal matrix whose elements are the inverses of the variance function evaluated at each data point.
If the model is correctly specified, all residuals follow the standard normal distribution. The second kind of leverage described above is used to calculate the approximate studentized residuals, whereas the generalized Pearson residuals use the first kind. The \(i\)th generalized Pearson residual can also be obtained as the score statistic for testing the significance of the dummy coefficient in the mean shift outlier model for observation \(i\). Accordingly, the \(i\)th deletion and \(r^*\)-type residuals are defined, respectively, as the likelihood root and modified likelihood root statistics (\(r\) and \(r^*\)) for the same situation (Bellio, 2000, Section 2.6.1).
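A self-contained sketch of these quantities for a simple homoscedastic nls() fit, so that the variance function is constant and \(W\) reduces to the identity matrix (the exact definitions used by nlreg.diag additionally involve the estimated variance parameters):
set.seed(1)
x <- 1:20
y <- 5 * (1 - exp(-0.3 * x)) + rnorm(20, sd = 0.2)
fit <- nls(y ~ b0 * (1 - exp(-b1 * x)), start = c(b0 = 5, b1 = 0.3))
b <- coef(fit)
M <- cbind(1 - exp(-b["b1"] * x),              # d mu / d b0
           b["b0"] * x * exp(-b["b1"] * x))    # d mu / d b1
W <- diag(length(y))                           # inverse variance function values (here constant)
h <- diag(M %*% solve(t(M) %*% W %*% M) %*% t(M) %*% W)       # leverages; sum(h) equals p = 2
r <- (y - fitted(fit)) / (summary(fit)$sigma * sqrt(1 - h))   # standardized residuals
qqnorm(r); abline(0, 1)                        # roughly standard normal if the model is correct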
Different influence measures are implemented in nlreg.diag. If infl = TRUE, the global measure (Cook and Weisberg, 1982, Section 5.2) and two partial ones (Bellio, 2000, Section 2.6.2) are returned, the first measuring the influence of each observation on the regression coefficients and the second its influence on the variance parameters. They are calculated through a leave-one-out analysis, where one observation at a time is deleted and the model refitted. In order to avoid a further model fit, the constrained maximum likelihood estimates that would be needed are approximated by means of a Taylor series expansion centered at the MLEs. If infl = FALSE, only an approximation to Cook's distance, obtained from a first order Taylor series expansion of the partial influence measure for the regression coefficients, is returned.
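For illustration, a brute-force version of the leave-one-out analysis on a toy homoscedastic nls() fit, in which each observation is deleted in turn and the model really refitted; nlreg.diag avoids these n additional fits through the Taylor series approximation described above:
set.seed(1)
x <- 1:20
y <- 5 * (1 - exp(-0.3 * x)) + rnorm(20, sd = 0.2)
fit <- nls(y ~ b0 * (1 - exp(-b1 * x)), start = c(b0 = 5, b1 = 0.3))
b.full <- coef(fit)
infl <- sapply(seq_along(y), function(i) {
  fit.i <- nls(y[-i] ~ b0 * (1 - exp(-b1 * x[-i])), start = b.full)
  sum((coef(fit.i) - b.full)^2)              # crude overall influence of observation i
})
which.max(infl)                              # most influential observation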
A detailed account of regression diagnostics can be found in Davison and Snell (1991) and Davison and Tsai (1992). The details, and in particular the definitions of the above residuals and diagnostics, are given in Brazzale (2000, Section 6.3.1 and Appendix A.2.2).
Bellio, R. (2000) Likelihood Asymptotics: Applications in Biostatistics. Ph.D. Thesis, Department of Statistics, University of Padova.
Brazzale, A. R. (2000) Practical Small-Sample Parametric Inference. Ph.D. Thesis N. 2230, Department of Mathematics, Swiss Federal Institute of Technology Lausanne.
Carroll, R. J. and Ruppert, D. (1988) Transformation and Weighting in Regression. London: Chapman & Hall.
Cook, R. D. and Weisberg, S. (1982) Residuals and Influence in Regression. New York: Chapman & Hall.
Davison, A. C. and Snell, E. J. (1991) Residuals and diagnostics. In Statistical Theory and Modelling: In Honour of Sir David Cox (eds. D. V. Hinkley, N. Reid, and E. J. Snell), 83--106. London: Chapman & Hall.
Davison, A. C. and Tsai, C.-L. (1992) Regression model diagnostics. Int. Stat. Rev., 60, 337--353.
library(nlreg)
library(boot)      # provides the calcium data frame
data(calcium)
calcium.nl <- nlreg(cal ~ b0 * (1 - exp(-b1 * time)), weights = ~ (1 + time^g)^2,
                    data = calcium, start = c(b0 = 4, b1 = 0.1, g = 1),
                    hoa = TRUE)
## full higher order diagnostics (hoa = TRUE, infl = TRUE by default)
calcium.diag <- nlreg.diag(calcium.nl)
plot(calcium.diag, which = 9)
## first order diagnostics only
calcium.diag <- nlreg.diag(calcium.nl, hoa = FALSE, infl = FALSE)
plot(calcium.diag, which = 9)
## Not available