pointwise: Pointwise Confidence Limits for Predictions

Description

Computes pointwise confidence limits for predictions computed by the function predict.

Usage

pointwise(results.predict, coverage = 0.99, 
    simultaneous = FALSE, individual = FALSE)

Arguments

results.predict

output from a call to predict with se.fit=TRUE.

coverage

optional numeric scalar between 0 and 1 indicating the confidence level associated with the confidence limits. The default value is coverage=0.99.

simultaneous

optional logical scalar indicating whether to base the confidence limits for the predicted values on simultaneous or non-simultaneous prediction limits. The default value is simultaneous=FALSE.

individual

optional logical scalar indicating whether to base the confidence intervals for the predicted values on prediction limits for the mean (individual=FALSE) or prediction limits for an individual observation (individual=TRUE). The default value is individual=FALSE.

Value

a list with the following components:

upper

upper limits of pointwise confidence intervals.

fit

surface values. This is the same as the component fit of the argument results.predict.

lower

lower limits of pointwise confidence intervals.

Details

This function computes pointwise confidence limits for predictions computed by the function predict. The limits are computed at those points specified by the argument newdata of predict.

The predict function is a generic function with methods for several different classes. The funciton pointwise was part of the S language. The modifications to pointwise in the package EnvStats involve confidence limits for predictions for a linear model (i.e., an object of class "lm").

Confidence Limits for a Predicted Mean Value (individual=FALSE). Consider a standard linear model with \(p\) predictor variables. Often, one of the major goals of regression analysis is to predict a future value of the response variable given known values of the predictor variables. The equations for the predicted mean value of the response given fixed values of the predictor variables as well as the equation for a two-sided (1-\(\alpha\))100% confidence interval for the mean value of the response can be found in Draper and Smith (1998, p.80) and Millard and Neerchal (2001, p.547).

Technically, this formula is a confidence interval for the mean of the response for one set of fixed values of the predictor variables and corresponds to the case when simultaneous=FALSE. To create simultaneous confidence intervals over the range of of the predictor variables, the critical t-value in the equation has to be replaced with a critical F-value and the modified formula is given in Draper and Smith (1998, p. 83), Miller (1981a, p. 111), and Millard and Neerchal (2001, p. 547). This formula is used in the case when simultaneous=TRUE.

Confidence Limits for a Predicted Individual Value (individual=TRUE). In the above section we discussed how to create a confidence interval for the mean of the response given fixed values for the predictor variables. If instead we want to create a prediction interval for a single future observation of the response variable, the fomula is given in Miller (1981a, p. 115) and Millard and Neerchal (2001, p. 551).

Technically, this formula is a prediction interval for a single future observation for one set of fixed values of the predictor variables and corresponds to the case when simultaneous=FALSE. Miller (1981a, p. 115) gives a formula for simultaneous prediction intervals for \(k\) future observations. If we are interested in creating an interval that will encompass all possible future observations over the range of the preictor variables with some specified probability however, we need to create simultaneous tolerance intervals. A formula for such an interval was developed by Lieberman and Miller (1963) and is given in Miller (1981a, p. 124). This formula is used in the case when simultaneous=TRUE.

References

Chambers, J.M., and Hastie, T.J., eds. (1992). Statistical Models in S. Chapman and Hall/CRC, Boca Raton, FL.

Draper, N., and H. Smith. (1998). Applied Regression Analysis. Third Edition. John Wiley and Sons, New York, Chapter 3.

Millard, S.P., and N.K. Neerchal. (2001). Environmental Statistics with S-PLUS. CRC Press, Boca Raton, FL, pp.546-553.

Miller, R.G. (1981a). Simultaneous Statistical Inference. Springer-Verlag, New York, pp.111, 124.

Examples

Run this code

# NOT RUN {
  # Using the data in the built-in data frame Air.df, 
  # fit the cube root of ozone as a function of temperature. 
  # Then compute predicted values for ozone at 70 and 90 
  # degrees F, and compute 95% confidence intervals for the 
  # mean value of ozone at these temperatures.

  # First create the lm object 
  #---------------------------

  ozone.fit <- lm(ozone ~ temperature, data = Air.df) 


  # Now get predicted values and CIs at 70 and 90 degrees 
  #------------------------------------------------------

  predict.list <- predict(ozone.fit, 
    newdata = data.frame(temperature = c(70, 90)), se.fit = TRUE) 

  pointwise(predict.list, coverage = 0.95) 
  # $upper
  #        1        2 
  # 2.839145 4.278533 

  # $fit
  #        1        2 
  # 2.697810 4.101808 

  # $lower
  #        1        2 
  # 2.556475 3.925082 

  #--------------------------------------------------------------------

  # Continuing with the above example, create a scatterplot of ozone 
  # vs. temperature, and add the fitted line along with simultaneous 
  # 95% confidence bands.

  x <- Air.df$temperature 

  y <- Air.df$ozone 

  dev.new()
  plot(x, y, xlab="Temperature (degrees F)",  
    ylab = expression(sqrt("Ozone (ppb)", 3))) 

  abline(ozone.fit, lwd = 2) 

  new.x <- seq(min(x), max(x), length=100) 

  predict.ozone <- predict(ozone.fit, 
    newdata = data.frame(temperature = new.x), se.fit = TRUE) 

  ci.ozone <- pointwise(predict.ozone, coverage=0.95, 
    simultaneous=TRUE) 

  lines(new.x, ci.ozone$lower, lty=2, lwd = 2, col = 2) 

  lines(new.x, ci.ozone$upper, lty=2, lwd = 2, col = 2) 

  title(main=paste("Cube Root Ozone vs. Temperature with Fitted Line", 
    "and Simultaneous 95% Confidence Bands", 
    sep="\n")) 

  #--------------------------------------------------------------------

  # Redo the last example by creating non-simultaneous 
  # confidence bounds and prediction bounds as well.

  dev.new()
  plot(x, y, xlab = "Temperature (degrees F)", 
    ylab = expression(sqrt("Ozone (ppb)", 3))) 

  abline(ozone.fit, lwd = 2) 

  new.x <- seq(min(x), max(x), length=100) 

  predict.ozone <- predict(ozone.fit, 
    newdata = data.frame(temperature = new.x), se.fit = TRUE) 

  ci.ozone <- pointwise(predict.ozone, coverage=0.95) 

  lines(new.x, ci.ozone$lower, lty=2, col = 2, lwd = 2) 

  lines(new.x, ci.ozone$upper, lty=2, col = 2, lwd = 2) 

  pi.ozone <- pointwise(predict.ozone, coverage = 0.95, 
    individual = TRUE)

  lines(new.x, pi.ozone$lower, lty=4, col = 4, lwd = 2) 

  lines(new.x, pi.ozone$upper, lty=4, col = 4, lwd = 2) 

  title(main=paste("Cube Root Ozone vs. Temperature with Fitted Line", 
    "and 95% Confidence and Prediction Bands", 
    sep="\n")) 

  #--------------------------------------------------------------------

  # Clean up
  rm(predict.list, ozone.fit, x, y, new.x, predict.ozone, ci.ozone, 
    pi.ozone)
# }

Run the code above in your browser using DataLab