AirPollution

Air Pollution and Mortality

Data relating air pollution to mortality, frequently used to illustrate ridge regression and related tasks.

Keywords
datasets
Usage
data("AirPollution")
Format

A data frame containing 60 observations on 16 variables.

precipitation

Average annual precipitation in inches.

temperature1

Average January temperature in degrees Fahrenheit.

temperature7

Average July temperature in degrees Fahrenheit.

age

Percentage of 1960 SMSA population aged 65 or older.

household

Average household size.

education

Median school years completed by those over 22.

housing

Percentage of housing units which are sound and with all facilities.

population

Population per square mile in urbanized areas, 1960.

noncauc

Percentage of non-Caucasian population in urbanized areas, 1960.

whitecollar

Percentage employed in white collar occupations.

income

Percentage of families with income < USD 3000.

hydrocarbon

Relative hydrocarbon pollution potential.

nox

Relative nitrogen oxides pollution potential.

so2

Relative sulphur dioxide pollution potential.

humidity

Annual average percentage of relative humidity at 13:00.

mortality

Total age-adjusted mortality rate per 100,000.

References

McDonald GC, Schwing RC (1973). Instabilities of Regression Estimates Relating Air Pollution to Mortality. Technometrics, 15, 463-482.

Miller AJ (2002). Subset Selection in Regression. New York: Chapman and Hall.

Aliases
  • AirPollution
Examples
# NOT RUN {
## load package and data (with logs for relative potentials)
library("lmSubsets")
data("AirPollution", package = "lmSubsets")
## columns 12 to 14 are hydrocarbon, nox and so2
for (i in 12:14)  AirPollution[[i]] <- log(AirPollution[[i]])

## fit subsets
lm_all <- lmSubsets(mortality ~ ., data = AirPollution)
plot(lm_all)

## refit best submodel of size 6
lm6 <- refit(lm_all, size = 6)
summary(lm6)
# }
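
Since the data are often used to illustrate ridge regression, the following is a minimal base R sketch of a ridge fit, not code from lmSubsets. The helper ridge_coef() and the penalty lambda = 10 are hypothetical choices made only for illustration; predictors are standardized and the response is centered before solving the penalized normal equations. The AirPollution part runs only if lmSubsets is installed.

```r
## Hypothetical helper (not part of lmSubsets): ridge slopes in base R.
ridge_coef <- function(formula, data, lambda = 0) {
  mf <- model.frame(formula, data)
  y  <- model.response(mf)
  ## design matrix without the intercept column (an intercept is assumed)
  X  <- model.matrix(attr(mf, "terms"), mf)[, -1, drop = FALSE]
  Xs <- scale(X)            # center and scale predictors
  yc <- y - mean(y)         # center response
  ## solve (Xs'Xs + lambda * I) b = Xs'yc on the standardized scale
  b  <- solve(crossprod(Xs) + lambda * diag(ncol(Xs)), crossprod(Xs, yc))
  ## convert slopes back to the original predictor units
  drop(b) / attr(Xs, "scaled:scale")
}

## apply to AirPollution when the package providing it is available
if (requireNamespace("lmSubsets", quietly = TRUE)) {
  data("AirPollution", package = "lmSubsets")
  pot <- c("hydrocarbon", "nox", "so2")
  AirPollution[pot] <- lapply(AirPollution[pot], log)
  ## lambda = 10 is an arbitrary illustrative penalty, not a tuned value
  print(round(ridge_coef(mortality ~ ., AirPollution, lambda = 10), 3))
}
```

With lambda = 0 the helper reproduces the ordinary least squares slopes; larger values shrink the standardized coefficients towards zero. In practice the penalty would be chosen by cross-validation or a similar criterion.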
Documentation reproduced from package lmSubsets, version 0.5-1, License: GPL (>= 3)
