caterpillar: Pine processionary caterpillar dataset

Description

The caterpillar dataset is extracted from a 1973 study on pine processionary caterpillars. The response variable is the log transform of the average number of nests per tree. There are $p=8$ potential explanatory variables, measured on $n=33$ areas.

Usage

data(caterpillar)

Format

A data frame with 33 observations on the following 9 variables.

x1: altitude (in meters)
x2: slope (in degrees)
x3: number of pine trees in the area
x4: height (in meters) of the tree sampled at the center of the area
x5: orientation of the area (ranging from 1 if south-facing to 2 otherwise)
x6: height (in meters) of the dominant tree
x7: number of vegetation strata
x8: mix settlement index (from 1 if not mixed to 2 if mixed)
y: logarithmic transform of the average number of nests of caterpillars per tree

Source

Tomassone, R., Dervin, C., and Masson, J.P. (1993) Biométrie: modélisation de phénomènes biologiques. Dunod, Paris.

Details

This dataset is used in Chapter 3 on linear regression. It serves to assess the influence of forest settlement characteristics on the development of caterpillar colonies, and it was first published and studied in Tomassone et al. (1993). The response variable, stored in the last column of caterpillar, is the logarithmic transform of the average number of nests of caterpillars per tree in an area of 500 square meters. There are $p=8$ potential explanatory variables, measured on $n=33$ areas.

Examples

data(caterpillar)
summary(caterpillar)
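
Since the dataset is meant for linear regression, a natural next step is an ordinary least squares fit of y on the eight covariates. The lines below are a minimal sketch, not the analysis from the book: they assume the dataset is provided by the bayess package (this page does not name the package) and use the base R function lm with the column names listed under Format.

```r
# Assumption: caterpillar ships with the 'bayess' package, installed locally.
library(bayess)
data(caterpillar)

# Ordinary least squares: regress the log average nest count (y)
# on all eight explanatory variables x1, ..., x8.
fit <- lm(y ~ ., data = caterpillar)

# Coefficient estimates, standard errors, and overall fit statistics.
summary(fit)
```

With only $n=33$ areas and $p=8$ covariates, the individual coefficient estimates are likely to be imprecise, so the summary is best read as a first look rather than a final model.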