dr (version 2.0.4)

dr: Dimension reduction regression

Description

The function dr implements dimension reduction methods, including sliced inverse regression (SIR), sliced average variance estimation (SAVE) and principal Hessian directions (pHd). 'dr' calls 'dr.compute', so most users will need only 'dr'.

Usage

dr(formula, data, subset, na.action = na.fail, weights, ...)

dr.compute(x, y, weights, method = "sir", ...)

Arguments

formula
a symbolic description of the model to be fit. The details of the model are the same as for lm. Although factors may not be appropriate for dr methods, they are permitted. Full rank models are recommended, although rank deficient models are permitted.
data
an optional data frame containing the variables in the model. By default the variables are taken from the environment from which 'dr' is called.
subset
an optional vector specifying a subset of observations to be used in the fitting process.
weights
an optional vector of weights to be used where appropriate. In the context of dimension reduction methods, weights are used to obtain elliptical symmetry, not constant variance; see dr.weights.
na.action
a function which indicates what should happen when the data contain NAs. The default is 'na.fail', which will stop calculations. The option 'na.omit' is also permitted, but it may not work correctly when weights are used.
x
The design matrix
y
The response vector or matrix
method
This character string specifies the method of fitting. "sir" specifies sliced inverse regression and "save" specifies sliced average variance estimation. "phdy" specifies principal Hessian directions computed from the response.
...
For 'dr', arguments passed to 'dr.compute'. For 'dr.compute', arguments required for the particular dimension reduction method. nslices is the number of slices used by sir and save. numdir is the maximum number of directions to be found.

Value

  • dr returns an object that inherits from dr (the name of the type is the value of the method argument), with attributes:
  • x: The design matrix.
  • y: The response vector.
  • weights: The weights used, normalized to add to n.
  • qr: QR factorization of x.
  • cases: Number of cases used.
  • call: The initial call to 'dr'.
  • M: A matrix that depends on the method of computing. The column space of M should be close to the central subspace.
  • evalues: The eigenvalues of M (or squared singular values if M is not symmetric).
  • evectors: The eigenvectors of M (or of M'M if M is not square and symmetric), ordered according to the eigenvalues.
  • numdir: The maximum number of directions to be found. The output value of numdir may be smaller than the input value.
  • slice.info: Output from 'sir.slice', used by sir and save.
  • method: The dimension reduction method used.
  • dr.weights returns a vector of estimated weights, scaled to add to the number of cases.
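As a rough illustration of the returned object, the sketch below fits a model and inspects a few of the components listed above. It assumes the components can be read with `$`, as for an ordinary R list; the object name `fit` is chosen here only for illustration.

```r
library(dr)
data(ais)  # the Australian athletes data

# Fit SIR with 8 slices; the data frame is passed explicitly via 'data'
fit <- dr(LBM ~ Wt + Ht + RCC + WCC, data = ais, method = "sir", nslices = 8)

# Assumption: the listed components are accessible with `$`
fit$numdir     # number of directions found (may be smaller than requested)
fit$evalues    # eigenvalues of the kernel matrix M
fit$evectors   # eigenvectors, ordered according to the eigenvalues
dim(fit$M)     # dimensions of the method-dependent matrix M
```

Large leading eigenvalues followed by values near zero suggest that a few directions capture most of the dependence of the response on the predictors.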

Details

The general regression problem studies $F(y|x)$, the conditional distribution of a response $y$ given a set of predictors $x$. This function provides methods for estimating the dimension and central subspace of a general regression problem. That is, we want to find a $p \times d$ matrix $B$ such that $$F(y|x)=F(y|B'x)$$ Both the dimension $d$ and the subspace $R(B)$ are unknown. These methods make few assumptions.

All the methods available in this function estimate the unknowns by study of the inverse problem, $F(x|y)$. In each, a kernel matrix $M$ is estimated such that the column space of $M$ should be close to the central subspace. Eigenanalysis of $M$ is then used to estimate the central subspace. Objects created using this function have appropriate print, summary and plot methods.

Weights can be used, essentially to specify the relative frequency of each case in the data. Empirical weights that make the contours of the weighted sample closer to elliptical can be computed using dr.weights. This will usually result in zero weight for some cases. The function will set zero estimated weights to missing.

Several functions are provided that require a dr object as input. dr.permutation.test uses a permutation test to obtain significance levels for tests of dimension. dr.coplot allows visualizing the results using a coplot of either two selected directions conditioning on a third and using color to mark the response, or the response versus one direction, conditioning on a second direction. plot.dr provides the default plot method for dr objects, based on a scatterplot matrix.
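The kernel-matrix idea can be sketched in base R for the SIR case. The code below is a minimal illustration under stated assumptions, not the package's implementation: it standardizes the predictors, slices the response into groups of roughly equal size, and takes $M$ to be the weighted sum of outer products of the within-slice means of the standardized predictors. The function name `sir_sketch` and its arguments are hypothetical, and the data are simulated.

```r
# Minimal sliced inverse regression (SIR) sketch in base R.
# 'sir_sketch' is a hypothetical helper, not part of the dr package.
sir_sketch <- function(x, y, nslices = 8) {
  x <- as.matrix(x)
  n <- nrow(x)
  p <- ncol(x)

  # Standardize the predictors: z = (x - mean) %*% Sigma^(-1/2)
  xc <- sweep(x, 2, colMeans(x))
  sigma <- crossprod(xc) / n
  e <- eigen(sigma, symmetric = TRUE)
  sigma_inv_sqrt <- e$vectors %*% diag(1 / sqrt(e$values), p) %*% t(e$vectors)
  z <- xc %*% sigma_inv_sqrt

  # Slice the response into groups of roughly equal size via its ranks
  slice <- cut(rank(y, ties.method = "first"), breaks = nslices, labels = FALSE)

  # Kernel matrix M: weighted sum of outer products of within-slice means of z
  M <- matrix(0, p, p)
  for (h in unique(slice)) {
    idx <- slice == h
    zbar <- colMeans(z[idx, , drop = FALSE])
    M <- M + (sum(idx) / n) * tcrossprod(zbar)
  }

  # Eigenanalysis of M; map eigenvectors back to the original x scale
  em <- eigen(M, symmetric = TRUE)
  list(M = M, evalues = em$values, evectors = sigma_inv_sqrt %*% em$vectors)
}

# Simulated example: y depends on x only through x1 + x2
set.seed(1)
n <- 500
x <- matrix(rnorm(n * 4), n, 4)
y <- (x[, 1] + x[, 2])^3 + rnorm(n, sd = 0.5)

fit <- sir_sketch(x, y, nslices = 8)
round(fit$evalues, 3)

# First direction, scaled to unit length; it should be roughly
# proportional to (1, 1, 0, 0), up to sign
b1 <- fit$evectors[, 1] / sqrt(sum(fit$evectors[, 1]^2))
round(b1, 2)
```

In this simulated setting a single dominant eigenvalue is expected, matching a one-dimensional central subspace. SAVE and pHd use different kernel matrices, but the final eigenanalysis step has the same shape.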

References

The details of these methods are given by R. D. Cook (1998). Regression Graphics. New York: Wiley.

Equivalent methods are also available in Arc; see R. D. Cook and S. Weisberg (1999). Applied Regression Including Computing and Graphics. New York: Wiley. www.stat.umn.edu/arc.

See Also

dr.permutation.test, dr.x, dr.y, dr.direction, dr.coplot, dr.weights

Examples

library(dr)
data(ais)
attach(ais)  # the Australian athletes data
# fit dimension reduction using sir
m1 <- dr(LBM ~ Wt + Ht + RCC + WCC, method = "sir", nslices = 8)
summary(m1)
# repeat, using save:
m2 <- update(m1, method = "save")
summary(m2)
# repeat, using phd:
m3 <- update(m2, method = "phdres")
summary(m3)
# repeat, using weights:
w1 <- dr.weights(LBM ~ Wt + Ht + RCC + WCC, covmethod = "mve")
m4 <- dr(LBM ~ Wt + Ht + RCC + WCC, method = "sir", nslices = 8, weights = w1)
