RXsigns: Normal-theory Maximum Likelihood Estimation of Beta Coefficients with "Correct" Signs

Description

RXsigns displays the Beta coefficient estimate, denoted by B(=), that is most likely to have minimum MSE risk in the one unknown direction PARALLEL to the true Beta in p-dimensional likelihood space. Shrinkage to ZERO of any components ORTHOGONAL to the true Beta is MSE optimal. Obenchain (1978) shows that B(=) is of the form k * X'y, where the scalar k is given by equation (4.2) on page 1118 and the optimal shrinkage factors are proportional to known eigenvalues. The formula for the maximum likelihood estimate of k given on page 1119 is corrected.

Usage

RXsigns(form, data)

Arguments

form

A regression formula [y~x1+x2+...] suitable for use with lm().

data

Data frame containing observations on all variables in the formula.

Value

An output list object of class RXsigns:

data

Name of the data.frame object specified as the second argument.

form

The regression formula specified as the first argument.

Number of regression predictor variables.

Number of complete observations after removal of all missing values.

Numerical value of R-square goodness-of-fit statistic.

Numerical value of the residual mean square estimate of error.

prinstat

Listing of principal statistics (p by 5) from RXridge().

kpb

Maximum likelihood estimate of the k-factor in equation (4.2) of Obenchain (1978).

bmf

Rescaling factor for B(=) to minimize the Residual Sum-of-Squares.

signs

Listing of five Beta coefficient statistics (p by 5): OLS, X'y, Delta, B(=) and Bfit.

loff

Lack-of-Fit statistics: Residual Sum-of-Squares for OLS, X'y, B(=) and Bfit.

mcor

Squared correlation between the y-vector and its predicted values. Two values are displayed: one for OLS predictions and one for predictions using Bfit, X'y or B(=), which share a single value because these three coefficient vectors are proportional to one another. These two values are the familiar R^2 goodness-of-fit statistics for OLS and Bfit.
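
The rescaling described under bmf and the two squared correlations described under mcor can be illustrated without calling RXsigns(). This is a rough sketch under stated assumptions: it uses base R and the built-in longley data rather than longley2, it takes X'y as a stand-in for B(=) (both lie along the same direction, since B(=) = k * X'y), and every object name (unit, preds, b, f, bfit, rss, sqcor) is hypothetical. The scalar f is the ordinary least-squares solution for rescaling a fixed coefficient direction; it is not claimed to be the package's internal computation.

```r
# Center a vector and rescale it to length one (assumed standardization).
unit <- function(v) {
  v <- v - mean(v)
  v / sqrt(sum(v^2))
}

preds <- c("GNP.deflator", "Unemployed", "Armed.Forces",
           "Population", "Year", "Employed")
X <- apply(as.matrix(longley[, preds]), 2, unit)  # standardized predictors
y <- unit(longley$GNP)                            # standardized response

b   <- drop(crossprod(X, y))           # X'y: the direction shared with B(=)
ols <- drop(solve(crossprod(X), b))    # OLS coefficients on standardized data

# Scalar f that minimizes the Residual Sum-of-Squares of y - f * (X b).
fit_dir <- drop(X %*% b)
f <- sum(y * fit_dir) / sum(fit_dir^2)
bfit <- f * b

# Residual Sum-of-Squares for each coefficient vector.
rss <- function(coef) sum((y - X %*% coef)^2)
c(OLS = rss(ols), Xty = rss(b), Bfit = rss(bfit))

# Squared correlation between y and its predictions; rescaling b leaves it unchanged.
sqcor <- function(coef) cor(y, drop(X %*% coef))^2
c(OLS = sqcor(ols), Xty = sqcor(b), Bfit = sqcor(bfit))
```

Because y has unit length after standardization, 1 - rss(bfit) equals sqcor(bfit) in this sketch, which is why the second squared correlation can be read as the R^2 statistic for Bfit.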

Details

Ill-conditioned (nearly multicollinear) regression models can produce Ordinary Least Squares (OLS) coefficient estimates whose numerical signs differ from those of the X'y vector. This is disturbing because X'y contains the sample correlations between the X-predictor variables and the y-response variable when these variables have first been "centered" by subtracting their mean values and then rescaled to vectors of length one. Besides displaying the B(=) estimate, the RXsigns() function also displays the OLS vector, the "correlation form" of X'y, the estimated Delta shrinkage-factors and the rescaled coefficients, Bfit = f * B(=), where f is the positive scalar that minimizes the Residual Sum-of-Squares. Because OLS minimizes the Residual Sum-of-Squares over all coefficient vectors, RSS(Bfit) >= RSS(OLS).
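
The correlation-form statement above can be checked directly. The following is a minimal sketch in base R, not RXsigns() itself: it uses the built-in longley data as a stand-in for longley2, and the helper unit() and all object names are hypothetical. It confirms that X'y reproduces the sample correlations once the variables are centered and rescaled to unit length, then flags any predictor whose OLS sign disagrees with the sign of X'y.

```r
# Center a vector and rescale it to length one.
unit <- function(v) {
  v <- v - mean(v)
  v / sqrt(sum(v^2))
}

preds <- c("GNP.deflator", "Unemployed", "Armed.Forces",
           "Population", "Year", "Employed")
Xraw <- as.matrix(longley[, preds])
X <- apply(Xraw, 2, unit)   # standardized predictors
y <- unit(longley$GNP)      # standardized response

xty <- drop(crossprod(X, y))            # X'y in "correlation form"
ols <- drop(solve(crossprod(X), xty))   # OLS coefficients on standardized data

# X'y should match the ordinary sample correlations with the response.
all.equal(unname(xty), unname(drop(cor(Xraw, longley$GNP))))

# TRUE marks a predictor whose OLS sign differs from its correlation sign.
sign_flip <- sign(ols) != sign(xty)
data.frame(OLS = ols, Xty = xty, flipped = sign_flip)
```

Any row with flipped = TRUE is a coefficient of the kind RXsigns() is meant to address.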

References

Obenchain RL. (1978) Good and Optimal Ridge Estimators. Annals of Statistics 6, 1111-1121.

Obenchain RL. (2005) Shrinkage Regression: ridge, BLUP, Bayes, spline and Stein. Electronic book-in-progress (185+ pages). http://localcontrolstatistics.org

Obenchain RL. (2018) RXshrink_in_R.PDF RXshrink package vignette-like file. http://localcontrolstatistics.org

Examples

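# NOT RUN {
  library(RXshrink)  # provides RXsigns() and the longley2 data
  data(longley2)
  form <- GNP ~ GNP.deflator + Unemployed + Armed.Forces + Population + Year + Employed
  rxsobj <- RXsigns(form, data = longley2)
  rxsobj       # print the RXsigns summary
  str(rxsobj)  # inspect the components of the returned list
# }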