Description

Fits ordinal regression models with an elastic net penalty over a sequence of values of the regularization parameter lambda. The algorithm uses Fisher Scoring with Coordinate Descent updates (a conceptual sketch follows the usage below).

Usage

ordinalNet(x, y, alpha = 1, standardize = TRUE, penalizeID = NULL,
  positiveID = NULL, link = c("logit", "probit", "cloglog", "cauchit"),
  lambdaVals = NULL, nLambda = 100, lambdaMinRatio = ifelse(nObs < nVar,
  0.01, 1e-04), alphaMin = 0.01, trace = FALSE, epsOut = 0.001,
  epsIn = 0.001, maxiterOut = Inf, maxiterIn = Inf, pMin = 1e-20,
  betaMin = 1e-08, convNorm = Inf, zetaStart = NULL, thetaStart = NULL)
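As background for the "Fisher Scoring with Coordinate Descent updates" mentioned above, here is a minimal conceptual sketch, not the package's internal code, of one coordinate descent pass on the elastic-net-penalized weighted least squares problem that a Fisher Scoring step produces. The helper names (softThreshold, cdPass) and the weights w and working response z are illustrative assumptions.

# Conceptual sketch only, NOT ordinalNet internals. The weights w and
# working response z stand in for the quantities produced by one
# Fisher Scoring (outer loop) linearization of the log-likelihood.
softThreshold <- function(u, t) sign(u) * pmax(abs(u) - t, 0)

cdPass <- function(x, z, w, beta, lambda, alpha) {
  for (j in seq_along(beta)) {
    r <- z - x[, -j, drop=FALSE] %*% beta[-j]   # partial residual without predictor j
    num <- softThreshold(sum(w * x[, j] * r), lambda * alpha)
    den <- sum(w * x[, j]^2) + lambda * (1 - alpha)
    beta[j] <- num / den                        # standard elastic net coordinate update
  }
  beta
}
# e.g. betaNew <- cdPass(x, z, w, beta, lambda=0.1, alpha=1)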
Arguments

x: The covariate matrix.

y: The response variable, given as a factor (see "Examples").

alpha: The elastic net mixing parameter, with 0 <= alpha <= 1. alpha=1 is the lasso penalty, and alpha=0 is the ridge penalty. See "Details".

standardize: If standardize=TRUE, the predictor variables are scaled to have unit variance. Coefficient estimates are returned on the original scale.

penalizeID: Optional logical vector indicating which coefficients should be penalized.

positiveID: Optional logical vector indicating which coefficients are constrained to be non-negative.

link: The link function, one of "logit", "probit", "cloglog", or "cauchit".

lambdaVals: An optional user-specified lambda sequence. If NULL, a lambda sequence is computed based on nLambda and lambdaMinRatio.

nLambda: The number of lambda values in the solution path. Default is 100.

lambdaMinRatio: The smallest lambda value, as a fraction of the maximum lambda. (The maximum lambda is the smallest value that sets all penalized coefficients to zero.) See the illustration after this list.

alphaMin: If alpha < alphaMin, then alphaMin is used to calculate the maximum lambda value.

trace: If trace=TRUE, the algorithm progress is printed to the terminal.

epsOut: Convergence threshold for the outer (Fisher Scoring) loop.

epsIn: Convergence threshold for the inner (Coordinate Descent) loop.

maxiterOut: Maximum number of outer loop iterations.

maxiterIn: Maximum number of inner loop iterations.

pMin: If a fitted probability falls below pMin, the algorithm is terminated; this can occur at small lambda values as the coefficient estimates diverge to $\pm\infty$.

betaMin: If a coefficient estimate falls below betaMin in absolute value, it is set to zero. This improves the stability and speed of the fitting algorithm.

convNorm: The norm used for the convergence criteria (epsIn or epsOut). Default is Inf, the maximum norm.

zetaStart, thetaStart: Optional starting values for the fitting algorithm.
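To make lambdaMinRatio concrete: when lambdaVals is NULL, a typical construction (an assumption here, not a quote of the package code) is a geometric sequence of nLambda values running from the maximum lambda down to lambdaMinRatio times that maximum:

# Sketch of a typical geometric lambda sequence; lambdaMax is a
# placeholder for the smallest lambda that zeroes all penalized
# coefficients. This is not the package's exact code.
lambdaMax <- 1
nLambda <- 100
lambdaMinRatio <- 1e-4
lambdaSeq <- lambdaMax * lambdaMinRatio^(seq(0, 1, length.out=nLambda))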
Details

The elastic net penalty is defined as $$\lambda \left\{ \frac{(1-\alpha)}{2} \|\beta\|_2^2 + \alpha \|\beta\|_1 \right\}.$$ The objective function is $$-\frac{1}{N} \cdot \mathrm{loglik} + \mathrm{penalty},$$ where loglik is the model log-likelihood and N is the number of observations.
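As a worked check of the penalty formula above, it can be computed directly for a given coefficient vector; the function name and the values of beta, lambda, and alpha below are illustrative choices, and logLik is a placeholder for the model log-likelihood:

# Direct computation of the elastic net penalty from the Details formula
elasticNetPenalty <- function(beta, lambda, alpha) {
  lambda * ((1 - alpha) / 2 * sum(beta^2) + alpha * sum(abs(beta)))
}
beta <- c(1, -0.5, 0)
elasticNetPenalty(beta, lambda=0.1, alpha=1)  # lasso: 0.1 * 1.5   = 0.15
elasticNetPenalty(beta, lambda=0.1, alpha=0)  # ridge: 0.1 * 0.625 = 0.0625
# objective = -logLik/N + elasticNetPenalty(beta, lambda, alpha)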
Examples

library(ordinalNet)

# Simulate a dataset with three ordered response categories from a
# cumulative logit model with a single active predictor
set.seed(10)
x <- matrix(rnorm(50*5), ncol=5)
beta <- c(1, 0, 0, 0, 0)
intercepts <- c(-1, 1)
xb <- x %*% beta
eta <- cbind(xb + intercepts[1], xb + intercepts[2])
probMatCumul <- 1 / (1 + exp(-eta))                          # P(Y <= 1), P(Y <= 2)
probMat <- cbind(probMatCumul, 1) - cbind(0, probMatCumul)   # category probabilities
y <- apply(probMat, 1, function(p) sample(1:3, size=1, prob=p))
y <- as.factor(y)

# Fit the penalized model with default settings (lasso, logit link)
fit <- ordinalNet(x, y)
print(fit)
coef(fit)
predict(fit, type="class")
predict(fit, type="prob")
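The same simulated data can be refit with non-default settings; both arguments below appear in the usage section above.

# Ridge penalty, and a probit link
fitRidge <- ordinalNet(x, y, alpha = 0)
fitProbit <- ordinalNet(x, y, link = "probit")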