rms (version 3.0-0)

gIndex: Calculate Total and Partial g-indexes for an rms Fit

Description

gIndex computes the total $g$-index for a model based on the vector of linear predictors, and the partial $g$-index for each predictor in the model. The partial index is computed by summing all the terms involving a given variable, weighted by their regression coefficients, then computing Gini's mean difference on this sum. For example, a regression model having age, sex, and age*sex on the right-hand side, with corresponding regression coefficients $b_{1}, b_{2}, b_{3}$, will have the $g$-index for age computed from Gini's mean difference on age $\times (b_{1} + b_{3}w)$, where $w$ is an indicator set to one for observations with sex not equal to the reference value. When there are nonlinear terms associated with a predictor, these terms are also combined.
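The age example can be sketched in base R as follows. This is an illustration only: the coefficient values, the simulated data, and the helper gmd are hypothetical, and gIndex itself works from a fitted rms model rather than from hand-entered coefficients.

```r
# Hypothetical helper: Gini's mean difference as the mean absolute
# difference over all ordered pairs of distinct elements
gmd <- function(x) {
  n <- length(x)
  sum(abs(outer(x, x, "-"))) / (n * (n - 1))
}

set.seed(2)
n   <- 50
age <- round(runif(n, 20, 80))
sex <- factor(sample(c("female", "male"), n, TRUE))
w   <- as.numeric(sex != "female")  # 1 when sex differs from the reference level

# Assumed coefficients for age, sex, and age*sex (illustrative values only)
b1 <- 0.03; b2 <- 0.5; b3 <- -0.01

# Sum of all terms involving age, weighted by their coefficients:
# b1*age + b3*age*w = age * (b1 + b3*w)
age_terms <- age * (b1 + b3 * w)

# Partial g for age under these assumptions
g_age <- gmd(age_terms)
g_age
```

Because the interaction term is folded into the age sum, the partial $g$ for age reflects how much predictions vary with age in both sex groups, not just the main effect.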

A print method is defined, and there is a plot method for displaying $g$-indexes using a dot chart.

A basic function GiniMd computes Gini's mean difference on a numeric vector. This index is defined as the mean absolute difference between any two distinct elements of a vector. For a Bernoulli (binary) variable with proportion of ones equal to $p$ and sample size $n$, Gini's mean difference is $2\frac{n}{n-1}p(1-p)$. For a trinomial variable (e.g., predicted values for a 3-level categorical predictor using two dummy variables) having (predicted) values $A, B, C$ with corresponding proportions $a, b, c$, Gini's mean difference is $2\frac{n}{n-1}[ab|A-B|+ac|A-C|+bc|B-C|]$.
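The Bernoulli formula follows from counting ordered pairs. With $np$ ones and $n(1-p)$ zeros, there are $2(np)(n(1-p))$ ordered pairs of distinct elements whose values differ, each contributing an absolute difference of 1, out of $n(n-1)$ ordered pairs in total:

$$
g = \frac{1}{n(n-1)}\sum_{i \neq j}|x_{i} - x_{j}| = \frac{2\,(np)\,(n(1-p))}{n(n-1)} = 2\frac{n}{n-1}p(1-p)
$$

The trinomial case works the same way: the $na$ observations at $A$ and the $nb$ observations at $B$ form $2(na)(nb)$ ordered pairs, each contributing $|A-B|$, and likewise for the other two pairs of values, which gives $2\frac{n}{n-1}[ab|A-B|+ac|A-C|+bc|B-C|]$ after dividing by $n(n-1)$.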

Usage

gIndex(object, partials = TRUE,
       lplabel = if (length(object$scale)) object$scale[1] else "X*Beta",
       fun,
       funlabel = if (missing(fun)) character(0) else deparse(substitute(fun)),
       postfun = if (length(object$scale) == 2) exp else NULL,
       postlabel = if (length(postfun))
         ifelse(missing(postfun),
                if (length(object$scale) > 1) object$scale[2] else "Anti-log",
                deparse(substitute(postfun))) else character(0), ...)

## S3 method for class 'gIndex'
print(x, digits=4, abbrev=FALSE, vnames=c("names","labels"), ...)

## S3 method for class 'gIndex'
plot(x, what=c('pre', 'post'), xlab=NULL, pch=16, rm.totals=FALSE, sort=c('descending', 'ascending', 'none'), ...)

GiniMd(x, na.rm=FALSE)

Arguments

object
result of an rms fitting function
partials
set to FALSE to suppress computation of partial $g$s
lplabel
a replacement for default values such as "X*Beta" or "log odds"
fun
an optional function to transform the linear predictors before computing the total (only) $g$. When this is present, a new component gtrans is added to the attributes of the object resulting from gIndex.
funlabel
a character string label for fun, otherwise taken from the function name itself
postfun
a function to transform $g$ such as exp (anti-log), which is the default for certain models such as the logistic and Cox models
postlabel
a label for postfun
...
For gIndex, passed to predict.rms. Ignored for print. Passed to dotchart2 for plot.
x
an object created by gIndex (for print or plot) or a numeric vector (for GiniMd)
digits
number of decimal places to which printed values are rounded
abbrev
set to TRUE to abbreviate labels if vnames="labels"
vnames
set to "labels" to print predictor labels instead of names
what
set to "post" to plot the transformed $g$-index if there is one (e.g., ratio scale)
xlab
$x$-axis label; constructed by default
pch
plotting character for points
rm.totals
set to TRUE to remove the total $g$-index when plotting
sort
specifies how to sort predictors by $g$-index; default is in descending order going down the dot chart
na.rm
set to TRUE if you suspect there may be NAs in x; these will then be removed. Otherwise an error will result.
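The difference between fun and postfun can be sketched in base R: fun transforms the linear predictors before the total $g$ is computed, whereas postfun transforms $g$ itself. In this sketch the helper gmd and the simulated vector lp (a stand-in for linear predictors on the log odds scale) are hypothetical and are not part of rms.

```r
# Hypothetical helper: Gini's mean difference over all ordered pairs
gmd <- function(x) {
  n <- length(x)
  sum(abs(outer(x, x, "-"))) / (n * (n - 1))
}

set.seed(3)
# Stand-in for linear predictors from a logistic model (assumed values)
lp <- rnorm(200, mean = -0.5, sd = 1.2)

g      <- gmd(lp)          # g on the linear predictor (log odds) scale
g_post <- exp(g)           # analogue of postfun = exp: transform g itself
g_fun  <- gmd(plogis(lp))  # analogue of fun = plogis: transform predictors, then compute g

c(g = g, g_post = g_post, g_fun = g_fun)
```

Under these assumptions, g_post is on a ratio scale while g_fun is a typical difference between two predicted probabilities, so the two answer different questions.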

Value

  • gIndex returns a matrix of class "gIndex" with auxiliary information stored as attributes, such as variable labels. GiniMd returns a scalar.

Details

For stratification factors in a Cox proportional hazards model, there is no contribution of variation towards computing a partial $g$ except from terms that interact with the stratification variable.

References

David HA (1968): Gini's mean difference rediscovered. Biometrika 55:573–575.

See Also

predict.rms

Examples

library(rms)  # provides ols, lrm, datadist, gIndex; GiniMd comes from the attached Hmisc
set.seed(1)
n <- 100
x <- 1:n
w <- factor(sample(c('a','b'), n, TRUE))
u <- factor(sample(c('A','B'), n, TRUE))
y <- .01*x + .2*(w=='b') + .3*(u=='B') + .2*(w=='b' & u=='B') + rnorm(n)/5
dd <- datadist(x,w,u); options(datadist='dd')
f <- ols(y ~ x*w*u, x=TRUE, y=TRUE)
f
anova(f)

zc <- predict(f, type='cterms')

# Test GiniMd against a brute-force solution
gmd <- function(x)
  {
    n <- length(x)
    sum(outer(x, x, function(a, b) abs(a - b)))/n/(n-1)
  }
gmd(zc[, 1])
GiniMd(zc[, 1])
GiniMd(zc[, 2])
GiniMd(zc[, 3])
GiniMd(f$linear.predictors)
g <- gIndex(f)
g
g['Total',]
gIndex(f, partials=FALSE)

z <- c(rep(0,17), rep(1,6))
n <- length(z)
GiniMd(z)
2*mean(z)*(1-mean(z))*n/(n-1)

a <- 12; b <- 13; c <- 7; n <- a + b + c
A <- -.123; B <- -.707; C <- 0.523
xx <- c(rep(A, a), rep(B, b), rep(C, c))
GiniMd(xx)
2*(a*b*abs(A-B) + a*c*abs(A-C) + b*c*abs(B-C))/n/(n-1)

y <- y > .8
f <- lrm(y ~ x * w * u, x=TRUE, y=TRUE)
gIndex(f, fun=plogis, funlabel='Prob[y=1]')
options(datadist=NULL)
