Computes Somers' Dxy rank correlation between a variable x and a
binary (0-1) variable y, and the corresponding receiver operating
characteristic curve area c. Note that Dxy = 2(c-0.5).
somers2 allows for a weights variable, which specifies frequencies
to associate with each observation.
Usage
somers2(x, y, weights=NULL, normwt=FALSE, na.rm=TRUE)
Arguments
x
typically a predictor variable. NAs are allowed.
y
a numeric outcome variable coded 0-1. NAs are allowed.
weights
a numeric vector of observation weights (usually frequencies). Omit
or specify a zero-length vector to do an unweighted analysis.
normwt
set to TRUE to make weights sum to the actual number of non-missing
observations.
na.rm
set to FALSE to suppress checking for NAs.
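The arguments above can be exercised with a short call. This is a usage sketch only: it assumes the Hmisc package (which provides somers2) is installed, and the simulated data and the names predicted, dead, and freq are made up for illustration.

```r
# Usage sketch; the data are simulated and the variable names are hypothetical.
set.seed(1)
predicted <- runif(200)                         # predictor x
dead      <- rbinom(200, 1, predicted)          # 0-1 outcome y related to x
freq      <- sample(1:3, 200, replace = TRUE)   # assumed frequency weights

# Guarded so the script still runs where Hmisc is not installed.
if (requireNamespace("Hmisc", quietly = TRUE)) {
  unweighted <- Hmisc::somers2(predicted, dead)
  weighted   <- Hmisc::somers2(predicted, dead, weights = freq)
  print(unweighted)   # named vector with elements C, Dxy, n, Missing
  print(weighted)
}
```

Omitting weights (or passing a zero-length vector) gives the unweighted analysis, as in the first call.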
Value
a vector with the named elements C, Dxy, n (number of non-missing
pairs), and Missing (number of observations excluded because of NAs).
C is computed from the formula
C = (mean(rank(x)[y == 1]) - (n1 + 1)/2)/(n - n1), where n1 is the
number of observations with y=1.
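As a check on the formula, the following base R sketch computes C from the mean rank and compares it with a brute-force count over all (y=1, y=0) pairs. The helper names c_rank and c_pairs are hypothetical and not part of any package; the sketch covers only the unweighted case and is not the somers2 implementation itself.

```r
# Unweighted C from the rank formula (hypothetical helper, base R only).
c_rank <- function(x, y) {
  ok <- !is.na(x) & !is.na(y)        # keep pairs where both x and y are present
  x <- x[ok]; y <- y[ok]
  n  <- length(x)
  n1 <- sum(y == 1)
  (mean(rank(x)[y == 1]) - (n1 + 1) / 2) / (n - n1)
}

# Same quantity by comparing every y=1 observation with every y=0 observation.
c_pairs <- function(x, y) {
  ok <- !is.na(x) & !is.na(y)
  x <- x[ok]; y <- y[ok]
  d <- outer(x[y == 1], x[y == 0], "-")
  mean((d > 0) + 0.5 * (d == 0))     # ties in x count as one half
}

x <- c(0.1, 0.4, 0.35, 0.8, 0.4, NA)
y <- c(0,   0,   1,    1,   1,   1)

C   <- c_rank(x, y)      # 0.75 for these data; c_pairs(x, y) agrees
Dxy <- 2 * (C - 0.5)     # 0.5
```

The agreement holds because rank() assigns average ranks to ties by default, which matches counting a tied pair as half concordant.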
Concepts
logistic regression model
predictive accuracy
Details
The rcorr.cens function, although slower than somers2 for large
sample sizes, can also be used to obtain Dxy for non-censored binary
y, and it has the advantage of computing the standard deviation of
the correlation index.
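A side-by-side call might look like the sketch below. It assumes Hmisc is installed, that rcorr.cens accepts a plain uncensored vector as its second argument, and that its result carries elements named "Dxy" and "S.D."; the simulated data are for illustration only.

```r
# Comparison sketch; data are simulated and Hmisc is assumed to be available.
set.seed(2)
x <- rnorm(100)
y <- rbinom(100, 1, plogis(x))       # binary outcome, no censoring

if (requireNamespace("Hmisc", quietly = TRUE)) {
  s2 <- Hmisc::somers2(x, y)
  rc <- Hmisc::rcorr.cens(x, y)      # y passed as an uncensored vector
  print(s2["Dxy"])
  print(rc[c("Dxy", "S.D.")])        # S.D.: standard deviation of Dxy
}
```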