Computes Somers' Dxy rank correlation between a variable `x` and a binary (0-1) variable `y`, and the corresponding receiver operating characteristic curve area `c`. Note that `Dxy = 2(c-0.5)`. `somers2` allows for a `weights` variable, which specifies frequencies to associate with each observation.
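The relation between `c` and Dxy can be checked by hand on a small data set. The sketch below uses base R only; `c_index_pairs` is a hypothetical helper written for this illustration (it is not part of the package) and it assumes the usual pairwise definition of the ROC area, with ties in `x` counted as half-concordant.

```r
# Hypothetical helper (not part of the package): brute-force ROC area for a
# binary y, computed over all (y = 1, y = 0) pairs of observations.
c_index_pairs <- function(x, y) {
  pos <- x[y == 1]
  neg <- x[y == 0]
  # Difference of every positive/negative pair; ties count as half-concordant.
  diffs <- outer(pos, neg, "-")
  (sum(diffs > 0) + 0.5 * sum(diffs == 0)) / length(diffs)
}

x <- 1:6
y <- c(0, 0, 1, 0, 1, 1)

c_hat <- c_index_pairs(x, y)   # 8 of the 9 pairs are concordant: 0.889
dxy   <- 2 * (c_hat - 0.5)     # 0.778
c(C = c_hat, Dxy = dxy)
```

A `c` of 0.5 corresponds to Dxy = 0 (no discrimination) and a `c` of 1 to Dxy = 1 (perfect discrimination).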

`somers2(x, y, weights=NULL, normwt=FALSE, na.rm=TRUE)`

x

typically a predictor variable. `NA`s are allowed.

y

a numeric outcome variable coded `0-1`. `NA`s are allowed.

weights

a numeric vector of observation weights (usually frequencies). Omit or specify a zero-length vector to do an unweighted analysis.

normwt

set to `TRUE` to make `weights` sum to the actual number of non-missing observations.

na.rm

set to `FALSE` to suppress checking for `NA`s.
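As a rough illustration of what `normwt=TRUE` describes, the base R sketch below drops incomplete observations and then rescales the remaining weights so that they sum to the number of observations kept. The rescaling formula and the variable names (`keep`, `w`, `w_norm`) are assumptions made for this example; the function's internals may differ in detail.

```r
x <- c(1, 2, NA, 4, 5, 6)
y <- c(0, 0, 1, 0, 1, 1)
f <- c(3, 2, 2, 3, 2, 1)        # frequency weights

# Keep only observations where x, y and the weight are all non-missing.
keep <- !is.na(x) & !is.na(y) & !is.na(f)
w    <- f[keep]

# Assumed normalization: relative sizes are preserved, but the weights now
# sum to the number of non-missing observations instead of sum(w).
w_norm <- w * length(w) / sum(w)

sum(w)        # 11
sum(w_norm)   # 5
```

Because only the scale of the weights changes, their ratios to one another stay the same.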

a vector with the named elements `C`, `Dxy`, `n` (number of non-missing pairs), and `Missing`. Uses the formula `C = (mean(rank(x)[y == 1]) - (n1 + 1)/2)/(n - n1)`, where `n1` is the frequency of `y=1`.
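The rank formula above can be transcribed almost directly into base R for the unweighted case. `c_index_rank` is a hypothetical name used only for this sketch; it assumes incomplete observations are dropped first and relies on the default midranks of `rank()` for ties.

```r
# Hypothetical helper (not part of the package): unweighted C from the
# rank formula, C = (mean rank of y == 1 cases - (n1 + 1)/2) / (n - n1).
c_index_rank <- function(x, y) {
  keep <- !is.na(x) & !is.na(y)      # drop incomplete observations
  x <- x[keep]
  y <- y[keep]
  n  <- length(x)                    # non-missing observations
  n1 <- sum(y == 1)                  # frequency of y = 1
  mean_rank <- mean(rank(x)[y == 1]) # rank() assigns midranks to ties
  (mean_rank - (n1 + 1) / 2) / (n - n1)
}

x <- 1:6
y <- c(0, 0, 1, 0, 1, 1)

C   <- c_index_rank(x, y)   # (14/3 - 2) / 3 = 0.889
Dxy <- 2 * (C - 0.5)        # 0.778
c(C = C, Dxy = Dxy)
```

Working from ranks avoids comparing all pairs of observations, so the cost is dominated by the sort inside `rank()`.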

The `rcorr.cens` function, although slower than `somers2` for large sample sizes, can also be used to obtain Dxy for non-censored binary `y`, and it has the advantage of computing the standard deviation of the correlation index.

    set.seed(1)
    predicted <- runif(200)
    dead      <- sample(0:1, 200, TRUE)
    roc.area  <- somers2(predicted, dead)["C"]

    # Check weights
    x <- 1:6
    y <- c(0,0,1,0,1,1)
    f <- c(3,2,2,3,2,1)
    somers2(x, y)
    somers2(rep(x, f), rep(y, f))
    somers2(x, y, f)