biVar is a generic function that accepts a formula and the usual data, subset, and
na.action parameters plus a list statinfo that specifies a function of two variables to
compute, along with information about labeling results for printing and
plotting. The function is called separately with each right hand side
variable and the same left hand variable. The result is a matrix of
bivariate statistics together with the statinfo list that drives printing
and plotting. The plot method draws a dot plot with x-axis values sorted,
by default, in order of one of the statistics computed by the function.
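The looping described above can be sketched in base R. This is only an illustration of the idea, not the Hmisc implementation: `bivar_sketch` and `rho_n` are hypothetical names, and the sketch assumes a simple formula whose terms are plain variable names.

```r
# Hypothetical helper, not part of Hmisc: apply a two-variable statistic
# to each right hand side variable paired with the single response.
bivar_sketch <- function(formula, data, fun) {
  vars   <- all.vars(formula)
  yname  <- vars[1]           # left hand side variable
  xnames <- vars[-1]          # right hand side variables
  rows <- lapply(xnames, function(xn) {
    x <- data[[xn]]
    y <- data[[yname]]
    keep <- !is.na(x) & !is.na(y)   # pairwise deletion of NAs
    fun(x[keep], y[keep])
  })
  out <- do.call(rbind, rows)       # one row of statistics per predictor
  rownames(out) <- xnames
  out
}

# Example statistic: Spearman rho and the number of pairs used
rho_n <- function(x, y) c(rho = cor(x, y, method = "spearman"), n = length(x))

d <- data.frame(y  = c(1, 3, 2, 5, 4, 6),
                x1 = c(2, 4, 1, 6, 5, 7),
                x2 = c(6, NA, 4, 3, 2, 1))
res <- bivar_sketch(y ~ x1 + x2, d, rho_n)
print(res)
```

Because missing values are dropped per predictor, x2 contributes five pairs here while x1 contributes six.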
spearman2 computes the square of Spearman's rho rank correlation
and a generalization of it in which x can relate non-monotonically to
y. This is done by computing the Spearman multiple rho-squared between
(rank(x), rank(x)^2) and y. When x is categorical, a different kind of
Spearman correlation used in the Kruskal-Wallis test is computed (and
spearman2 can do the Kruskal-Wallis test). This is done by computing the
ordinary multiple R^2 between k-1 dummy variables and rank(y), where x
has k categories. x can also be a formula, in which case each predictor
is correlated separately with y, using non-missing observations for that
predictor. biVar is used to do the looping and bookkeeping. By default the
plot shows the adjusted rho^2, using the same formula used for
the ordinary adjusted R^2. The F test uses the unadjusted R^2.
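The rank-based quantities described above can be reproduced with ordinary least squares on ranks. The following base R sketch uses simulated data and assumes the response enters through rank(y); the variable names and the `adj` helper are illustrative, not Hmisc internals.

```r
set.seed(1)
n <- 50
x <- rnorm(n)
y <- x^2 + rnorm(n, sd = 0.5)      # non-monotonic relation

rx <- rank(x)   # midranks for ties (ties.method = "average" is the default)
ry <- rank(y)

# p = 1: ordinary squared Spearman rho
fit1 <- lm(ry ~ rx)
rho2_p1 <- summary(fit1)$r.squared

# p = 2: multiple R^2 of rank(y) on (rank(x), rank(x)^2)
fit2 <- lm(ry ~ rx + I(rx^2))
rho2_p2 <- summary(fit2)$r.squared

# Adjusted rho^2 via the ordinary adjusted R^2 formula; k = number of regressors
adj <- function(r2, n, k) 1 - (1 - r2) * (n - 1) / (n - k - 1)
adj_p2 <- adj(rho2_p2, n, 2)

# Categorical x with k levels: R^2 of rank(y) on k - 1 dummy variables.
# (n - 1) times this R^2 equals the Kruskal-Wallis H statistic.
g <- factor(sample(c("a", "b", "c"), n, replace = TRUE))
rho2_cat <- summary(lm(ry ~ g))$r.squared
```

With a U-shaped relation like this one, the quadratic version (p = 2) captures association that the ordinary rho^2 largely misses.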
spearman computes Spearman's rho on non-missing values of two
variables. spearman.test is a simple version of spearman2.default.
chiSquare is set up like spearman2 except it is intended
for a categorical response variable. Separate Pearson chi-square tests
are done for each predictor, with optional collapsing of infrequent
categories. Numeric predictors having more than g levels are
categorized into g quantile groups.
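As a rough illustration of the chi-square step for one numeric predictor, the base R sketch below cuts the predictor into g quantile groups and runs a Pearson chi-square test against a categorical response. The data are simulated, and `cut` with `quantile` is a stand-in chosen for this sketch; Hmisc may group values with its own routine and different boundary rules.

```r
set.seed(2)
n    <- 200
resp <- factor(sample(c("no", "yes"), n, replace = TRUE))  # categorical response
age  <- round(rnorm(n, 50, 10))                            # numeric predictor, many levels
g    <- 3                                                  # number of quantile groups

# Cut the predictor at its g-quantiles (base R stand-in for the grouping step)
breaks  <- quantile(age, probs = seq(0, 1, length.out = g + 1))
age_grp <- cut(age, breaks = unique(breaks), include.lowest = TRUE)

tab  <- table(age_grp, resp)
test <- chisq.test(tab, correct = FALSE)   # Pearson chi-square, no continuity correction
c(chisq = unname(test$statistic), df = unname(test$parameter), P = test$p.value)
```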
biVar(formula, statinfo, data=NULL, subset=NULL, na.action=na.retain, exclude.imputed=TRUE, ...)

# S3 method for biVar
print(x, ...)

# S3 method for biVar
plot(x, what=info$defaultwhat, sort.=TRUE, main, xlab, vnames=c('names','labels'), ...)

# S3 method for default
spearman2(x, y, p=1, minlev=0, na.rm=TRUE, exclude.imputed=na.rm, ...)

# S3 method for formula
spearman2(formula, data=NULL, subset, na.action=na.retain, exclude.imputed=TRUE, ...)

spearman.test(x, y, p=1)

chiSquare(formula, data=NULL, subset=NULL, na.action=na.retain, exclude.imputed=TRUE, ...)
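A short usage sketch follows, assuming the Hmisc package is installed; the guard skips the example otherwise. The data frame `d` and its columns are simulated for illustration only.

```r
if (requireNamespace("Hmisc", quietly = TRUE)) {
  library(Hmisc)
  set.seed(3)
  d <- data.frame(x1  = rnorm(100),
                  x2  = rnorm(100),
                  sex = factor(sample(c("f", "m"), 100, replace = TRUE)))
  d$y <- d$x1^2 + 0.5 * d$x2 + rnorm(100)

  # Formula interface: each predictor is correlated separately with y
  s <- spearman2(y ~ x1 + x2 + sex, data = d, p = 2)
  print(s)    # print method for biVar objects
  plot(s)     # dot plot, sorted by the plotted statistic by default

  # Single-predictor interface
  spearman2(d$x1, d$y, p = 2)
}
```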
spearman2.default (the function that is called for a single
x, i.e., when there is no
formula) returns a vector of statistics for the variable.
biVar, spearman2.formula, and chiSquare return a
matrix with rows corresponding to predictors.
formula: a formula with a single left side variable
data, subset, na.action: the usual options for models. Default for
na.action is to retain
all values, NA or not, so that NAs can be deleted in only a pairwise fashion.
exclude.imputed: set to FALSE to include imputed values (created by
impute) in the calculations.
...: other arguments that are passed to the function used to
compute the bivariate statistics or to the dot chart function used by plot
na.rm: logical; delete NA values?
x: a numeric matrix with at least 5 rows and at least 2 columns (if
y is absent). For spearman2, the first argument may
be a vector of any type, including character or factor. The first
argument may also be a formula, in which case
spearman2.formula is invoked and each
predictor in the right hand side of the formula is separately correlated
with the response variable. For print and plot, x
is an object produced by biVar. For spearman and spearman.test,
x is a numeric vector, as is y. For chiSquare,
x is a formula.
y: a numeric vector
p: for numeric variables, specifies the order of the Spearman
rho^2 to use. The default is
p=1 to compute the ordinary rho^2. Use
p=2 to compute the quadratic rank
generalization to allow non-monotonicity.
p is ignored for categorical predictors.
minlev: minimum relative frequency that a level of a categorical predictor
should have before it is pooled with other categories (see
combine.levels) in spearman2 and chiSquare (in
which case it also applies to the response). The default,
minlev=0, causes no pooling.
what: specifies which statistic to plot. Possibilities include the column names that appear when the print method is used.
sort.: set sort.=FALSE to suppress sorting variables by the
statistic being plotted
main: main title for plot. Default title shows the name of the response variable.
xlab: x-axis label. Default constructed from what.
vnames: set to "labels" to use variable labels in place of names for
plotting. If a variable does not have a label the name is always used.
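To make the minlev pooling rule concrete, here is a minimal base R sketch. `pool_levels` is a hypothetical helper written for this illustration; the label "OTHER" and the choice to leave a single infrequent level unpooled are assumptions, and the pooling routine used by Hmisc may behave differently in those details.

```r
# Hypothetical helper: pool levels whose relative frequency is below minlev
pool_levels <- function(f, minlev = 0, other = "OTHER") {
  f    <- factor(f)
  freq <- table(f) / sum(!is.na(f))      # relative frequency of each level
  rare <- names(freq)[freq < minlev]
  if (length(rare) < 2) return(f)        # assumption: a lone rare level is left alone
  lev <- levels(f)
  lev[lev %in% rare] <- other            # duplicated labels merge the levels
  levels(f) <- lev
  f
}

f <- factor(c(rep("a", 50), rep("b", 40), rep("c", 6), rep("d", 4)))
table(pool_levels(f, minlev = 0.10))     # c and d are pooled into OTHER
table(pool_levels(f, minlev = 0))        # default: no pooling
```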
Department of Biostatistics
Uses midranks in case of ties, as described by Hollander and Wolfe.
P-values for Spearman, Wilcoxon, or Kruskal-Wallis tests are
approximated by using the t or F distributions.
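The block below sketches midranks and the textbook t and F approximations for a rank correlation in base R. The formulas are the standard ones and are an assumption about the exact computation performed here; the data are simulated.

```r
# Midranks: tied values share the average of the ranks they occupy
rank(c(10, 20, 20, 30))          # 1.0 2.5 2.5 4.0

set.seed(4)
n <- 30
x <- rnorm(n)
y <- x + rnorm(n)

rho <- cor(x, y, method = "spearman")

# t approximation for a single rho, with n - 2 degrees of freedom
tstat <- rho * sqrt((n - 2) / (1 - rho^2))
p_t   <- 2 * pt(-abs(tstat), df = n - 2)

# F approximation for a rho^2 based on k regressors (k = 1 here)
k     <- 1
r2    <- rho^2
Fstat <- (r2 / k) / ((1 - r2) / (n - k - 1))
p_F   <- pf(Fstat, k, n - k - 1, lower.tail = FALSE)
```

With a single regressor the F statistic is the square of the t statistic, so the two P-values agree.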
Hollander M, Wolfe DA (1973): Nonparametric Statistical Methods. New York: Wiley.
Press WH, Flannery BP, Teukolsky SA, Vetterling WT (1988): Numerical Recipes in C. Cambridge: Cambridge University Press.