Finds the numeric variables and ignores the others (see
summarizeFactors for a function that handles non-numeric variables).
It provides quantiles (which ones is specified by probs) as well as
other summary statistics, as specified by stats. Results are returned
in a data frame. Compared with R's default summary, the main benefits
are: 1) more summary information is returned for each variable
(including dispersion), 2) the results are returned in a form that is
easy to use in further analysis, and 3) the columns in the output may
be alphabetized.
summarizeNumerics(dat, alphaSort = FALSE, probs = c(0, 0.5, 1),
stats = TRUE, na.rm = TRUE, unbiased = TRUE)
dat: A data frame or a matrix.
alphaSort: If TRUE, the columns are re-organized in alphabetical order. If FALSE, they are presented in the original order.
probs: Controls calculation of quantiles. If FALSE, no quantile
estimates are provided. If TRUE, the quantile function is called with
probs = c(0, 0.5, 1.0), corresponding to the labels
c("min", "med", "max") that appear in the output. Users may specify
any vector of real values in [0, 1]. In the output, however, labels
will be c("min", "med", "max") or "pctile_dd", for clarity.
stats: Can be TRUE/FALSE or a vector of desired summary statistics. The full set of allowed labels is c("mean", "sd", "var", "skewness", "kurtosis", "nobs", "nmiss"). If TRUE (default), the result includes everything except the variance; that is, TRUE is the same as c("mean", "sd", "skewness", "kurtosis", "nobs", "nmiss"). If FALSE, none of these are provided. "nobs" is the number of observations with non-missing, finite scores (not NA, NaN, -Inf, or Inf). "nmiss" is the number of cases with values of NA.
na.rm: Should missing data be removed? Default is TRUE.
unbiased: If TRUE (default), skewness and kurtosis are calculated with the bias-corrected (N-1) divisor in the standard deviation.
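
For a single numeric vector, the quantities controlled by probs, stats, and unbiased can be sketched in base R as follows. This is only an illustration of the definitions given in the argument descriptions, not the function's internal code: the helper momentRatio and all variable names are hypothetical, and the skewness and kurtosis formulas (mean powered deviation divided by the powered standard deviation, with no subtraction of 3 for kurtosis) are assumptions.

```r
# Illustrative base R sketch; all names here are hypothetical.
x <- c(1, 2, 3, 4, 10, NA)

# probs: quantile() labels its result "0%", "50%", "100%"; relabel it
# to the "min", "med", "max" labels used for the default probs.
q <- quantile(x, probs = c(0, 0.5, 1), na.rm = TRUE)
names(q) <- c("min", "med", "max")

# stats: "nobs" counts finite, non-missing scores; is.finite() is FALSE
# for NA, NaN, Inf, and -Inf.
nobs <- sum(is.finite(x))

# "nmiss" counts NA values. Note that base R's is.na() is also TRUE for
# NaN; how the function treats NaN here is not assumed.
nmiss <- sum(is.na(x))

# unbiased: the standard deviation in the denominator uses divisor
# N - 1 (unbiased = TRUE) or N (unbiased = FALSE). Assumed formula.
momentRatio <- function(v, power, unbiased = TRUE) {
  v <- v[is.finite(v)]
  n <- length(v)
  dev <- v - mean(v)
  s <- sqrt(sum(dev^2) / (if (unbiased) n - 1 else n))
  mean(dev^power) / s^power
}

skewUnbiased <- momentRatio(x, power = 3, unbiased = TRUE)
skewBiased   <- momentRatio(x, power = 3, unbiased = FALSE)
kurtUnbiased <- momentRatio(x, power = 4, unbiased = TRUE)

round(c(q, nobs = nobs, nmiss = nmiss,
        skewness = skewUnbiased, kurtosis = kurtUnbiased), 3)
```

Because the N-1 divisor gives a larger standard deviation than the N divisor, the unbiased = TRUE versions of these ratios are smaller in magnitude than the unbiased = FALSE versions for the same data.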
a data.frame with one column per summary element, with the rows representing the variables.
summarize and summarizeFactors
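
A minimal usage sketch, assuming that summarizeNumerics is the function exported by the rockchalk package and that the package is installed (the call is skipped otherwise). The data frame dat and its column names are made up for illustration; only arguments shown in the usage line are used.

```r
# Hypothetical example data: two numeric columns and one factor.
set.seed(1234)
dat <- data.frame(
  x2 = rnorm(20, mean = 50, sd = 10),
  x1 = runif(20),
  g  = factor(sample(c("a", "b"), 20, replace = TRUE))
)

# Assumption: the function comes from the rockchalk package.
if (requireNamespace("rockchalk", quietly = TRUE)) {
  # Default settings; the factor g is ignored.
  print(rockchalk::summarizeNumerics(dat))

  # Same summary with the variables in alphabetical order.
  print(rockchalk::summarizeNumerics(dat, alphaSort = TRUE))
}
```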