rockchalk (version 1.8.111)

summarizeNumerics: Extracts numeric variables and presents a summary in a workable format.

Description

Finds the numeric variables and ignores the others. (See summarizeFactors for a function that handles non-numeric variables.) It provides quantiles (which ones is specified by probs) as well as other summary statistics, as specified by stats. Results are returned in a data frame. The main benefits compared to R's default summary are: 1) more summary information is returned for each variable (such as dispersion), 2) the results are returned in a form that is easy to use in further analysis, and 3) the columns in the output may be alphabetized.

Usage

summarizeNumerics(dat, alphaSort = FALSE, probs = c(0, 0.5, 1),
  stats = TRUE, na.rm = TRUE, unbiased = TRUE)

Arguments

dat

a data frame or a matrix

alphaSort

If TRUE, the columns are re-organized in alphabetical order. If FALSE, they are presented in the original order.

probs

Controls calculation of quantiles. If FALSE, no quantile estimates are provided. If TRUE, the quantile function is called with probs = c(0, 0.5, 1.0), corresponding to labels which will appear in output, c("min", "med", "max"). Users may specify any vector of real values in [0,1]. In output, however, labels will be c("min", "med", "max") or "pctile_dd", for clarity.
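The default quantile labels described above can be illustrated with base R alone. This is a minimal sketch, not rockchalk's internal code: the vector `x` is made-up data, and the renaming step only mimics the documented "min", "med", "max" labels for the default probs.

```r
# Hypothetical data with one missing value
x <- c(3, 7, NA, 1, 9)

# Default probabilities from the Usage section
probs <- c(0, 0.5, 1)

# Base R quantile(); na.rm = TRUE drops the NA before computing
q <- quantile(x, probs = probs, na.rm = TRUE)

# Base R labels these "0%", "50%", "100%"; relabel to match the
# documented output labels (an illustration, not the package's code)
names(q) <- c("min", "med", "max")
print(q)
```

Setting probs to other values in [0, 1] changes which quantiles `quantile()` returns; the sketch does not attempt to reproduce the package's labels for those cases.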

stats

Can be TRUE/FALSE or a vector of desired summary stats. The full set of allowed labels is c("mean", "sd", "var", "skewness", "kurtosis", "nobs", "nmiss"). If TRUE (default), the result includes everything except variance. That is, TRUE is the same as c("mean", "sd", "skewness", "kurtosis", "nobs", "nmiss"). If FALSE, none of these are provided. "nobs" means the number of observations with non-missing, finite scores (not NA, NaN, -Inf, or Inf). "nmiss" is the number of cases with values of NA.
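The difference between "nobs" and "nmiss" can be shown with a short base R sketch. It follows the definitions given above and is not taken from the package source; the vector `x` is made-up data. Note that base R's `is.na()` also returns TRUE for NaN, so the example avoids NaN rather than guess how the package counts it.

```r
# Hypothetical data: three finite values, two NA, one Inf
x <- c(1.5, 2, NA, Inf, 4, NA)

# "nobs": observations that are non-missing and finite
nobs <- sum(is.finite(x))

# "nmiss": cases with values of NA
nmiss <- sum(is.na(x))

# Inf is neither counted as an observation nor as missing,
# so nobs + nmiss can be smaller than length(x)
c(nobs = nobs, nmiss = nmiss, n = length(x))
```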

na.rm

Should missing data be removed? Default is TRUE.

unbiased

If TRUE (default), skewness and kurtosis are calculated with the bias-corrected (N-1) divisor in the standard deviation.
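To show what the choice of divisor changes, here is a sketch under stated assumptions. The helpers `sd_sketch`, `skew_sketch`, and `kurt_sketch` are hypothetical names, not rockchalk functions. The sketch assumes skewness and kurtosis are the mean third and fourth central moments divided by the matching power of the standard deviation, and it reports plain (not excess) kurtosis; the package's exact formulas may differ.

```r
# Standard deviation with an N-1 (unbiased = TRUE) or N divisor
sd_sketch <- function(x, unbiased = TRUE) {
  n <- length(x)
  divisor <- if (unbiased) n - 1 else n
  sqrt(sum((x - mean(x))^2) / divisor)
}

# Third central moment scaled by sd^3 (assumed definition)
skew_sketch <- function(x, unbiased = TRUE) {
  x <- x[!is.na(x)]
  mean((x - mean(x))^3) / sd_sketch(x, unbiased)^3
}

# Fourth central moment scaled by sd^4 (assumed definition, no "- 3")
kurt_sketch <- function(x, unbiased = TRUE) {
  x <- x[!is.na(x)]
  mean((x - mean(x))^4) / sd_sketch(x, unbiased)^4
}

x <- c(1, 2, 3, 4, 10)  # hypothetical right-skewed data

skew_u <- skew_sketch(x, unbiased = TRUE)
skew_b <- skew_sketch(x, unbiased = FALSE)
kurt_u <- kurt_sketch(x, unbiased = TRUE)
kurt_b <- kurt_sketch(x, unbiased = FALSE)

# The N-1 divisor gives a larger sd, so both statistics shrink
round(c(skew_u = skew_u, skew_b = skew_b,
        kurt_u = kurt_u, kurt_b = kurt_b), 3)
```

With small samples the two divisors give noticeably different values; the gap narrows as N grows because (N-1)/N approaches 1.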

Value

a data.frame with one column per summary element and the rows representing the variables.
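A minimal sketch of a call, assuming the rockchalk package is installed. The data frame `dat` and its column names are made up for illustration, and the argument values simply follow the Usage section; no particular output is claimed here.

```r
# Hypothetical example data: two numeric columns and one factor
dat <- data.frame(
  income = c(32.5, 41.0, NA, 58.2, 47.9),
  age    = c(23, 35, 41, 29, 52),
  region = factor(c("N", "S", "S", "E", "W"))
)

res <- NULL
if (requireNamespace("rockchalk", quietly = TRUE)) {
  # Only the numeric columns (income, age) should be summarized;
  # the factor column is expected to be ignored
  res <- rockchalk::summarizeNumerics(
    dat,
    alphaSort = TRUE,
    probs = c(0, 0.5, 1),
    stats = c("mean", "sd", "nobs", "nmiss")
  )
  print(res)
}
```

Because the result is a regular R object rather than printed text, individual entries can be pulled out with ordinary indexing for use in later analysis.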

See Also

summarize and summarizeFactors