This function describes a distribution by a set of indices (e.g., measures of centrality, dispersion, range, skewness, kurtosis).
describe_distribution(x, ...)
# S3 method for list
describe_distribution(
x,
centrality = "mean",
dispersion = TRUE,
iqr = TRUE,
range = TRUE,
quartiles = FALSE,
ci = NULL,
include_factors = FALSE,
iterations = 100,
threshold = 0.1,
verbose = TRUE,
...
)
# S3 method for numeric
describe_distribution(
x,
centrality = "mean",
dispersion = TRUE,
iqr = TRUE,
range = TRUE,
quartiles = FALSE,
ci = NULL,
iterations = 100,
threshold = 0.1,
verbose = TRUE,
...
)
# S3 method for factor
describe_distribution(x, dispersion = TRUE, range = TRUE, verbose = TRUE, ...)
# S3 method for character
describe_distribution(x, dispersion = TRUE, range = TRUE, verbose = TRUE, ...)
# S3 method for data.frame
describe_distribution(
x,
centrality = "mean",
dispersion = TRUE,
iqr = TRUE,
range = TRUE,
quartiles = FALSE,
include_factors = FALSE,
ci = NULL,
iterations = 100,
threshold = 0.1,
verbose = TRUE,
...
)
A numeric vector, a character vector, a data frame, or a list. See Details.
Additional arguments to be passed to or from methods.
The point-estimates (centrality indices) to compute. Character (vector) or list with one or more of these options: "median", "mean", "MAP" or "all".
Logical, if TRUE, computes indices of dispersion related to the estimate(s) (SD and MAD for mean and median, respectively).
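To make the two dispersion indices concrete, here is a minimal base R sketch of SD and MAD on an arbitrary example vector. It assumes the indices correspond to `stats::sd()` and `stats::mad()` with their default settings, which this page does not confirm.

```r
# Arbitrary example data (not taken from the documentation)
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Standard deviation: the dispersion index paired with the mean
sd_x <- stats::sd(x)

# Median absolute deviation: the dispersion index paired with the median.
# stats::mad() scales by the constant 1.4826 by default; whether
# describe_distribution() applies the same scaling is an assumption here.
mad_x <- stats::mad(x)

print(c(SD = sd_x, MAD = mad_x))
```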
Logical, if TRUE, the interquartile range is calculated (based on stats::IQR(), using type = 6).
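The choice of quantile type changes the result on small samples. The base R sketch below compares `stats::IQR()` with `type = 6` against the base R default (`type = 7`); the vector `x` is an arbitrary example.

```r
# Arbitrary example data: the integers 1 to 10
x <- 1:10

# Interquartile range with quantile type 6 (the type named above)
iqr_type6 <- stats::IQR(x, type = 6)

# Interquartile range with the base R default, quantile type 7
iqr_type7 <- stats::IQR(x, type = 7)

print(c(type6 = iqr_type6, type7 = iqr_type7))
# type 6 gives 5.5, type 7 gives 4.5
```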
Return the range (min and max).
Return the first and third quartiles (25th and 75th percentiles).
Confidence Interval (CI) level. Default is NULL, i.e. no confidence intervals are computed. If not NULL, confidence intervals are based on bootstrap replicates (see iterations). If centrality = "all", the bootstrapped confidence interval refers to the first centrality index (which is typically the median).
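The idea of a bootstrapped confidence interval can be sketched in base R as follows. This is an illustration only: it uses a simple percentile bootstrap of the mean, which is an assumption and not necessarily the procedure the function uses internally. The variable names mirror the `ci` and `iterations` arguments for readability.

```r
set.seed(123)

# Arbitrary example data
x <- rnorm(50, mean = 10, sd = 2)

# Names mirror the function arguments; the values are illustrative
iterations <- 100
ci <- 0.95

# Resample x with replacement and recompute the mean for each replicate
boot_means <- replicate(iterations, mean(sample(x, replace = TRUE)))

# Percentile interval: cut off (1 - ci) / 2 in each tail
alpha <- 1 - ci
ci_bounds <- stats::quantile(
  boot_means,
  probs = c(alpha / 2, 1 - alpha / 2),
  names = FALSE
)

print(ci_bounds)
```

With only 100 replicates the interval bounds are fairly noisy, so a larger number of replicates is a reasonable choice when the bounds themselves matter.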
Logical, if TRUE, factors are included in the output; however, only the columns for range (first and last factor levels), n, and missing will contain information.
The number of bootstrap replicates for computing confidence intervals. Only applies when ci is not NULL.
For centrality = "trimmed" (i.e. trimmed mean), indicates the fraction (0 to 0.5) of observations to be trimmed from each end of the vector before the mean is computed.
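As a quick illustration of what trimming does, the base R sketch below computes a trimmed mean with `mean(x, trim = ...)`. It assumes the trimmed mean behaves like base R's `trim` argument, which matches the description above but is not confirmed by this page. The vector `x` is an arbitrary example with one outlier.

```r
# Arbitrary example data with one extreme value
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)

# Fraction trimmed from each end; mirrors the default threshold = 0.1
threshold <- 0.1

# With 10 observations and trim = 0.1, one value is dropped from each end
trimmed <- mean(x, trim = threshold)
untrimmed <- mean(x)

print(c(trimmed = trimmed, untrimmed = untrimmed))
# trimmed is 5.5, untrimmed is 14.5
```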
Toggle warnings and messages.
A data frame with columns that describe the properties of the variables.
If x is a data frame, only numeric variables are kept and will be displayed in the summary.
If x is a list, the behavior depends on whether x is a stored list. If x is stored (for example, describe_distribution(mylist) where mylist was created beforehand), artificial variable names are used in the summary (Var_1, Var_2, etc.). If x is an unstored list (for example, describe_distribution(list(mtcars$mpg))), then "mtcars$mpg" is used as the variable name.
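# Numeric vector
describe_distribution(rnorm(100))
# Data frame: only numeric variables are summarized by default
data(iris)
describe_distribution(iris)
# Include factors and report the first and third quartiles
describe_distribution(iris, include_factors = TRUE, quartiles = TRUE)
# Unstored list: variable names are taken from the list expressions
describe_distribution(list(mtcars$mpg, mtcars$cyl))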