Cross categorical variables with numeric variables, and get a table of means and standard deviations.
tab_num(
  data,
  row_var,
  col_vars,
  tab_vars,
  wt,
  diff = "tot",
  ci = NULL,
  conf_level = 0.95,
  comp = c("tab", "all"),
  color = c("auto", "diff", "diff_ci", "after_ci"),
  digits = 0,
  na = c("keep", "drop", "drop_fct", "drop_num"),
  totaltab = "line",
  totaltab_name = "Ensemble",
  tot = NULL,
  total_names = "Total",
  subtext = "",
  num = FALSE,
  df = FALSE
)
A data frame.
The row variable, which will be printed with one level per line. If numeric, it will be used as a factor.
The numeric variables, which will appear in columns: means and standard deviations are calculated for each level of row_var and tab_vars.
<tidy-select> Tab variables: a subtable is made for each combination of levels of the selected variables. Leave empty to make a simple cross-table. All tab variables are converted to factors.
A weight variable, of class numeric. Leave empty for unweighted results.
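To make the description above concrete, here is a minimal base R sketch of the kind of per-level summary involved: a mean and standard deviation for each level of a row variable, plus a weighted mean. The data frame toy and its columns group, value and w are hypothetical, and this is not the package's internal code; in particular, how tab_num computes weighted standard deviations is not stated here.

```r
# Hypothetical toy data, not taken from the package documentation.
toy <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  value = c(10, 20, 5, 15, 25),
  w     = c(1, 3, 2, 1, 1)
)

# Unweighted mean and standard deviation for each level of the row variable.
means <- tapply(toy$value, toy$group, mean)
sds   <- tapply(toy$value, toy$group, sd)

# Weighted mean for each level, using the weight column.
wmeans <- sapply(split(toy, toy$group),
                 function(d) weighted.mean(d$value, d$w))
```

With these weights the two groups share the same unweighted mean but differ once weighted, which is the situation where supplying wt changes the table.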
The reference cell used to calculate differences (which determine the printed colors):
"tot": by default, cell differences from total rows are calculated with pct = "row", and cell differences from total columns with pct = "col".
"first": calculate cell differences from the first cell of the row or column (useful to color temporal developments).
"no": do not calculate differences, to save computation time.
The type of confidence intervals to calculate, passed to tab_ci (automatically added if needed for color).
"cell": absolute confidence intervals of cell percentages.
"diff": confidence intervals of the difference between a cell and the relative total cell (or relative first cell when diff = "first").
"auto": ci = "diff" for means and row/col percentages, ci = "cell" for frequencies ("all", "all_tabs").
By default, for percentages, Wilson's method is used with ci = "cell", and Wald's method with Agresti and Caffo's adjustment is used with ci = "diff". Means use the classic method. This can be changed in tab_ci.
The confidence level for the confidence intervals, as a single number between 0 and 1. Defaults to 0.95 (95%).
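As an illustration of how a confidence level translates into an interval around a mean, here is a minimal base R sketch. It assumes the "classic method" means a normal-approximation interval (mean plus or minus a z quantile times the standard error); that reading is an assumption, and the helper mean_ci is hypothetical, not a tabxplor function.

```r
# Hypothetical helper: normal-approximation confidence interval for a mean.
# Assumption: this may differ from what tab_ci actually computes.
mean_ci <- function(x, conf_level = 0.95) {
  x  <- x[!is.na(x)]                  # drop NAs, so n counts complete values only
  n  <- length(x)
  se <- sd(x) / sqrt(n)               # standard error of the mean
  z  <- qnorm((1 + conf_level) / 2)   # two-sided normal quantile
  c(mean = mean(x), lower = mean(x) - z * se, upper = mean(x) + z * se)
}

ci <- mean_ci(c(10, 12, 14, 16, 18, NA))
```

Raising conf_level widens the interval, since the quantile z grows with it.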
Comparison level. When tab_vars are present, should the contributions to variance be calculated for each subtable/group (by default, comp = "tab")? Should they be calculated for the whole table (comp = "all")? comp must be set once and for all the first time you use tab_plain, tab_num or tab_chi2 with rows, or tab_ci.
TRUE: print percentages and means in color, based on cell differences from totals or from the reference cell, as provided by diff. Defaults to FALSE, no colors.
The number of digits to print, as a single integer.
The policy to adopt for missing values in row and tab variables (factors), as a single string.
"keep": by default, NAs of row and tab variables are printed as an explicit "NA" level.
"drop": remove NAs in row and tab variables.
NAs in numeric variables are always removed when calculating means. For that reason, the n field of each resulting fmt column, used to calculate confidence intervals, only takes into account the complete observations (without NA). To drop all rows with NA in any numeric variable first, use tab_prepare or tab_many with the na_drop_all argument.
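The two policies and the handling of numeric NAs can be mimicked in base R. The sketch below is only an analogy on hypothetical data (toy, g and x are made-up names) and does not call tabxplor; it shows an explicit "NA" factor level, the dropping of rows with a missing factor value, and an n that counts only non-missing numeric values.

```r
# Hypothetical toy data with a missing group and a missing numeric value.
toy <- data.frame(
  g = c("a", "a", NA, "b"),
  x = c(1, NA, 3, 5)
)

# "keep"-like behavior: turn NA in the factor into an explicit "NA" level.
f_keep <- addNA(factor(toy$g))
levels(f_keep)[is.na(levels(f_keep))] <- "NA"

# "drop"-like behavior: remove rows whose factor value is NA.
toy_drop <- toy[!is.na(toy$g), ]

# NAs in the numeric variable are removed before averaging, and n counts
# only the non-missing values in each level.
means_keep <- tapply(toy$x, f_keep, mean, na.rm = TRUE)
n_keep     <- tapply(toy$x, f_keep, function(v) sum(!is.na(v)))
```

Level "a" has two rows but only one usable value, so its n is 1; this is why interval widths can reflect fewer observations than the row counts suggest.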
The total table, if there are subtables/groups (i.e. when tab_vars is provided):
"line": by default, add a general total line (necessary for calculations with comp = "all").
"table": add a complete total table (i.e. row_var by col_vars without tab_vars).
"no": do not draw any total table.
The name of the total table, as a single string.
The totals:
c("col", "row") or "both": by default, both total rows and total columns.
"row": only total rows.
"col": only total columns.
"no": remove all totals (after calculations if needed).
The names of the totals, as a character vector of length one or two. Use syntax of type c("Total row", "Total column") to set different names for rows and columns.
A character vector to print rows of legend under the table.
Set to TRUE to obtain a table with normal numeric vectors (not fmt).
Set to TRUE to obtain a plain data.frame (not a tibble), with normal numeric vectors (not fmt). Useful, for example, to pass the table to correspondence analysis with FactoMineR.
A tibble of class tabxplor_tab. If ... (tab_vars) are provided, a tab of class tabxplor_grouped_tab. All non-text columns are fmt vectors of class tabxplor_fmt, storing all the data necessary to print formats and colors. Columns with row_var and tab_vars are of class factor: every added factor will be considered as a tab_vars and used for grouping. To add text columns without using them in calculations, be sure they are of class character.
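library(tabxplor)
library(dplyr)  # provides the %>% pipe and the storms dataset

# Prepare the variables, dropping rows where wind is NA
data <- dplyr::storms %>% tab_prepare(category, wind, na_drop_all = wind)
# Mean wind by category, with total rows only and color = "after_ci"
tab_num(data, category, wind, tot = "row", color = "after_ci")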