This function crosstabulates the frequencies of one categorical variable
within the groups of another. The results are sorted on the values of the
variable whose distribution is shown in each column, i.e. the variable
specified with row_cat. If this variable is a character vector it
will be sorted alphabetically. If it is a factor it will be sorted in the
order of its levels.
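The kind of table described above can be sketched in base R. This is only an illustration of the idea, not tabbycat's implementation; the data frame and its column names (size, group) are made up for the example.

```r
# Toy data; the column names are hypothetical.
df <- data.frame(
  size = c("small", "large", "small", "medium", "large", "small"),
  group = c("a", "a", "b", "b", "b", "a"),
  stringsAsFactors = FALSE
)

# Counts: rows are the values of `size`, columns are the groups in `group`.
counts <- table(df$size, df$group)

# Column-wise proportions: the distribution of `size` within each group.
props <- prop.table(counts, margin = 2)

# A character vector is tabulated in alphabetical order.
char_order <- rownames(counts)

# A factor is tabulated in the order of its levels.
df$size <- factor(df$size, levels = c("small", "medium", "large"))
factor_order <- rownames(table(df$size, df$group))
```

Each column of props sums to 1, which is what "the distribution of one variable within the groups of another" means here.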
cat_compare(
data,
row_cat,
col_cat,
na.rm.row = FALSE,
na.rm.col = FALSE,
na.rm = NULL,
only = "",
clean_names = getOption("tabbycat.clean_names"),
na_label = getOption("tabbycat.na_label")
)
A tibble showing the distribution of row_cat within each
group in col_cat.
data: A dataframe containing the two variables of interest.
row_cat: The column name of a categorical variable whose distribution
will be calculated for each group in col_cat.
col_cat: The column name of a categorical variable which will be
split into groups, with the distribution of row_cat calculated
for each group.
na.rm.row: A boolean indicating whether to exclude NAs from the row results. The default is FALSE.
na.rm.col: A boolean indicating whether to exclude NAs from the column results. The default is FALSE.
na.rm: A boolean indicating whether to exclude NAs from both row and
column results. This argument is provided as a convenience. It allows you
to set na.rm.row and na.rm.col to the same value without
having to specify them separately. If the value of na.rm is NULL,
the argument is ignored. If it is not NULL, it takes precedence over
na.rm.row and na.rm.col. The default is NULL.
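The precedence rule can be sketched as follows. The helper resolve_na_rm is hypothetical and exists only to illustrate the documented behaviour; it is not part of tabbycat's API.

```r
# Hypothetical helper, not part of tabbycat: shows how na.rm, when it
# is not NULL, overrides both na.rm.row and na.rm.col.
resolve_na_rm <- function(na.rm.row = FALSE, na.rm.col = FALSE, na.rm = NULL) {
  if (!is.null(na.rm)) {
    na.rm.row <- na.rm
    na.rm.col <- na.rm
  }
  list(row = na.rm.row, col = na.rm.col)
}

# na.rm is NULL, so the separate flags are used as given.
separate <- resolve_na_rm(na.rm.row = TRUE)

# na.rm is not NULL, so it overrides both flags.
overridden <- resolve_na_rm(na.rm.row = TRUE, na.rm = FALSE)
```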
only: A string indicating that only one set of frequency columns
should be returned in the results. If only is either "n" or
"number", only the number columns are returned. If only is either
"p" or "percent", only the percent columns are returned. If only is
any other value, both sets of columns are shown. The default value is an
empty string, which means both sets of columns are shown.
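The mapping from the value of only to the returned column sets can be sketched like this. The helper columns_to_return is hypothetical, and the labels "number" and "percent" stand in for the two sets of columns rather than for actual column names.

```r
# Hypothetical helper, not part of tabbycat: maps a value of `only`
# to the sets of frequency columns that would be returned.
columns_to_return <- function(only = "") {
  if (only %in% c("n", "number")) {
    "number"
  } else if (only %in% c("p", "percent")) {
    "percent"
  } else {
    # Any other value, including the default "", returns both sets.
    c("number", "percent")
  }
}

default_sets <- columns_to_return()
number_only <- columns_to_return("n")
percent_only <- columns_to_return("percent")
```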
clean_names: A boolean indicating whether the column names of the
results tibble should be cleaned, so that any column names produced from
data are converted to snake_case. The default is TRUE, but this can be
changed with options(tabbycat.clean_names = FALSE).
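Changing the option for a session uses base R's options() and getOption(). The sketch below saves the previous value and restores it afterwards; it does not need tabbycat to be installed, in which case the option simply starts out unset.

```r
# options() returns the previous values, so they can be restored later.
old <- options(tabbycat.clean_names = FALSE)

# While the option is set, this is the value cat_compare() would pick
# up as the default for clean_names.
during <- getOption("tabbycat.clean_names")

# Restore the previous setting.
options(old)
```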
na_label: A string indicating the label to use for the columns that contain data for missing values. The default value is "na"; set a different value if the default collides with values in your data.
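A minimal usage sketch, assuming the tabbycat package is installed and that column names are passed as strings. The data frame and its columns are made up for the example, and the calls are guarded so the code still runs when the package is not available.

```r
# Toy data with missing values in both variables; names are hypothetical.
df <- data.frame(
  size = c("small", "large", NA, "medium", "large", "small"),
  group = c("a", "a", "b", NA, "b", "a")
)

if (requireNamespace("tabbycat", quietly = TRUE)) {
  # Number and percent columns, with NAs kept in rows and columns.
  print(tabbycat::cat_compare(df, "size", "group"))

  # Drop NAs from both rows and columns and return percentages only.
  print(tabbycat::cat_compare(df, "size", "group", na.rm = TRUE, only = "p"))

  # Use a custom label for the missing-value columns.
  print(tabbycat::cat_compare(df, "size", "group", na_label = "missing"))
}
```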