group_var(x, groupsize = 5, as.num = TRUE, right.interval = FALSE, groupcount = 30)
group_labels(x, groupsize = 5, right.interval = FALSE, groupcount = 30)
x
The variable that should be recoded into groups.

groupsize
The group size, i.e. the range covered by each group. By default, a new group is defined for every 5 values of x, i.e. groupsize = 5. Use groupsize = "auto" to automatically resize a variable into a maximum of 30 groups (which is the ggplot default grouping when plotting histograms). Use groupcount to determine the number of groups.

as.num
If TRUE, the recoded variable is returned as a numeric vector. If FALSE, a factor is returned.

right.interval
If TRUE, grouping starts with the lower bound of groupsize. See 'Details'.

groupcount
The maximum number of groups created when auto-grouping is on (groupsize = "auto"). Default is 30. If groupsize is not set to "auto", this argument is ignored.

For group_var, a grouped variable, either as numeric or as factor (see argument as.num).
For group_labels, a string vector or a list of string vectors containing labels based on the grouped categories of x, formatted as "from lower bound to upper bound", e.g. "10-19" "20-29" "30-39" etc. See 'Examples'.
If groupsize is set to a specific value, the variable is recoded into several groups, where each group has a maximum range of groupsize. Hence, the number of groups differs depending on the range of x.
If groupsize = "auto", the variable is recoded into a maximum of groupcount groups. Hence, regardless of the range of x, the same number of groups is created, so the range within each group differs (depending on x's range).
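The difference between the two modes can be sketched in base R. This is a minimal illustration and not the package's implementation: the vector x, the helper names (fixed_grp, auto_grp, auto_size) and the small groupcount of 3 are assumptions chosen to keep the numbers easy to follow.

```r
# Base R sketch only; group_var() itself is not called here.
x <- c(50, 53, 58, 61, 67, 72, 80)

# Fixed group size: every group spans `groupsize` units, so the
# number of groups follows from the range of x.
groupsize <- 5
fixed_grp <- (x - min(x)) %/% groupsize + 1
n_fixed <- length(unique(fixed_grp))

# Automatic group size: fix the maximum number of groups first and
# derive the width of each group from the range of x.
groupcount <- 3
auto_size <- ceiling((max(x) - min(x)) / groupcount)
# pmin() folds the maximum of x into the last group
auto_grp <- pmin((x - min(x)) %/% auto_size + 1, groupcount)
n_auto <- length(unique(auto_grp))

print(fixed_grp)
print(auto_grp)
```

With a fixed width of 5 the seven values fall into six occupied groups; with three groups requested, the width becomes 10 and exactly three groups result.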
right.interval determines which boundary values to include when grouping is done. If TRUE, grouping starts with the lower bound of groupsize. For example, for a variable ranging from 50 to 80, groups cover the ranges 50-54, 55-59, 60-64 etc.
If FALSE (default), grouping starts with the upper bound of groupsize. In this case, groups cover the ranges 46-50, 51-55, 56-60, 61-65 etc. Note: The first group then covers the range 46-50, even if the values 46 to 49 are not present. See 'Examples'.
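The two boundary conventions described above can be reproduced with base R's cut(). This is only a sketch: the vector x, the break sequences and the label construction are illustrative assumptions, and it uses cut()'s own right argument rather than right.interval, so it makes no claim about how group_var handles boundaries internally.

```r
# Base R sketch of the two boundary conventions; not sjmisc code.
x <- c(50, 52, 55, 60, 61, 80)

# Groups that start at the lower bound: [50,55), [55,60), ...
b1 <- seq(50, 85, by = 5)
lower_first <- cut(
  x, breaks = b1, right = FALSE,
  labels = paste0(head(b1, -1), "-", tail(b1, -1) - 1)
)

# Groups that end at the upper bound: (45,50], (50,55], ...
b2 <- seq(45, 80, by = 5)
upper_first <- cut(
  x, breaks = b2, right = TRUE,
  labels = paste0(head(b2, -1) + 1, "-", tail(b2, -1))
)

print(as.character(lower_first))
print(as.character(upper_first))
```

In the second variant the value 50 lands in a group labelled "46-50", although 46 to 49 do not occur in x.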
If you want to split a variable into a certain number of equal-sized groups (instead of groups that all cover the same range of values), use the split_var function.
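The distinction between equal-range and equal-sized groups can be illustrated in base R. This is a hedged sketch, not the implementation of group_var or split_var: the data, the break points and the rank-based split are assumptions made for the example.

```r
# Base R sketch; neither group_var() nor split_var() is called here.
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)

# Equal range: both groups span 50 units, but hold very
# different numbers of observations.
by_range <- cut(x, breaks = c(0, 50, 100), right = TRUE)

# Equal size: split on ranks so both groups hold the same
# number of observations, whatever range they span.
n_groups <- 2
by_size <- ceiling(rank(x, ties.method = "first") / (length(x) / n_groups))

print(table(by_range))
print(table(by_size))
```

The equal-range split puts nine values in the first group and one in the second, while the rank-based split yields five values per group.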
split_var to split variables into equal-sized groups, group_str for grouping string vectors, or rec_pattern and rec for another convenient way of recoding variables into smaller groups.
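# load the package that provides group_var() and group_labels()
# (assumed here to be sjmisc, which also ships the efc data)
library(sjmisc)
# simulate 100 ages and group them into ranges of 10
age <- abs(round(rnorm(100, 65, 20)))
age.grp <- group_var(age, 10)
hist(age)
hist(age.grp)
# labels for the grouped variable
age.grpvar <- group_labels(age, 10)
table(age.grp)
print(age.grpvar)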
# histogram with EUROFAMCARE sample dataset
# variable not grouped
data(efc)
hist(efc$e17age, main = get_label(efc$e17age))
# bar plot with EUROFAMCARE sample dataset
# grouped variable
ageGrp <- group_var(efc$e17age)
ageGrpLab <- group_labels(efc$e17age)
barplot(table(ageGrp), main = get_label(efc$e17age), names.arg = ageGrpLab)
# within a pipe-chain
library(dplyr)
efc %>% select(e17age, c12hour, c160age) %>% group_var(groupsize = 20)
# create vector with values from 50 to 80
dummy <- round(runif(200, 50, 80))
# labels with grouping starting at lower bound
group_labels(dummy)
# labels with grouping starting at upper bound
group_labels(dummy, right.interval = TRUE)