get_num_label_repetitions: Get the number of sites that have at least k trials of each label level
Description
Calculates the number of sites that have at least k repetitions of each
label level, for all values of k. This information is useful for assessing how to
set the number of cross-validation splits (and repeats of labels per
cross-validation split) to use in a datasource. One can also assess the
number of label level repetitions separately conditioned on another site_info
variable. For example, if one has recordings from different brain regions,
and the brain region information is contained in a site_info variable, then
one could calculate how many sites have at least k repetitions for each
stimulus in each brain region.
Value
A data frame of class label_repetition, which allows the
results to be plotted. The returned data frame has a row for each label
level and one column for each integer value k = 0, 1, ... The values
in the data frame show the number of sites that have at least k repetitions
of a given label level.
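To make the counted quantity concrete, the following is a minimal base R sketch on a toy trial table. It is an illustration only, not the function's implementation, and the column names siteID and label are hypothetical stand-ins for the corresponding binned-format variables.

```r
# Toy trial table: one row per trial (hypothetical column names).
trials <- data.frame(
  siteID = c(1, 1, 1, 1, 2, 2, 2, 3, 3),
  label  = c("car", "car", "face", "face", "car", "face", "face", "car", "car")
)

# Repetitions of each label level at each site (rows: sites, columns: levels).
# Site 3 has no "face" trials, so its "face" count is 0.
reps_per_site <- table(trials$siteID, trials$label)

# For each k, count the sites with at least k repetitions of each label level.
ks <- 0:max(reps_per_site)
num_sites <- sapply(ks, function(k) colSums(reps_per_site >= k))
colnames(num_sites) <- paste0("k_", ks)

print(num_sites)
```

The k = 0 column always equals the total number of sites, and the counts can only decrease as k grows. The largest k for which every label level still has enough sites is a natural upper bound on the number of cross-validation splits.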
Arguments
binned_data
A string giving the path to a file that has data in
binned format, or a data frame that is in binned format.
labels
A string specifying which label variable should be
used for calculating the minimum number of level repetitions.
site_info_grouping_name
A character string naming a site_info variable. If set, the
number of sites that have at least k repetitions is computed separately
for each level of that variable.
label_levels
A character vector specifying which levels to include.
If not set, all levels will be used.
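The effect of restricting label levels and grouping by a site_info variable can be sketched in base R as follows. This is a toy illustration under assumed column names (siteID, label, brain_region) and a hypothetical helper count_sites(); it does not call the package function and does not reflect how the function is implemented.

```r
# Toy trial table with a site-level grouping variable (hypothetical names).
trials <- data.frame(
  siteID       = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
  label        = c("car", "face", "couch", "car", "car", "face",
                   "car", "car", "face", "face"),
  brain_region = c("V4", "V4", "V4", "IT", "IT", "IT", "IT", "IT", "IT", "IT")
)

# Analogous to label_levels: keep only the levels of interest.
keep_levels <- c("car", "face")
trials <- trials[trials$label %in% keep_levels, ]
# A factor with fixed levels keeps every level as a column in each group.
trials$label <- factor(trials$label, levels = keep_levels)

# Hypothetical helper: number of sites with at least k repetitions per level.
count_sites <- function(df, k) {
  reps <- table(df$siteID, df$label)
  colSums(reps >= k)
}

# Analogous to site_info_grouping_name: count separately per brain region.
by_region <- sapply(split(trials, trials$brain_region), count_sites, k = 2)

print(by_region)
```

In this toy data, two IT sites have at least two "car" repetitions but only one has two "face" repetitions, while the single V4 site has neither, which is the kind of imbalance the grouped counts are meant to expose.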