This function calculates summary statistics for sequence lengths, where the length of a sequence is the number of elements it contains: the mean, standard deviation, median, minimum, and maximum. It also counts the number of distinct elements and compares that count with the count obtained from shuffled sequences.
Usage
sequence_length_summary(sequences)
Value
A data frame with the following columns:
mean_seq_elements
The mean length of the sequences.
sd_seq_elements
The standard deviation of the sequence lengths.
median_seq_elements
The median length of the sequences.
min_seq_elements
The minimum length of the sequences.
max_seq_elements
The maximum length of the sequences.
distinct_elements
The number of distinct elements across all sequences.
pvalue_distinct_elements
The p-value from comparing the observed number of distinct elements with the number found in shuffled sequences.
Arguments
sequences
A character vector in which each entry is one sequence, written as its elements separated by spaces.
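Illustration
The length statistics and the distinct-element count described under Value can be approximated in base R. This is a minimal sketch under assumptions, not the function's actual implementation: the input vector is made up, splitting on runs of spaces is assumed, and the shuffling comparison behind pvalue_distinct_elements is left out because this page does not specify how sequences are shuffled.

```r
# Hypothetical input: each entry is one sequence, elements separated by spaces.
sequences <- c("A B C", "A B", "C D E F")

# Split each sequence into its elements (assumes one or more spaces as separator).
elements <- strsplit(trimws(sequences), " +")

# Number of elements in each sequence.
seq_lengths <- lengths(elements)

# One-row data frame using the column names listed under Value.
# pvalue_distinct_elements is intentionally omitted (procedure not documented here).
length_summary <- data.frame(
  mean_seq_elements   = mean(seq_lengths),
  sd_seq_elements     = sd(seq_lengths),
  median_seq_elements = median(seq_lengths),
  min_seq_elements    = min(seq_lengths),
  max_seq_elements    = max(seq_lengths),
  distinct_elements   = length(unique(unlist(elements)))
)

print(length_summary)
```

With this input the sequences contain 3, 2, and 4 elements, so the sketch reports a mean and median of 3, a minimum of 2, a maximum of 4, and 6 distinct elements.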