redundancy: Calculate Observed and Expected Redundancy of Sequences

Description

Calculates the observed redundancy of a set of sequences and reports it alongside the redundancy expected by chance, estimated from shuffled versions of the same sequences. Redundancy is defined as the proportion of consecutive identical elements in a sequence.

Usage

redundancy(sequences)

Value

A data frame with the following columns:

redundancy: The observed redundancy in the original sequences. This is the mean proportion of consecutive identical elements across all sequences.
redundancy_expected_across: The expected redundancy obtained from sequences where elements have been shuffled across the sequences.
redundancy_expected_within: The expected redundancy obtained from sequences where elements have been shuffled within each sequence.

Arguments

sequences: A vector of character strings, where each string represents a sequence of elements separated by spaces.
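
To illustrate the input format, the short sketch below splits each string into its elements with base R. It assumes single spaces as separators; how `redundancy` itself treats repeated or leading spaces is not documented here.

```r
# Each string is one sequence; elements are separated by single spaces.
sequences <- c("A A B C C", "B A A C C", "A B C C C")

# Split every sequence into a character vector of elements.
elements <- strsplit(sequences, " ", fixed = TRUE)

elements[[1]]     # "A" "A" "B" "C" "C"
lengths(elements) # 5 5 5
```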

Details

Redundancy is calculated for each sequence as the proportion of consecutive identical elements, and the observed value is the mean of these proportions across all sequences. Two expected values are reported for comparison: one from sequences whose elements have been shuffled across sequences, and one from sequences whose elements have been shuffled within each sequence. The shuffled sequences are generated by the auxiliary functions `shuffle_sequences_across` and `shuffle_sequences_within`.
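
The calculation described above can be sketched in base R as follows. This is not the package's implementation. It assumes that "consecutive identical elements" means the proportion of adjacent pairs in which both elements are equal, and that "across" shuffling pools all elements and redistributes them while keeping the original sequence lengths. The names `adjacent_repeat_proportion`, `shuffle_within_sketch`, `shuffle_across_sketch`, `redundancy_sketch`, and the `n_shuffles` argument are hypothetical and exist only for this illustration.

```r
# Hypothetical helper: proportion of adjacent pairs whose two elements match.
# Assumption: this is what "consecutive identical elements" refers to.
adjacent_repeat_proportion <- function(sequence) {
  elements <- strsplit(sequence, " ", fixed = TRUE)[[1]]
  if (length(elements) < 2) return(NA_real_)  # no adjacent pairs to compare
  mean(elements[-1] == elements[-length(elements)])
}

# Hypothetical stand-in for shuffle_sequences_within:
# permute the elements of each sequence independently.
shuffle_within_sketch <- function(sequences) {
  vapply(sequences, function(s) {
    elements <- strsplit(s, " ", fixed = TRUE)[[1]]
    paste(elements[sample.int(length(elements))], collapse = " ")
  }, character(1), USE.NAMES = FALSE)
}

# Hypothetical stand-in for shuffle_sequences_across:
# pool all elements, permute them, and refill sequences of the original lengths.
shuffle_across_sketch <- function(sequences) {
  split_elements <- strsplit(sequences, " ", fixed = TRUE)
  original_lengths <- lengths(split_elements)
  pooled <- unlist(split_elements)
  pooled <- pooled[sample.int(length(pooled))]
  groups <- factor(
    rep(seq_along(original_lengths), times = original_lengths),
    levels = seq_along(original_lengths)
  )
  vapply(split(pooled, groups), paste, character(1),
         collapse = " ", USE.NAMES = FALSE)
}

# Assemble a one-row data frame with the three documented column names.
# Assumption: expected values are averaged over n_shuffles random shuffles.
redundancy_sketch <- function(sequences, n_shuffles = 100) {
  mean_redundancy <- function(seqs) {
    mean(vapply(seqs, adjacent_repeat_proportion, numeric(1)), na.rm = TRUE)
  }
  data.frame(
    redundancy = mean_redundancy(sequences),
    redundancy_expected_across = mean(replicate(
      n_shuffles, mean_redundancy(shuffle_across_sketch(sequences)))),
    redundancy_expected_within = mean(replicate(
      n_shuffles, mean_redundancy(shuffle_within_sketch(sequences))))
  )
}

set.seed(42)  # shuffling is random; fix the seed for reproducibility
redundancy_sketch(c("A A B C C", "B A A C C", "A B C C C"))
```

Under the adjacent-pair assumption, each of the three example sequences has two matching pairs out of four, so the observed value in this sketch is 0.5. The two expected values depend on the random shuffles and on `n_shuffles`, and the real `redundancy` function may differ in both the definition and the number of shuffles it uses.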

Examples

# Example sequences
sequences <- c("A A B C C", "B A A C C", "A B C C C")
# Compute redundancy
redundancy(sequences)
