A dataset containing DNA sequences from test bacteria with detailed annotation metadata. The first column combines multiple annotation elements separated by semicolons.
seq_data
A data frame with multiple rows and 2 variables:
First column. Character. Combined annotation fields separated by semicolons, containing:
ID: Sequence identifier (e.g., "1_7")
partial: Completion status ("00" for complete, "01" for partial)
start_type: Translation initiation codon (e.g., "GTG", "ATG")
rbs_motif: Ribosome binding site motif (e.g., "GGAG/GAGG")
rbs_spacer: RBS spacer length (e.g., "5-10bp")
gc_cont: GC content (e.g., "0.673")
Second column. Character. DNA sequence (when available) in FASTA format
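
Because the annotation column packs six fields into one string, it usually has to be split before individual fields can be filtered or compared. The following Python sketch shows one way to do that. It assumes each field is stored as a `key=value` pair (for example `ID=1_7`), which the description above does not confirm. The function name `parse_annotation` is hypothetical, and the example string is assembled from the example values listed above rather than taken from the dataset. The same split could be done in R with `strsplit()`.

```python
def parse_annotation(annotation: str) -> dict:
    """Split a semicolon-separated annotation string into a field -> value dict.

    Assumption: each field looks like "key=value". Adjust the separator
    if the real data uses a different layout.
    """
    fields = {}
    for part in annotation.strip().split(";"):
        if not part:
            continue  # tolerate a trailing semicolon
        key, sep, value = part.partition("=")
        if not sep:
            continue  # skip fragments that are not key=value pairs
        fields[key.strip()] = value.strip()
    return fields


# Illustrative string built from the example values in the field list.
example = (
    "ID=1_7;partial=00;start_type=GTG;"
    "rbs_motif=GGAG/GAGG;rbs_spacer=5-10bp;gc_cont=0.673"
)

parsed = parse_annotation(example)

# "00" marks a complete sequence, "01" a partial one.
is_complete = parsed["partial"] == "00"

# GC content is stored as text, so convert it before numeric comparisons.
gc_content = float(parsed["gc_cont"])
```

All values are kept as strings here because codes such as `partial` ("00", "01") would lose their leading zero if converted to numbers. Only `gc_cont` is converted, and only at the point of use.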