bioseq: A Toolbox for Manipulating Biological Sequences in R

bioseq provides a collection of classes and functions for manipulating biological sequences in R. Its simple S3 classes for sequences can be stored as columns of a data frame and analysed with the dplyr grammar and other tidyverse tools.
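
As a minimal sketch of what "suitable for inclusion in a data frame" can look like in practice, the example below stores a DNA vector as a tibble column and filters it with dplyr. The sample names and sequences are made up for illustration, and the tibble and dplyr packages are assumed to be installed alongside bioseq.

```r
library(bioseq)
library(tibble)
library(dplyr)

# Hypothetical sample names and toy sequences, for illustration only
seqs <- tibble(
  sample   = c("s1", "s2", "s3"),
  sequence = dna("ACGTTAGC", "ATGCCC", "GGGTTTAAA")
)

# Add a column with the sequence lengths, then keep the longer sequences
long_seqs <- seqs %>%
  mutate(length = seq_nchar(sequence)) %>%
  filter(length > 6)

long_seqs
```

Because the sequence column is an ordinary vector from the tibble's point of view, the usual dplyr verbs apply to it without any special handling.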

Installation

You can install the development version of bioseq from GitHub with:

remotes::install_github("fkeck/bioseq")

Tutorials

There are two vignettes available to get started with the package:

  • Introduction to the bioseq package
  • Cleaning and exploring NCBI data with the bioseq package

Citation

If you use bioseq, please cite: Keck F. (2020). Handling biological sequences in R with the bioseq package. Methods in Ecology and Evolution. doi:10.1111/2041-210X.13490

Install

install.packages('bioseq')

Monthly Downloads: 505
Version: 0.1.4
License: GPL-3
Maintainer: Francois Keck
Last Published: September 6th, 2022

Functions in bioseq (0.1.4)

  • as_seqinr_alignment: Coerce to seqinr alignment
  • bioseq-package: bioseq: A Toolbox for Manipulating Biological Sequences
  • as_dna: Coercion to DNA vector
  • new_rna: RNA vector constructor
  • vec_ptype2.bioseq_aa: Internal
  • is_aa: Test if the object is an amino acid vector
  • new_aa: Amino acid (AA) vector constructor
  • new_dna: DNA vector constructor
  • pillar_shaft.bioseq_aa: Internal formatting
  • read_fasta: Read sequences in FASTA format
  • rev_complement: Reverse and complement sequences
  • pillar_shaft.bioseq_rna: Internal formatting
  • dna: Build a DNA vector
  • dic_genetic_codes: Genetic code tables
  • is_rna: Test if the object is an RNA vector
  • is_dna: Test if the object is a DNA vector
  • pillar_shaft.bioseq_dna: Internal formatting
  • seq-replace: Replace matched patterns in sequences
  • as_rna: Coercion to RNA vector
  • seq_crop_position: Crop sequences between two positions
  • seq_detect_pattern: Detect the presence of patterns in sequences
  • seq_count_pattern: Count the number of matches in sequences
  • seq_crop_pattern: Crop sequences using delimiting patterns
  • seq_extract_pattern: Extract matching patterns from sequences
  • seq_cluster: Cluster sequences by similarity
  • seq_disambiguate_IUPAC: Disambiguate biological sequences
  • validate_seq: Sequence validator
  • write_fasta: Write sequences in FASTA format
  • seq_combine: Combine multiple sequences
  • seq_extract_position: Extract a region between two positions in sequences
  • seq_nchar: Count the number of characters in sequences
  • seq_consensus: Find a consensus sequence for a set of sequences
  • seq_translate: Translate DNA/RNA sequences into amino acids
  • seq_remove_position: Remove a region between two positions in sequences
  • seq_replace_position: Replace a region between two positions in sequences
  • transcription: Transcribe DNA, reverse-transcribe RNA
  • rna: Build an RNA vector
  • seaview: SeaView: DNA sequences and phylogenetic tree viewer
  • seq_nseq: Number of sequences in a vector
  • seq_remove_pattern: Remove matched patterns in sequences
  • seq_stat_gc: Compute G+C content
  • seq_stat_prop: Compute proportions for characters
  • seq_spellout: Spell out sequences
  • seq_rev_translate: Reverse translate amino acid sequences
  • seq_split_kmer: Split sequences into k-mers
  • seq_split_pattern: Split sequences
  • aliview: AliView: DNA sequences viewer
  • as-tibble-ape: Convert DNAbin/AAbin to tibble
  • as_AAbin: Coerce to AAbin
  • as_DNAbin: Coerce to DNAbin
  • aa: Build an amino acid (AA) vector
  • as_aa: Coercion to an amino acid (AA) vector
  • as_DNAbin.tbl_df: Coerce tibble to DNAbin
  • alphabets: Biological alphabets
  • as-tibble-bioseq: Convert bioseq DNA, RNA and AA to tibble
  • as_AAbin.tbl_df: Coerce tibble to AAbin
  • genetic-codes: Genetic code tables
  • fragilaria: DNA sequences (rbcL) for various Fragilaria
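
A few of the functions indexed above can be combined as in the following minimal sketch. The input is a made-up toy sequence, not real data. The names seq_rev_complement() and seq_transcribe() are assumed to be the functions documented under the rev_complement and transcription topics, and seq_translate() is assumed to use the standard genetic code by default.

```r
library(bioseq)

# Toy coding sequence, made up for illustration: ATG AAA GGG
x <- dna("ATGAAAGGG")

# Reverse complement of the DNA sequence
rc <- seq_rev_complement(x)

# Transcribe DNA into RNA (T is replaced by U)
r <- seq_transcribe(x)

# Translate the DNA sequence into amino acids (default genetic code)
p <- seq_translate(x)

# Basic summaries: number of sequences, their lengths, and G+C content
seq_nseq(x)
seq_nchar(x)
seq_stat_gc(x)
```

Each step returns a bioseq vector of the matching class (DNA, RNA or AA), so the results can be tested with is_dna(), is_rna() and is_aa() or passed on to other functions in the package.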