bioseq (version 0.1.1)

seq_crop_pattern: Crop sequences using delimiting patterns

Description

Crop sequences using delimiting patterns

Usage

seq_crop_pattern(x, pattern_in, pattern_out)

Arguments

x

a DNA, RNA or AA vector to be cropped.

pattern_in

patterns defining the beginning (left-side).

pattern_out

patterns defining the end (right-side).

Value

A cropped DNA, RNA or AA vector.

Patterns

It is important to understand how patterns are treated in bioseq.

Patterns are recycled along the sequences (usually the x argument). This means that if a pattern (vector or list) is shorter than x, it is replicated until it has the same length as x. The reverse is not true: a vector of patterns longer than the vector of sequences raises a warning.
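The recycling rule can be sketched in base R. This is only an illustration of the behaviour described above, not bioseq's internal code: recycle_patterns is a hypothetical helper, and truncating an over-long pattern vector after the warning is an assumption, since the documentation only states that a warning is raised.

```r
# Hypothetical helper (not part of bioseq): recycle patterns along x.
recycle_patterns <- function(pattern, x) {
  if (length(pattern) > length(x)) {
    # Assumption: warn, then drop the surplus patterns.
    warning("pattern is longer than x")
  }
  rep_len(pattern, length(x))
}

# Plain character vectors stand in for sequence vectors here.
x <- c("ACGT", "TTAA", "GGCC", "CATG", "AAAA")

# Two patterns recycled along five sequences.
recycled <- recycle_patterns(c("AA", "CC"), x)
recycled
```

In this sketch the first, third and fifth sequences are paired with "AA", and the second and fourth with "CC".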

Patterns can be DNA, RNA or AA vectors, but they must be of the same class as the sequences they are matched against. Patterns given as DNA, RNA or AA vectors are disambiguated prior to matching. For example, the pattern dna("ARG") matches AAG or AGG, because the ambiguity code R stands for A or G.
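One way to picture disambiguation is as a translation of IUPAC ambiguity codes into regular-expression character classes. The base R sketch below works under that assumption; iupac_dna and to_regex are hypothetical names, and bioseq's actual implementation may differ.

```r
# IUPAC nucleotide codes mapped to regex character classes.
iupac_dna <- c(
  A = "A", C = "C", G = "G", T = "T",
  R = "[AG]", Y = "[CT]", S = "[CG]", W = "[AT]",
  K = "[GT]", M = "[AC]", B = "[CGT]", D = "[AGT]",
  H = "[ACT]", V = "[ACG]", N = "[ACGT]"
)

# Hypothetical helper: expand one ambiguous DNA pattern into a regex.
to_regex <- function(pattern) {
  codes <- strsplit(pattern, "")[[1]]
  paste0(iupac_dna[codes], collapse = "")
}

regex <- to_regex("ARG")
regex

# AAG and AGG match; ACG does not.
matches <- grepl(regex, c("AAG", "AGG", "ACG"))
matches
```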

Alternatively, patterns can be a simple character vector containing regular expressions.

Vectors of patterns (DNA, RNA, AA or regex) can also be provided in a list. In that case, each vector of the list is collapsed prior to matching, which means that each element of a vector is used as an alternative pattern. For example, the pattern list(c("AAA", "CCC"), "GG") will match AAA or CCC in the first sequence, GG in the second sequence, AAA or CCC in the third, and so on, following the recycling rule.
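The list behaviour can be sketched in base R as follows. Joining the alternatives with "|" is an assumption about how the collapse could be realised, and collapse_patterns is a hypothetical helper rather than a bioseq function.

```r
# Hypothetical helper: collapse each list element into one regex
# in which the vector elements are alternatives.
collapse_patterns <- function(pattern_list) {
  vapply(pattern_list, function(p) paste0(p, collapse = "|"), character(1))
}

patterns <- collapse_patterns(list(c("AAA", "CCC"), "GG"))
patterns

# Plain character vectors stand in for sequence vectors here.
x <- c("TTCCCTT", "TTAAATT", "TAAAT")

# Recycle the two collapsed patterns along the three sequences,
# then test each sequence against its own pattern.
recycled <- rep_len(patterns, length(x))
hits <- mapply(grepl, recycled, x, USE.NAMES = FALSE)
hits
```

The second sequence contains AAA but is tested against GG only, so it does not match in this sketch; patterns are paired with sequences by position.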

See Also

stri_extract from stringi and str_extract from stringr for the underlying implementation.

Other string operations: seq-replace, seq_combine, seq_count_pattern, seq_crop_position, seq_detect_pattern, seq_extract_pattern, seq_extract_position, seq_remove_pattern, seq_remove_position, seq_replace_position, seq_split_kmer, seq_split_pattern

Examples

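# NOT RUN {
library(bioseq)

# Two DNA sequences; only the first contains both delimiting patterns.
x <- dna("ACGTTAAAAAGTGTAGCCCCCGT", "CTCGAAATGA")
# Crop each sequence between the left-side and right-side patterns.
seq_crop_pattern(x, pattern_in = "AAAA", pattern_out = "CCCC")
# }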