alakazam (version 0.2.2)

maskSeqEnds: Masks ragged leading and trailing edges of aligned DNA sequences

Description

maskSeqEnds takes a vector of DNA sequences, as character strings, and replaces the leading and trailing characters with "N" characters to create a sequence vector with uniformly masked outer sequence segments.

Usage

maskSeqEnds(seq, max_mask = NULL, trim = FALSE)

Arguments

seq
a character vector of DNA sequence strings.
max_mask
the maximum number of characters to mask. If set to 0 then no masking will be performed. If set to NULL then the upper masking bound will be automatically determined from the maximum number of observed leading or trailing "N" characters.
trim
if TRUE, leading and trailing characters will be cut rather than masked with "N" characters.
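
To make the automatic bound for max_mask = NULL more concrete, the base R sketch below counts the run of "N" characters at each end of every sequence and takes the maximum. It is an illustration only, not alakazam's source code: the helper name count_n_ends is hypothetical, and handling the leading and trailing bounds separately is an assumption based on the argument description above.

```r
# Hypothetical helper (not part of alakazam): count the "N" runs at both ends.
count_n_ends <- function(seq) {
    len <- nchar(seq)
    # Removing the leading (or trailing) run of Ns shortens the string by the run length
    leading <- len - nchar(sub("^N+", "", seq))
    trailing <- len - nchar(sub("N+$", "", seq))
    list(leading = leading, trailing = trailing)
}

dna <- c("CCCCTGGG", "NAACTGGN", "NNNCTGNN")
ends <- count_n_ends(dna)

# Assumed bounds: the longest observed run on each side
left_bound <- max(ends$leading)    # 3
right_bound <- max(ends$trailing)  # 2

# Assumed effect of a user-supplied cap; min() ignores NULL, so NULL leaves the bound unchanged
capped_left <- min(left_bound, 1)  # 1
```

Under these assumptions, a cap smaller than the longest observed run limits how far masking extends into each sequence.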

Value

  • A modified seq vector with masked (or optionally trimmed) sequences.
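
The difference between the masked and trimmed return values can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: mask_fixed_ends is a hypothetical helper, the masking widths left and right are passed in explicitly rather than derived from the data, and every sequence is assumed to be at least left + right characters long.

```r
# Hypothetical helper (not alakazam's implementation): mask or trim fixed-width ends.
mask_fixed_ends <- function(seq, left, right, trim = FALSE) {
    len <- nchar(seq)
    # Inner segment that is left untouched in both modes
    inner <- substr(seq, left + 1, len - right)
    if (trim) {
        # Trimming drops the outer segments, so the result is shorter
        return(inner)
    }
    # Masking keeps the original length by padding the inner segment with Ns
    paste0(strrep("N", left), inner, strrep("N", right))
}

dna <- c("CCCCTGGG", "NAACTGGN", "NNNCTGNN")
masked <- mask_fixed_ends(dna, left = 3, right = 2)
trimmed <- mask_fixed_ends(dna, left = 3, right = 2, trim = TRUE)
```

In this sketch, masked holds "NNNCTGNN" three times and trimmed holds "CTG" three times: masking preserves the alignment length, while trimming shortens every sequence by the same amount.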

See Also

Other sequence manipulation functions: collapseDuplicates, maskSeqGaps

Examples

# Load the package that provides maskSeqEnds
library(alakazam)

# Default behavior uniformly masks ragged ends
seq <- c("CCCCTGGG", "NAACTGGN", "NNNCTGNN")
maskSeqEnds(seq)

# Does nothing
maskSeqEnds(seq, max_mask=0)

# Cut ragged sequence ends
maskSeqEnds(seq, trim=TRUE)

# Set max_mask to limit extent of masking and trimming
maskSeqEnds(seq, max_mask=1)
maskSeqEnds(seq, max_mask=1, trim=TRUE)