disambiguate: Disambiguate a Nucleic Sequence

Description

Make a DNA/RNA sequence unambiguous by stripping out all symbols that do not uniquely specify nucleic acids. In other words, remove all symbols other than a's, c's, g's, t's or u's from the sequence.

Usage

# S3 method for default
disambiguate(x, case=c("lower", "upper", "as is"), ...)
# S3 method for SeqFastadna
disambiguate(x, ...)
# S3 method for list
disambiguate(x, ...)

Value

Depending on the type of x, a character vector, SeqFastadna object or list containing the completely unambiguous sequence(s) derived from x.

Arguments

x: A character vector, an object that can be coerced to a character vector, or a list of objects that can be converted to character vectors. This argument can also be a SeqFastadna object provided by the seqinr package.
case: Determines how the case of symbols in x is treated when the sequence is made unambiguous. “lower”, the default behaviour, converts all symbols to lowercase while “upper” converts them to uppercase. “as is” allows the symbols to pass unchanged so that the case of each output symbol matches that of the corresponding input symbol.
...: Arguments to be passed from or to other functions.
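
The effect of the case argument can be illustrated with a small base R sketch. The helper strip_ambiguous below is hypothetical and is not the package's implementation; it only mimics the documented behaviour, and it assumes that “as is” keeps nucleotide symbols of either case.

```r
# Hypothetical helper (not part of the package): mimics the documented
# effect of `case` using base R only.
strip_ambiguous <- function(x, case = c("lower", "upper", "as is")) {
  case <- match.arg(case)
  x <- as.character(x)
  # Apply the requested case conversion first.
  x <- switch(case,
    "lower" = tolower(x),
    "upper" = toupper(x),
    "as is" = x
  )
  # Assumption: symbols of either case count as unambiguous, so only
  # characters other than a, c, g, t, u (upper or lower) are removed.
  gsub("[^acgtuACGTU]", "", x)
}

lower_out <- strip_ambiguous("ACGTNNRYacgu")                  # "acgtacgu"
upper_out <- strip_ambiguous("ACGTNNRYacgu", case = "upper")  # "ACGTACGU"
asis_out  <- strip_ambiguous("ACGTNNRYacgu", case = "as is")  # "ACGTacgu"
```

With “as is”, the mixed case of the input survives in the output, whereas the other two settings normalise it.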

Author

Andrew Hart and Servet Martínez

Details

If x is a SeqFastadna object or a character vector in which each element is a single nucleobase, then it represents a single sequence. It will be made unambiguous and returned in the same form.

On the other hand, if x is a vector of character strings, each of which represents a nucleic sequence, then the result will be a character vector in which each element contains the unambiguous sequence corresponding to the element in x as a character string.
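
The two input forms described above can be sketched in base R as follows. This is an illustration of the expected shapes only, assuming lowercase input; it does not call disambiguate itself, and the variable names are made up for the example.

```r
# Symbols that uniquely specify a nucleic acid (lowercase only here).
unambiguous <- c("a", "c", "g", "t", "u")

# Form 1: a single sequence stored as one nucleobase per element.
# Ambiguous elements are dropped, so the result may be shorter.
bases <- c("a", "c", "n", "g", "t", "r", "u")
clean_bases <- bases[bases %in% unambiguous]
# clean_bases is c("a", "c", "g", "t", "u")

# Form 2: several sequences, one character string per element.
# Each string is cleaned separately, so the result has the same
# length as the input vector.
seqs <- c("acgtnnacgt", "ggrycc")
clean_seqs <- gsub("[^acgtu]", "", seqs)
# clean_seqs is c("acgtacgt", "ggcc")
```

In the first form the number of elements shrinks, while in the second the number of elements is preserved and only the strings themselves get shorter.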