tidysq (version 1.2.3)

remove_na: Remove sequences that contain NA values

Description

This function either replaces sequences that contain NA values with empty (NULL) sequences, or removes the NA values from sequences in an sq object.

Usage

remove_na(x, by_letter = FALSE, ...)

# S3 method for sq
remove_na(x, by_letter = FALSE, ..., NA_letter = getOption("tidysq_NA_letter"))

Value

An sq object of the same type as the input. Sequences that do not contain any NA values are left unchanged.

Arguments

x

[sq]
An object this function is applied to.

by_letter

[logical(1)]
If FALSE, the filter condition is applied to each sequence as a whole. If TRUE, the filter is applied to each letter separately.

...

further arguments to be passed from or to other methods.

NA_letter

[character(1)]
A string used to interpret and display NA values in the context of the sq class. The default value is "!".

Details

NA values may be introduced by functions such as substitute_letters or bite. They can also appear when the user reads a FASTA file with read_fasta, or constructs an sq object from a character vector with the sq function, without safe_mode turned on, and the file or strings contain letters other than those specified in the alphabet.

remove_na() filters out sequences or elements that contain NA values. By default, if any letter in a sequence is NA, the whole sequence is replaced by an empty (NULL) sequence. However, if the by_letter parameter is set to TRUE, sequences are only shortened by excluding NA values.
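
The two modes described above can be illustrated with a minimal base-R sketch that does not depend on tidysq. It assumes sequences are stored as plain character vectors of letters, with NA marking an unknown letter; remove_na_sketch is a hypothetical helper written for this illustration and is not how tidysq implements remove_na().

```r
# Base-R illustration of the two filtering modes (not tidysq's implementation).
# Assumption: each sequence is a character vector of single letters,
# with NA standing for an unknown letter.
seqs <- list(
  c("M", "I", NA, "A"),
  c("T", "I", "A"),
  c(NA, NA)
)

# Hypothetical helper, for illustration only.
remove_na_sketch <- function(x, by_letter = FALSE) {
  lapply(x, function(s) {
    if (by_letter) {
      s[!is.na(s)]      # drop only the NA letters, keep the rest
    } else if (anyNA(s)) {
      character(0)      # replace the whole sequence with an empty one
    } else {
      s                 # no NA present: leave the sequence unchanged
    }
  })
}

whole <- remove_na_sketch(seqs)
letters_only <- remove_na_sketch(seqs, by_letter = TRUE)

lengths(whole)         # 0 3 0
lengths(letters_only)  # 3 3 0
```

In both modes the number of sequences stays the same; only their contents change. A sequence consisting solely of NA letters ends up empty either way.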

See Also

sq

Functions that clean sequences: is_empty_sq(), remove_ambiguous()

Examples

# Creating objects to work on:
sq_ami <- sq(c("MIAANYTWIL", "TIAALGNIIYRAIE", "NYERTGHLI", "MAYXXXIALN"),
             alphabet = "ami_ext")
sq_dna <- sq(c("ATGCAGGA", "GACCGAACGAN", "TGACGAGCTTA", "ACTNNAGCN"),
             alphabet = "dna_ext")

# Substituting some letters with NA
sq_ami_sub <- substitute_letters(sq_ami, c(E = NA_character_, R = NA_character_))
sq_dna_sub <- substitute_letters(sq_dna, c(N = NA_character_))

# Biting sequences out of range (indices beyond a sequence's length yield NA)
sq_bitten <- bite(sq_ami, 1:15)

# Printing the sequences
sq_ami_sub
sq_dna_sub

# Removing sequences containing NA
remove_na(sq_ami_sub)
remove_na(sq_dna_sub)
remove_na(sq_bitten)

# Removing only NA elements
remove_na(sq_ami_sub, by_letter = TRUE)
remove_na(sq_dna_sub, TRUE)
remove_na(sq_bitten, TRUE)
