bioseq (version 0.1.4)

validate_seq: Sequence validator

Description

Validate character strings before sequence construction.

Usage

validate_seq(x, alphabet, invalid_replacement, type = "DNA")

Value

A character vector.

Arguments

x

a character vector.

alphabet

a character vector defining the sequence alphabet.

invalid_replacement

a character used to replace invalid characters.

type

type of sequence ("DNA", "RNA", "AA"). It is only used to provide more informative warning messages.

Details

Validation steps:

  1. Check that x is a character vector; fail if not.

  2. Force alpha characters to uppercase

  3. Delete blank characters (spaces and tabs)

  4. Delete line breaks

  5. Convert . (dots) to - (as both can represent a gap)

  6. Replace invalid characters with invalid_replacement, e.g. N or X (with a warning).
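
The steps above can be sketched in base R as follows. This is an illustrative re-implementation, not the bioseq source: the function name sketch_validate_seq, the shortened DNA alphabet and the warning text are all assumptions made for the example, and NA values are not handled.

```r
# Hypothetical sketch of the validation steps; not the actual bioseq code.
sketch_validate_seq <- function(x, alphabet, invalid_replacement, type = "DNA") {
  # Step 1: fail unless x is a character vector.
  if (!is.character(x)) {
    stop("x must be a character vector.")
  }
  # Step 2: force alphabetic characters to uppercase.
  x <- toupper(x)
  # Steps 3 and 4: delete spaces, tabs and line breaks.
  x <- gsub("[ \t\r\n]", "", x)
  # Step 5: convert dots to dashes (both can represent a gap).
  x <- gsub(".", "-", x, fixed = TRUE)
  # Step 6: replace every character outside the alphabet and count them.
  n_invalid <- 0L
  chars <- strsplit(x, "", fixed = TRUE)
  out <- vapply(chars, function(ch) {
    bad <- !(ch %in% alphabet)
    n_invalid <<- n_invalid + sum(bad)
    ch[bad] <- invalid_replacement
    paste(ch, collapse = "")
  }, character(1), USE.NAMES = FALSE)
  if (n_invalid > 0L) {
    # 'type' is only used to make the warning more informative.
    warning(n_invalid, " invalid ", type, " character(s) replaced with ",
            invalid_replacement, call. = FALSE)
  }
  out
}

# Assumed, simplified alphabet for the example (no IUPAC ambiguity codes).
dna_alphabet <- c("A", "C", "G", "T", "N", "-")

# The second string contains a dot, a line break and an invalid "Z",
# so this call also emits a warning.
res <- sketch_validate_seq(c("ac gt", "A.C\nZ"), dna_alphabet, "N")
print(res)
```

With the assumed alphabet, the first string becomes "ACGT" and the second becomes "A-CN". Replacing invalid characters rather than failing keeps the sequence length intact, at the cost of silently masking typos unless the warning is read.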