
PhyInsight (version 0.1.0)

rmBadStrings_2: Remove Mismatched DNA Sequences

Description

Identifies and removes bad (mismatched) DNA sequences within a string set. Mismatched sequences are removed one at a time until all remaining sequences align.

Usage

rmBadStrings_2(
  DNAStringSet,
  specimen_dataframe,
  rmOutliers = FALSE,
  max_Z_score = 3
)

Value

A list with two elements: the DNA string set with the mismatched sequences removed (1st element) and the specimen dataframe with data for the mismatched sequences removed (2nd element).

Arguments

DNAStringSet

A DNA string set object.

specimen_dataframe

A dataframe with specimen data created using querySpecData().

rmOutliers

A logical value indicating whether to also remove strings that are DNA distance outliers. Defaults to FALSE.

max_Z_score

A numeric value setting the maximum Z score allowed when removing outliers. Defaults to 3.
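
To make the role of max_Z_score more concrete, here is a minimal base R sketch of Z-score thresholding. It is not PhyInsight's implementation: the helper flag_outliers(), the use of a per-sequence mean distance, the absolute-value comparison, and the toy values are all assumptions made for illustration only.

```r
# Toy per-sequence mean DNA distances (illustrative values, not real data):
# 20 typical sequences plus one clearly divergent sequence.
mean_dist <- c(rep(c(0.010, 0.012, 0.011, 0.009), 5), 0.250)
names(mean_dist) <- paste0("seq", seq_along(mean_dist))

# Hypothetical helper (not part of PhyInsight): flag values whose
# absolute Z score exceeds the chosen threshold.
flag_outliers <- function(x, max_Z_score = 3) {
  z <- (x - mean(x)) / sd(x)  # standardise using the sample mean and sd
  abs(z) > max_Z_score
}

is_outlier <- flag_outliers(mean_dist, max_Z_score = 3)

# Keep only the sequences that were not flagged
kept <- mean_dist[!is_outlier]

# Names of the flagged sequences
names(mean_dist)[is_outlier]
```

Under this sketch, lowering max_Z_score makes the filter stricter (more strings flagged), while raising it makes the filter more permissive.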

Examples

# remove problem strings from a DNA string set
specdata <- querySpecData("Panthera leo")

specdata <- subset(specdata, markercode == "COI-5P")

DNABin_Leo <- genDNABin(specdata)

DNAStringset_Leo <- genDNAStringSet(DNABin_Leo)

DNAStringSet_Leo_manipulated <- ManipStringSet(DNAStringset_Leo)

StringsAndSpecdataframe <- rmBadStrings_2(
  DNAStringSet = DNAStringSet_Leo_manipulated,
  specimen_dataframe = specdata
)

DNAStringSet_NEW <- StringsAndSpecdataframe[[1]]

tail(DNAStringSet_NEW)

specimen_dataframe_NEW <- StringsAndSpecdataframe[[2]]

tail(specimen_dataframe_NEW$processid)
