Remove or replace gaps or other irregular characters in protein sequences, to make them suitable for feature extraction or sequence-alignment-based similarity computation.
removeGaps(x, pattern = "-", replacement = "", ...)
a vector of protein sequence(s) with gaps or irregular characters removed/replaced.
x: character vector containing the input protein sequence(s).
pattern: character string containing the gap (or other irregular) character to be removed or replaced. Default is "-". For advanced usage, see gsub.
replacement: a replacement for matched characters. Default is "" (remove the matched character).
...: additional parameters for gsub.
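Because the arguments above are described in terms of gsub, a minimal base-R sketch may help show what different `pattern` and `replacement` values would do. It calls `gsub` directly rather than `removeGaps`, on the assumption that the two behave alike for these inputs. The sequences and variable names are hypothetical and chosen only for illustration.

```r
# Hypothetical sequences containing gaps ("-") and other irregular characters
seqs <- c("MHGD-TPTL", "MH.GD*TPTL-")

# Remove gaps only, mirroring the defaults pattern = "-" and replacement = ""
no_gaps <- gsub("-", "", seqs)

# Remove several irregular characters at once with a regex character class
cleaned <- gsub("[-.*]", "", seqs)

# Replace each gap with a placeholder character instead of deleting it
masked <- gsub("-", "X", seqs)

# fixed = TRUE treats the pattern as a literal string, so "." matches only a dot
no_dots <- gsub(".", "", seqs, fixed = TRUE)

print(cleaned)
```

Extra gsub options such as `fixed = TRUE` are the kind of additional parameter that `...` is documented to accept.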
Nan Xiao <https://nanx.me>
# amino acid sequences that contain gaps ("-")
aaseq <- list(
"MHGDTPTLHEYMLDLQPETTDLYCYEQLSDSSE-EEDEIDGPAGQAEPDRAHYNIVTFCCKCDSTLRLCVQS",
"MHGDTPTLHEYMLDLQPETTDLYCYEQLNDSSE-EEDEIDGPAGQAEPDRAHYNIVTFCCKCDSTLRLCVQS"
)
if (FALSE) {
# gaps create issues for alignment
parSeqSim(aaseq)
# remove the gaps
nogapseq <- removeGaps(aaseq)
parSeqSim(nogapseq)
}