BioMedR (version 1.2.1)

parSeqSim: Parallelized Protein/DNA Sequence Similarity Calculation Based on Sequence Alignment

Description

Parallelized protein/DNA sequence similarity calculation based on sequence alignment.

Usage

parSeqSim(protlist, type = "local", submat = "BLOSUM62")

Arguments

protlist

A list of length n containing n protein (or DNA) sequences. Each component of the list is a character string storing one sequence. Unknown sequences should be represented as ''.

type

Type of alignment, either 'global' (Needleman-Wunsch global alignment) or 'local' (Smith-Waterman local alignment). Default is 'local'.

submat

Substitution matrix, one of 'BLOSUM45', 'BLOSUM50', 'BLOSUM62', 'BLOSUM80', 'BLOSUM100', 'PAM30', 'PAM40', 'PAM70', 'PAM120', 'PAM250'. Default is 'BLOSUM62'.
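
To make the 'global' versus 'local' distinction in type concrete, here is a minimal base-R sketch of the two scoring schemes. It is not BioMedR's implementation: the function align_score and its match_score, mismatch and gap parameters are hypothetical, and a flat match/mismatch score with a linear gap penalty stands in for a substitution matrix such as BLOSUM62.

```r
# Hypothetical helper (not part of BioMedR): alignment score only,
# using a flat match/mismatch score and a linear gap penalty.
align_score <- function(a, b, type = c("local", "global"),
                        match_score = 1, mismatch = -1, gap = -2) {
  type <- match.arg(type)
  x <- strsplit(a, "")[[1]]
  y <- strsplit(b, "")[[1]]
  n <- length(x)
  m <- length(y)
  H <- matrix(0, n + 1, m + 1)
  if (type == "global") {
    # Needleman-Wunsch: leading gaps are penalised
    H[, 1] <- gap * (0:n)
    H[1, ] <- gap * (0:m)
  }
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      s <- if (x[i] == y[j]) match_score else mismatch
      best <- max(H[i, j] + s,        # align x[i] with y[j]
                  H[i, j + 1] + gap,  # gap in y
                  H[i + 1, j] + gap)  # gap in x
      # Smith-Waterman: scores are floored at 0 so an alignment can restart
      H[i + 1, j + 1] <- if (type == "local") max(0, best) else best
    }
  }
  # Global: score of aligning both full sequences; local: best cell anywhere
  if (type == "local") max(H) else H[n + 1, m + 1]
}

g <- align_score("ACGT", "AGT", type = "global")
l <- align_score("TTACGTTT", "GGACGGG", type = "local")
print(c(global = g, local = l))
```

The global score covers both sequences end to end, while the local score reflects only the best-matching region, which is why 'local' tends to suit sequences that share a domain but differ elsewhere.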

Value

An n x n similarity matrix.

Details

This function implements a parallelized version of the protein/DNA sequence similarity calculation based on sequence alignment.
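
As a rough illustration of why this calculation parallelizes well, the sketch below builds an n x n similarity matrix from independent pairwise scores in base R. Everything in it is an assumption for illustration: kmers, toy_score and pairwise_sim are hypothetical names, the shared 2-mer count is only a stand-in for an alignment score, and the normalization s12 / sqrt(s11 * s22), the zero similarity for empty strings and the unit diagonal are conventions chosen for this sketch, not confirmed details of parSeqSim.

```r
# Hypothetical toy score: number of distinct 2-mers shared by two strings.
# This is NOT an alignment score; it only stands in for one.
kmers <- function(s, k = 2) {
  starts <- seq_len(max(0, nchar(s) - k + 1))
  unique(substring(s, starts, starts + k - 1))
}
toy_score <- function(a, b) length(intersect(kmers(a), kmers(b)))

# Build a symmetric n x n similarity matrix from all unordered pairs.
pairwise_sim <- function(seqs, score = toy_score) {
  n <- length(seqs)
  sim <- diag(n)  # diagonal set to 1 by convention in this sketch
  if (n < 2) return(sim)
  pairs <- combn(n, 2, simplify = FALSE)
  # Each pair is independent of the others, so this lapply() could be
  # swapped for parallel::mclapply() or a foreach loop.
  vals <- lapply(pairs, function(p) {
    a <- seqs[[p[1]]]
    b <- seqs[[p[2]]]
    if (a == "" || b == "") return(0)  # assumption: unknown sequence -> 0
    denom <- sqrt(score(a, a) * score(b, b))
    if (denom == 0) 0 else score(a, b) / denom
  })
  for (i in seq_along(pairs)) {
    p <- pairs[[i]]
    sim[p[1], p[2]] <- vals[[i]]
    sim[p[2], p[1]] <- vals[[i]]
  }
  sim
}

toy <- list("MKTAYIAK", "MKTAYLAK", "")
m <- pairwise_sim(toy)
print(m)
```

Only the n * (n - 1) / 2 unordered pairs are scored and the results are mirrored across the diagonal, so the work per pair can be distributed across workers without any shared state.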

See Also

See twoSeqSim for sequence alignment between two protein/DNA sequences. See parGOSim for protein/DNA similarity calculation based on Gene Ontology (GO) semantic similarity.

Examples

# NOT RUN {
# Be careful when testing this since it involves parallelisation
# and might produce unpredictable results in some environments

require(BioMedR)     # provides readFASTA() and parSeqSim()
require(Biostrings)

s1 = readFASTA(system.file('protseq/P00750.fasta', package = 'BioMedR'))[[1]]
s2 = readFASTA(system.file('protseq/P08218.fasta', package = 'BioMedR'))[[1]]
s3 = readFASTA(system.file('protseq/P10323.fasta', package = 'BioMedR'))[[1]]
s4 = readFASTA(system.file('protseq/P20160.fasta', package = 'BioMedR'))[[1]]
s5 = readFASTA(system.file('protseq/Q9NZP8.fasta', package = 'BioMedR'))[[1]]
plist = list(s1, s2, s3, s4, s5)
psimmat = parSeqSim(plist, type = 'local', submat = 'BLOSUM62')
print(psimmat)

# DNA example: read the FASTA file once and take its first five sequences
dnaseqs = readFASTA(system.file('dnaseq/hs.fasta', package = 'BioMedR'))
plist1 = unname(dnaseqs[1:5])
psimmat1 = parSeqSim(plist1, type = 'local', submat = 'BLOSUM62')
print(psimmat1)
# }