BioMedR (version 1.2.1)

calcParProtSeqSim: Parallelized Protein Sequence Similarity Calculation based on Sequence Alignment

Description

Parallelized Protein Sequence Similarity Calculation based on Sequence Alignment

Usage

calcParProtSeqSim(protlist, type = "local", submat = "BLOSUM62")

Arguments

protlist

A length-n list of n protein sequences. Each component of the list is a character string storing one protein sequence. Unknown sequences should be represented as ''.

type

Type of alignment, either 'global' or 'local'. 'global' uses Needleman-Wunsch global alignment; 'local' uses Smith-Waterman local alignment. Default is 'local'.

submat

Substitution matrix, one of 'BLOSUM45', 'BLOSUM50', 'BLOSUM62', 'BLOSUM80', 'BLOSUM100', 'PAM30', 'PAM40', 'PAM70', 'PAM120', 'PAM250'. Default is 'BLOSUM62'.
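
The difference between the two values of type can be illustrated with a small base R sketch. This is not the code BioMedR runs: align_score is a hypothetical helper, and it uses a toy scoring scheme (match +2, mismatch -1, linear gap -2) in place of a real substitution matrix such as BLOSUM62. It only shows how a Needleman-Wunsch (global) score and a Smith-Waterman (local) score can diverge for the same pair of sequences.

```r
# Hypothetical helper for illustration only; not part of BioMedR.
# Toy scoring: match = +2, mismatch = -1, linear gap penalty = -2.
align_score <- function(a, b, type = c("local", "global"),
                        match = 2, mismatch = -1, gap = -2) {
  type <- match.arg(type)
  x <- strsplit(a, "")[[1]]
  y <- strsplit(b, "")[[1]]
  n <- length(x)
  m <- length(y)

  # H[i + 1, j + 1] holds the best score for prefixes x[1..i] and y[1..j].
  H <- matrix(0, n + 1, m + 1)
  if (type == "global") {
    # Global alignment pays for leading gaps; local alignment does not.
    H[, 1] <- gap * (0:n)
    H[1, ] <- gap * (0:m)
  }

  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      s <- if (x[i] == y[j]) match else mismatch
      best <- max(H[i, j] + s,        # align x[i] with y[j]
                  H[i, j + 1] + gap,  # gap in y
                  H[i + 1, j] + gap)  # gap in x
      # Local alignment may restart anywhere, so scores never drop below 0.
      H[i + 1, j + 1] <- if (type == "local") max(0, best) else best
    }
  }

  # Local: best cell anywhere. Global: the cell covering both full sequences.
  if (type == "local") max(H) else H[n + 1, m + 1]
}

local_score  <- align_score("WWACDEWW", "ACDE", type = "local")
global_score <- align_score("WWACDEWW", "ACDE", type = "global")
```

Here the shared segment ACDE dominates the local score, while the global score is also charged for the unmatched flanking residues, which is why local alignment is often preferred when sequences share only a domain.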

Value

An n x n similarity matrix.
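
As a rough picture of what an n x n similarity matrix looks like, the base R sketch below fills one from all unordered pairs of a small list. Everything in it is an assumption made for illustration: the sequences are made up, and toy_sim (a Jaccard index over 2-mers) is a hypothetical stand-in for the alignment-based score that calcParProtSeqSim computes.

```r
# Hypothetical stand-in similarity: Jaccard index over the sets of 2-mers.
# This is NOT an alignment score; it only makes the sketch self-contained.
kmer_set <- function(s, k = 2) {
  n <- nchar(s)
  if (n < k) return(character(0))
  unique(substring(s, 1:(n - k + 1), k:n))
}

toy_sim <- function(a, b) {
  ka <- kmer_set(a)
  kb <- kmer_set(b)
  u <- length(union(ka, kb))
  if (u == 0) return(0)
  length(intersect(ka, kb)) / u
}

# Made-up sequences, not real proteins.
seqs <- list("MKTAYIAKQR", "MKTAYIAKQL", "GSHMLEDPVD")
n <- length(seqs)

pairs <- combn(n, 2)   # each column is one pair (i, j) with i < j
simmat <- diag(n)      # self-similarity set to 1 on the diagonal

for (p in seq_len(ncol(pairs))) {
  i <- pairs[1, p]
  j <- pairs[2, p]
  # Similarity is symmetric, so each pair fills two cells.
  simmat[i, j] <- simmat[j, i] <- toy_sim(seqs[[i]], seqs[[j]])
}
```

Row i and column j of simmat refer to the i-th and j-th elements of the input list, so naming the list elements and copying those names to dimnames(simmat) is one way to keep the matrix readable.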

Details

This function implements a parallelized version of protein sequence similarity calculation based on sequence alignment.
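
Pairwise similarity parallelizes naturally because each pair can be scored independently. The sketch below shows one way to do that with the base parallel package. It is an assumption-laden illustration, not the BioMedR implementation: this page does not say which parallel backend the function uses, the sequences are made up, and pair_sim (fraction of identical positions) is a hypothetical placeholder for an alignment score.

```r
library(parallel)

# Made-up sequences, not real proteins.
seqs <- c("MKTAYIAKQR", "MKTAYIAKQL", "GSHMLEDPVD", "MKTAYLEDPV")

# One list element per unordered pair (i, j) with i < j.
pairs <- combn(length(seqs), 2, simplify = FALSE)

# Hypothetical placeholder score: fraction of identical positions.
# It takes everything it needs as arguments, so workers need no extra exports.
pair_sim <- function(idx, seqs) {
  a <- strsplit(seqs[idx[1]], "")[[1]]
  b <- strsplit(seqs[idx[2]], "")[[1]]
  len <- min(length(a), length(b))
  mean(a[seq_len(len)] == b[seq_len(len)])
}

# Score the pairs on two worker processes (the worker count is arbitrary).
cl <- makeCluster(2)
scores <- unlist(parLapply(cl, pairs, pair_sim, seqs = seqs))
stopCluster(cl)

# Assemble the symmetric n x n matrix from the per-pair scores.
simmat <- diag(length(seqs))
for (p in seq_along(pairs)) {
  i <- pairs[[p]][1]
  j <- pairs[[p]][2]
  simmat[i, j] <- simmat[j, i] <- scores[p]
}
```

Only the n(n - 1)/2 unordered pairs are scored, and the results are mirrored across the diagonal, which halves the work compared with filling every cell separately.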

See Also

See calcTwoProtSeqSim for sequence alignment between two protein sequences. See calcParProtGOSim for protein similarity calculation based on Gene Ontology (GO) semantic similarity.

Examples

# NOT RUN {
# Read five example protein sequences shipped with BioMedR;
# [[1]] extracts the first sequence from each FASTA file
s1 = readFASTA(system.file('protseq/P00750.fasta', package = 'BioMedR'))[[1]]
s2 = readFASTA(system.file('protseq/P08218.fasta', package = 'BioMedR'))[[1]]
s3 = readFASTA(system.file('protseq/P10323.fasta', package = 'BioMedR'))[[1]]
s4 = readFASTA(system.file('protseq/P20160.fasta', package = 'BioMedR'))[[1]]
s5 = readFASTA(system.file('protseq/Q9NZP8.fasta', package = 'BioMedR'))[[1]]
# Collect the sequences in a list and compute the 5 x 5 similarity matrix
plist = list(s1, s2, s3, s4, s5)
psimmat = calcParProtSeqSim(plist, type = 'local', submat = 'BLOSUM62')

# }