extractProtPSSM(seq, start.pos = 1L, end.pos = nchar(seq),
  psiblast.path = NULL, makeblastdb.path = NULL, database.path = NULL,
  iter = 5, silent = TRUE, evalue = 10L, word.size = NULL,
  gapopen = NULL, gapextend = NULL, matrix = "BLOSUM62",
  threshold = NULL, seg = "no", soft.masking = FALSE,
  culling.limit = NULL, best.hit.overhang = NULL,
  best.hit.score.edge = NULL, xdrop.ungap = NULL, xdrop.gap = NULL,
  xdrop.gap.final = NULL, window.size = NULL, gap.trigger = 22L,
  num.threads = 1L, pseudocount = 0L, inclusion.ethresh = 0.002)
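start.pos: Default is 1, i.e. the first amino acid of the given sequence.
end.pos: Default is nchar(seq), i.e. the last amino acid of the given sequence.
psiblast.path: Path of the psiblast program. If NCBI Blast+ was previously installed in the operating system, the path will be automatically detected.
makeblastdb.path: Path of the makeblastdb program. If NCBI Blast+ was previously installed in the system, the path will be automatically detected.
silent: Default is TRUE.
evalue: Default is 10.
matrix: Name of the scoring matrix (default is 'BLOSUM62').
seg: SEG filtering of the query sequence ('yes', 'window locut hicut', or 'no' to disable). Default is 'no'.
soft.masking: Default is FALSE.
culling.limit: Incompatible with best.hit.overhang and best.hit.score.edge.
best.hit.overhang: Incompatible with culling.limit.
best.hit.score.edge: Incompatible with culling.limit.
window.size: Use 0 to specify the 1-hit algorithm.
gap.trigger: Default is 22.
num.threads: Default is 1.
pseudocount: Default is 0.
inclusion.ethresh: Default is 0.002.

The return value is the PSSM, a matrix with end.pos - start.pos + 1 columns and 20 named rows.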
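Since the psiblast and makeblastdb paths are only detected automatically when NCBI Blast+ is installed, it can help to check for both programs before calling the function. The following is a minimal base R sketch. `find_blast_tool()` is a hypothetical helper, not part of Rcpi, and looking the tools up on the PATH with `Sys.which()` is an assumption about a reasonable check, not a description of how the package detects them.

```r
# Hypothetical helper (not part of Rcpi): look up an executable on the PATH.
# Returns the full path as a string, or NULL when the tool is not found.
find_blast_tool <- function(tool) {
  path <- unname(Sys.which(tool))
  if (nzchar(path)) path else NULL
}

psiblast_path    <- find_blast_tool("psiblast")
makeblastdb_path <- find_blast_tool("makeblastdb")

if (is.null(psiblast_path) || is.null(makeblastdb_path)) {
  # Fall back to passing psiblast.path and makeblastdb.path explicitly.
  message("BLAST+ not found on PATH; supply psiblast.path and makeblastdb.path.")
} else {
  message("psiblast: ", psiblast_path)
  message("makeblastdb: ", makeblastdb_path)
}
```

If either lookup returns NULL, the corresponding location can be supplied through the psiblast.path and makeblastdb.path arguments.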
Ye, Xugang, Guoli Wang, and Stephen F. Altschul. "An assessment of substitution scores for protein profile-profile comparison." Bioinformatics 27.24 (2011): 3356–3363.
Rangwala, Huzefa, and George Karypis. "Profile-based direct kernels for remote homology detection and fold recognition." Bioinformatics 21.23 (2005): 4239–4247.
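library('Rcpi')  # provides readFASTA() and extractProtPSSM()
# Requires NCBI Blast+ (psiblast and makeblastdb) to be installed.
# Query sequence: P00750, shipped with Rcpi
x = readFASTA(system.file('protseq/P00750.fasta', package = 'Rcpi'))[[1]]
# Copy the bundled Plasminogen FASTA file to a temporary path and use it as the database
dbpath = tempfile('tempdb', fileext = '.fasta')
invisible(file.copy(from = system.file('protseq/Plasminogen.fasta', package = 'Rcpi'), to = dbpath))
pssmmat = extractProtPSSM(seq = x, database.path = dbpath)
dim(pssmmat)  # 20 x 562 (P00750: length 562, 20 amino acids)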
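The 20 x 562 shape follows from the documented return value: 20 rows and end.pos - start.pos + 1 columns. The base R sketch below works through that arithmetic for the full sequence and for a sub-window, without running PSI-Blast. `expected_pssm_dim()` is a hypothetical helper, not part of Rcpi, and its bounds checks are an assumption about sensible input, not documented behavior of extractProtPSSM.

```r
# Hypothetical helper (not part of Rcpi): dimensions implied by the
# documented return shape (20 rows, end.pos - start.pos + 1 columns).
expected_pssm_dim <- function(seq, start.pos = 1L, end.pos = nchar(seq)) {
  # Assumed sanity checks on the window; not taken from the package.
  stopifnot(start.pos >= 1L, end.pos <= nchar(seq), start.pos <= end.pos)
  c(rows = 20L, cols = as.integer(end.pos - start.pos + 1L))
}

# Toy sequence with the same length as P00750 (562 residues)
toy <- paste(rep("A", 562), collapse = "")

expected_pssm_dim(toy)                                   # rows = 20, cols = 562
expected_pssm_dim(toy, start.pos = 10L, end.pos = 30L)   # rows = 20, cols = 21
```

The same start.pos and end.pos values could be passed to extractProtPSSM to restrict the matrix to that window.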