Usage
generative.prob(blast.output.file = NULL, read.length.file = 80,
  contig.weight.file = 1, gi.taxon.file = NULL, gen.prob.unknown = 1e-06,
  outDir = NULL, blast.default = TRUE)

generative.prob.nucl(blast.output.file = NULL, read.length.file = 80,
  contig.weight.file = 1, gi.taxon.file, gen.prob.unknown = 1e-20,
  outDir = NULL, genomeLength = NULL, blast.default = TRUE)
Arguments
blast.output.file
This is the tabular BLASTx output format for generative.prob(), while it is the tabular BLASTn output format for generative.prob.nucl(). It can either be the default output format or a specific custom output format, incorporating read length and taxon id
read.length.file
This argument can either be a file mapping each read to its length or a numerical value, representing the average read length.
contig.weight.file
This argument can be a file in which weights are assigned to reads and contigs. For unassembled reads the weight is equal to 1, while for contigs the weight should reflect the number of reads that were assembled into the contig.
gi.taxon.file
For generative.prob() this would be the 'gi_taxid_prot.dmp' taxonomy file, mapping each protein gi identifier to the corresponding taxon identifier. It can be downloaded from ftp://ftp.ncbi.nih.gov/pub/taxonomy/gi_taxid_prot.dmp.gz . For genera
gen.prob.unknown
User-defined generative probability for the unknown category. The default value for generative.prob() is 1e-06, while for generative.prob.nucl() it is 1e-20.
blast.default
logical. Is the input in the default BLAST tabular output format? The default value is TRUE.
genomeLength
This is applicable only for generative.prob.nucl(). It is a file mapping each genome/nucleotide to its respective length. The file must be tab-separated, with the nucleotide gi identifier in the first column and the corresponding sequence length in the second. I
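
The input conventions described in the arguments above can be sketched in base R. This is a minimal illustration only: every identifier, length, and weight below is invented, the object names (read.lengths, weights, genome.lengths) are hypothetical, and the two-column layout of the weights table and the absence of a header row in the genome length file are assumptions rather than documented requirements.

```r
# Hypothetical read lengths; the read names and values are made up.
read.lengths <- c(read1 = 76L, read2 = 80L, read3 = 84L)

# read.length.file may be a single number: the average read length.
avg.read.length <- mean(read.lengths)

# Weights: 1 for an unassembled read, the number of assembled reads for a contig.
# The (identifier, weight) column layout is an assumption.
weights <- data.frame(
  id     = c("read1", "read2", "contig1"),
  weight = c(1, 1, 25)
)

# genomeLength file: tab-separated, column 1 = nucleotide gi identifier,
# column 2 = sequence length. The gi values here are invented.
genome.lengths <- data.frame(
  gi     = c(123456L, 789012L),
  length = c(5386L, 48502L)
)
genome.length.path <- file.path(tempdir(), "genome_lengths.tab")

# Written without a header row or row names (the missing header is an assumption).
write.table(genome.lengths, file = genome.length.path, sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)
```

Under these assumptions, avg.read.length could be supplied as read.length.file and genome.length.path as genomeLength in a call to generative.prob.nucl().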