Usage
getDistrs(DB, bam, pbam, islandid=NULL, verbose=FALSE, nreads=4*10^6,
readLength, min.gt.freq = NULL, tgroups=5, mc.cores=1)
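As a hedged illustration of the signature above, the sketch below calls getDistrs on the toy objects K562.r1l1 and hg19DB, which are assumed here to ship with the casper package. The requireNamespace guard and the names has_casper and distrs are additions for this example only, not part of the documented interface.

```r
# Sketch only: assumes the Bioconductor package 'casper' is installed and
# provides the example objects K562.r1l1 (aligned reads) and hg19DB (annotation).
has_casper <- requireNamespace("casper", quietly = TRUE)
distrs <- NULL

if (has_casper) {
  library(casper)
  data(K562.r1l1)  # toy aligned reads, a list in the format returned by scanBam
  data(hg19DB)     # toy annotated genome of class knownGenome

  # 75 bp were sequenced on each end, so readLength = 75;
  # all other arguments are left at the defaults shown in Usage
  distrs <- getDistrs(hg19DB, bam = K562.r1l1, readLength = 75)
}
```

When the package is not available the block does nothing and distrs stays NULL.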
Arguments
DB
Annotated genome. Object of class knownGenome, as returned by procGenome.
bam
Aligned reads, as returned by scanBam. It must be a list with
elements 'qname', 'rname', 'pos' and 'mpos'. Ignored when argument pbam is
specified.
pbam
Processed BAM object of class procBam, as returned by
function procBam. Arguments bam and readLength are ignored when pbam
is specified.
islandid
IDs of the islands to be used in the read start distribution
calculations (defaults to genes with only one annotated variant).
verbose
Set to TRUE to print progress information.
nreads
To speed up computations, only the first nreads reads are used to
obtain the estimates. The default value of 4 million usually gives
highly precise estimates.
readLength
Read length in bp, e.g. in a paired-end experiment where
75bp are sequenced on each end one would set readLength=75.
min.gt.freq
The target distributions cannot be estimated with
precision for gene types that are very infrequent.
Gene types with relative frequency below min.gt.freq are merged,
e.g. min.gt.freq=0.05 means that gene types each making up less than 5% of the
genes in DB will be combined and a single read start and length distribution
will be estimated for all of them.
tgroups
As an alternative to min.gt.freq you may specify
the maximum number of distinct gene types to consider.
A separate estimate will be obtained for the tgroups gene types with
highest frequency; all others will be combined.
mc.cores
Number of cores to use for parallel processing.
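
The two grouping rules described for min.gt.freq and tgroups can be sketched in base R. This is only an illustration of the idea, not the package's internal code: the gene-type labels, their counts, the "merged" label and the variable names are all hypothetical, and whether the combined group counts towards tgroups is an assumption left open here.

```r
# Hypothetical gene-type labels, one per gene (illustrative, not real annotation)
gene_type <- c(rep("protein_coding", 70), rep("lincRNA", 20),
               rep("pseudogene", 6), rep("snoRNA", 3), rep("miRNA", 1))

# Rule 1 (min.gt.freq): merge every type whose relative frequency is below
# the threshold into a single combined group
min.gt.freq <- 0.05
rel_freq <- prop.table(table(gene_type))
rare <- names(rel_freq)[rel_freq < min.gt.freq]
by_freq <- ifelse(gene_type %in% rare, "merged", gene_type)

# Rule 2 (tgroups): keep the tgroups most frequent types separate and
# combine all remaining types
tgroups <- 2
counts <- sort(table(gene_type), decreasing = TRUE)
top <- names(counts)[seq_len(min(tgroups, length(counts)))]
by_rank <- ifelse(gene_type %in% top, gene_type, "merged")

# Group sizes under each rule
print(table(by_freq))
print(table(by_rank))
```

With these made-up counts, the first rule merges only snoRNA and miRNA (3% and 1% of genes), while the second keeps protein_coding and lincRNA separate and combines the other three types.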