
derfinder (version 1.0.10)

analyzeChr: Run the derfinder analysis on a chromosome

Description

This is a major wrapper for running several key functions from this package. It is meant to be used after loadCoverage has been used for a specific chromosome. The steps run include makeModels, preprocessCoverage, calculateStats, calculatePvalues and annotateNearest.

Usage

analyzeChr(chr, coverageInfo, models, cutoffPre = 5, cutoffFstat = 1e-08, cutoffType = "theoretical", nPermute = 1, seeds = as.integer(gsub("-", "", Sys.Date())) + seq_len(nPermute), groupInfo, subject = "hg19", writeOutput = TRUE, runAnnotation = TRUE, lowMemDir = file.path(chr, "chunksDir"), ...)
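The default for seeds is easier to follow with concrete numbers. The sketch below evaluates the same expression for a fixed date; `runDate` is a hypothetical stand-in for `Sys.Date()`, used here only so the result does not change from day to day.

```r
## Hypothetical fixed date standing in for Sys.Date()
runDate <- as.Date("2015-03-14")
nPermute <- 3

## Same expression as the default for 'seeds', with runDate substituted:
## the date is written as YYYYMMDD and one seed is generated per permutation
seeds <- as.integer(gsub("-", "", runDate)) + seq_len(nPermute)
seeds
```

With these values the seeds are 20150315, 20150316 and 20150317. Because the default depends on the current date, passing an explicit seeds vector is the safer choice when a run has to be reproduced on a later day.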

Arguments

chr
Used for naming the output files when writeOutput=TRUE and for annotateNearest.
coverageInfo
A list containing a DataFrame ($coverage) with the coverage data and a logical Rle ($position) with the positions that passed the cutoff. This object is generated using loadCoverage.
models
The output from makeModels.
cutoffPre
This argument is passed to preprocessCoverage (cutoff).
cutoffFstat
This is used to determine the cutoff argument of calculatePvalues, and its behaviour is determined by cutoffType.
cutoffType
If set to empirical, the cutoffFstat quantile (example: 0.99) of the observed F-statistics is used, calculated via quantile. If set to theoretical, cutoffFstat (example: 1e-08) is converted to a theoretical cutoff via qf. If set to manual, cutoffFstat is passed to calculatePvalues without any other calculation.
nPermute
The number of permutations. Note that for a full chromosome, a small number (10) of permutations is sufficient. If set to 0, no permutations are performed and thus no null regions are used; however, the $regions component is still created.
seeds
An integer vector of length nPermute specifying the seeds to be used for each permutation. If NULL, no seeds are used.
groupInfo
A factor specifying the group membership of each sample that can later be used with the plotting functions in the derfinderPlot package.
subject
This argument is passed to annotateNearest. Note that only hg19 works right now.
writeOutput
If TRUE, output Rdata files are created at each step inside a directory with the chromosome name (example: 'chr21' if chr='21'). One Rdata file is created for each component described in the return section.
runAnnotation
If TRUE, annotateNearest is run. Otherwise this step is skipped.
lowMemDir
If specified, each chunk is saved into a separate Rdata file under lowMemDir and later loaded in fstats.apply when running calculateStats and calculatePvalues. Using this option helps reduce the memory load, as each fork in bplapply loads only the data needed for processing its chunk. The downside is a slightly longer computation time due to input/output.
...
Arguments passed to other methods and/or advanced arguments.
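To illustrate the three cutoffType settings described above, here is a minimal base R sketch of how an F-statistic cutoff could be derived in each case. It is not the package's own code: the simulated `fstats` vector, the sample size `n`, the model sizes `df1` and `df0`, and the helper `chooseCutoff()` are all hypothetical. The degrees of freedom passed to qf, and the use of the upper tail, are assumptions about how a small threshold such as 1e-08 would be turned into a cutoff.

```r
## Hypothetical inputs: 12 samples, an alternative model with 3 columns
## and a null model with 2 columns; F-statistics are simulated.
set.seed(20150314)
n <- 12
df1 <- 3
df0 <- 2
fstats <- rf(1000, df1 - df0, n - df1)

## Hypothetical helper, not part of derfinder
chooseCutoff <- function(cutoffFstat, cutoffType, fstats, n, df1, df0) {
    cutoffType <- match.arg(cutoffType,
        c("theoretical", "empirical", "manual"))
    switch(cutoffType,
        ## Empirical: quantile of the observed F-statistics
        empirical = unname(quantile(fstats, cutoffFstat)),
        ## Theoretical (assumed form): upper-tail F distribution quantile
        theoretical = qf(cutoffFstat, df1 - df0, n - df1,
            lower.tail = FALSE),
        ## Manual: use the supplied value unchanged
        manual = cutoffFstat
    )
}

empCut <- chooseCutoff(0.99, "empirical", fstats, n, df1, df0)
theoCut <- chooseCutoff(1e-08, "theoretical", fstats, n, df1, df0)
manCut <- chooseCutoff(1, "manual", fstats, n, df1, df0)
c(empirical = empCut, theoretical = theoCut, manual = manCut)
```

The point of the sketch is that the same argument, cutoffFstat, is read on three different scales: a quantile level for empirical, a tail probability for theoretical, and the F-statistic itself for manual.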

Value

If returnOutput=TRUE, a list with six components:
timeinfo
The wallclock timing information for each step.
optionsStats
The main options used when running this function.
coveragePrep
The output from preprocessCoverage.
fstats
The output from calculateStats.
regions
The output from calculatePvalues.
annotation
The output from annotateNearest.
These are the same components that are written to Rdata files if writeOutput=TRUE.

Details

If you are working with data from an organism other than 'Homo sapiens', specify so by setting the global 'species' and 'chrsStyle' options. For example:

    options(species = 'arabidopsis_thaliana')
    options(chrsStyle = 'NCBI')
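Since these are global options, they stay in effect for the rest of the R session. A minimal sketch of setting them temporarily and restoring the previous values afterwards, using only base R; the option names are the ones given above, and the restore pattern is a suggestion rather than something the package requires:

```r
## options() invisibly returns the previous values, so they can be restored
oldOpts <- options(species = "arabidopsis_thaliana", chrsStyle = "NCBI")

## Check what is currently set
currentSpecies <- getOption("species")
currentStyle <- getOption("chrsStyle")

## ... run the analysis here ...

## Restore the earlier settings (options that were unset become unset again)
options(oldOpts)
```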

See Also

makeModels, preprocessCoverage, calculateStats, calculatePvalues, annotateNearest

Examples

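## Load the package, which provides the genomeData and genomeInfo example data
library('derfinder')

## Collapse the coverage information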
collapsedFull <- collapseFullCoverage(list(genomeData$coverage),
    verbose = TRUE)

## Calculate library size adjustments
sampleDepths <- sampleDepth(collapsedFull, probs = c(0.5), nonzero=TRUE,
    verbose=TRUE)

## Build the models
groupInfo <- genomeInfo$pop
adjustvars <- data.frame(genomeInfo$gender)
models <- makeModels(sampleDepths, testvars=groupInfo, adjustvars=adjustvars)

## Analyze the chromosome
results <- analyzeChr(chr='21', coverageInfo=genomeData, models=models,
    cutoffFstat=1, cutoffType='manual', groupInfo=groupInfo, mc.cores=1,
    writeOutput=FALSE, returnOutput=TRUE, method='regular')
names(results)
