RCNA (version 1.0)

correct_gc_bias: Estimate and correct GC bias in coverage

Description

This generic function calculates and corrects GC-content-based coverage bias.

This function optionally estimates the GC bias using a sliding-window approach and then corrects the coverage for it. The correction is based on a GC-content factor file that is either generated by the function or provided by the user. It creates a GC factor file and a corrected coverage file, both of which are placed in the output directory under `/gc`.
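
As a rough illustration of the sliding-window idea, the sketch below computes the GC fraction of every window of `win.size` bases along a toy sequence. The helper `window_gc` and the sequence are hypothetical, written only for this example; this is not RCNA's internal code, and the package may compute GC content differently.

```r
# Hypothetical helper (not part of RCNA): GC fraction of each sliding
# window of `win.size` consecutive bases in a single sequence string.
window_gc <- function(sequence, win.size = 75) {
  bases <- strsplit(toupper(sequence), "")[[1]]
  is_gc <- as.numeric(bases %in% c("G", "C"))
  n <- length(is_gc)
  if (n < win.size) return(numeric(0))
  # A cumulative sum lets each window total be read off with one subtraction.
  csum <- c(0, cumsum(is_gc))
  starts <- seq_len(n - win.size + 1)
  (csum[starts + win.size] - csum[starts]) / win.size
}

# Toy sequence, for illustration only.
toy_seq <- "GGCCATATGCGC"
gc <- window_gc(toy_seq, win.size = 4)
print(gc)
```

Each element of `gc` is the GC fraction of one window, so a 12-base sequence with `win.size = 4` yields 9 values.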

Usage

correct_gc_bias(obj, ...)

# S3 method for default
correct_gc_bias(
  obj = NULL,
  df = NULL,
  sample.names = NULL,
  ano.file,
  out.dir = NULL,
  ncpus = 1,
  file.raw.coverage = NULL,
  file.corrected.coverage = NULL,
  file.gc.factor = NULL,
  win.size = 75,
  gc.step = 0.01,
  estimate_gc = TRUE,
  verbose = FALSE,
  ...
)

# S3 method for RCNA_object
correct_gc_bias(obj, verbose = FALSE, ...)

Value

An RCNA_analysis class object that describes the input parameters and output files generated by this step of the workflow.

Arguments

obj

An RCNA_object. When supplied, parameters are pulled from the object instead of from the other arguments, specifically from its `gcParams` slot.

...

Additional arguments (unused)

df

Path to the config file, or a `data.frame` object containing the valid parameters. Valid column names are `file.raw.coverage`, `file.gc.factor`, `file.corrected.coverage`, and `sample.names`. Additional columns will be ignored.
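
For illustration, a `data.frame` with the four documented column names could be assembled as below. The sample names and file paths are hypothetical placeholders; only the column names come from this page.

```r
# Hypothetical sample names and file paths, for illustration only.
samples <- c("sampleA", "sampleB")

gc_df <- data.frame(
  sample.names = samples,
  file.raw.coverage = file.path("coverage", paste0(samples, ".txt.gz")),
  file.corrected.coverage = file.path("output", "gc",
                                      paste0(samples, ".corrected.txt.gz")),
  file.gc.factor = file.path("output", "gc",
                             paste0(samples, ".gc.factor.txt")),
  stringsAsFactors = FALSE
)

# Confirm that every column is one of the documented names.
valid_cols <- c("file.raw.coverage", "file.gc.factor",
                "file.corrected.coverage", "sample.names")
stopifnot(all(names(gc_df) %in% valid_cols))
```

An object like `gc_df` would then be passed as the `df` argument in place of the individual `sample.names` and `file.*` vectors.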

sample.names

Character vector of sample names. Alternatively, these can be specified in `df`.

ano.file

Location of the annotation file. This file must be in CSV format and contain the following information (with column headers as specified): "feature,chromosome,start,end".
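
A minimal sketch of writing an annotation file with the required header follows. The feature names, chromosome labels, and coordinates are made up for illustration, and the exact chromosome naming RCNA expects is an assumption not confirmed by this page.

```r
# Hypothetical features and coordinates; only the four header names
# ("feature", "chromosome", "start", "end") come from the documentation.
ano <- data.frame(
  feature = c("geneA_exon1", "geneA_exon2"),
  chromosome = c("chr1", "chr1"),
  start = c(1000L, 2500L),
  end = c(1200L, 2750L),
  stringsAsFactors = FALSE
)

# Write the file to a temporary location as plain, unquoted CSV.
ano_path <- file.path(tempdir(), "annotations.csv")
write.csv(ano, ano_path, row.names = FALSE, quote = FALSE)

# Read it back to check the header line and basic coordinate sanity.
header <- readLines(ano_path, n = 1)
ano_check <- read.csv(ano_path, stringsAsFactors = FALSE)
stopifnot(all(ano_check$start <= ano_check$end))
```

The path in `ano_path` is what would be supplied as `ano.file`.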

out.dir

Output directory for results. A subdirectory for the results of this step will be created under it at `/gc/`.

ncpus

Integer number of CPUs to use. Specifying more than one allows this function to be parallelized by feature.

file.raw.coverage

Character vector listing the raw input coverage files. Must be the same length as `sample.names`. Alternatively, these can be specified in `df`.

file.corrected.coverage

Character vector listing the corrected coverage files to be written. If not specified, new names will be generated based on the raw coverage files.

file.gc.factor

Character vector listing the GC factor files used to correct coverage. If `estimate_gc=FALSE` then this must be provided. Otherwise it is ignored.

win.size

Size in base pairs of the sliding window used to estimate and correct the GC bias.

gc.step

Bin size for GC bias in the GC factor file. If the GC factor file is provided then the file must have corresponding bin sizes.
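
To make the role of `gc.step` concrete, the sketch below bins per-window GC fractions into bins of width `gc.step` and derives one correction factor per bin. The factor definition used here (overall mean coverage divided by the mean coverage in the bin) is a common scheme and is an assumption; this page does not state RCNA's exact formula or its factor file layout. All data values are invented.

```r
gc.step <- 0.01

# Hypothetical per-window GC fractions and raw coverage values.
gc_frac  <- c(0.303, 0.307, 0.452, 0.458, 0.605)
coverage <- c(80, 90, 100, 110, 150)

# Assign each window to a GC bin of width gc.step.
# The small offset guards against floating-point error at bin edges.
bin <- floor(gc_frac / gc.step + 1e-9)

# Assumed factor: overall mean coverage / mean coverage within the bin.
bin_mean  <- tapply(coverage, bin, mean)
gc_factor <- mean(coverage) / bin_mean

# Scale each window's coverage by the factor of its bin.
corrected <- as.numeric(coverage * gc_factor[as.character(bin)])

print(data.frame(bin = bin, coverage = coverage, corrected = corrected))
```

Under this scheme every bin has the same mean coverage after correction, which is why a user-supplied factor file has to use the same bin width as `gc.step`.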

estimate_gc

Logical determining if GC content estimation should be performed. If set to `FALSE` then a factor file must be provided via `file.gc.factor` or in `df`.

verbose

Logical. If set to `TRUE`, more detail will be displayed.

Details

This function can be run as a stand-alone step or as part of run_RCNA.

The `df` argument corresponds to the `gcParams` slot of an RCNA_object. Valid column names are `sample.names`, `file.raw.coverage`, `file.corrected.coverage`, and `file.gc.factor`. The `file.gc.factor` column is not required if `estimate_gc=TRUE`. Additional columns will be ignored.

For more parameter information, see estimate_nkr.default.

See Also

RCNA_object, RCNA_analysis, run_RCNA

Examples

## Run GC-bias estimation and correction on the example object
# See ?example_obj for more information on the example
library(RCNA)
example_obj@ano.file <- system.file("examples", "annotations-example.csv",
                                    package = "RCNA")
raw.cov <- system.file("examples", "coverage",
                       paste0(example_obj@sample.names, ".txt.gz"),
                       package = "RCNA")
example_obj@gcParams$file.raw.coverage <- raw.cov
example_obj
# Create output directory
dir.create(file.path("output", "gc"), recursive = TRUE)
# Estimate and correct GC bias, append results
correct_gc_analysisObj <- correct_gc_bias(example_obj)
example_obj@commands <- c(example_obj@commands, correct_gc_analysisObj)
# Remove the output directory (works on all platforms)
unlink("output", recursive = TRUE)