CODEX (version 1.4.0)

qc: Quality control procedure for depth of coverage

Description

Applies a quality control procedure to the depth of coverage matrix both sample-wise and exon-wise before normalization.

Usage

qc(Y, sampname, chr, ref, mapp, gc, cov_thresh, length_thresh, mapp_thresh, gc_thresh)

Arguments

Y
Original read depth matrix returned from getcoverage
sampname
Vector of sample names returned from getbambed
chr
Chromosome.
ref
IRanges object specifying exonic positions returned from getbambed
mapp
Vector of mappability for each exon returned from getmapp
gc
Vector of GC content for each exon returned from getgc
cov_thresh
Vector specifying the lower and upper bounds of exonic median coverage for QC. c(20, 4000) recommended.
length_thresh
Vector specifying the lower and upper bounds of exonic length for QC. c(20, 2000) recommended.
mapp_thresh
Scalar variable specifying exonic mappability threshold for QC. 0.9 recommended.
gc_thresh
Vector specifying the lower and upper bounds of exonic GC content for QC. c(20, 80) recommended.

Value

Y_qc
Updated Y after QC
sampname_qc
Updated sampname after QC
gc_qc
Updated gc after QC
mapp_qc
Updated mapp after QC
ref_qc
Updated ref after QC
qcmat
Matrix specifying results of exon-wise QC procedures

Details

If multiple batches exist, it is suggested that analysis by CODEX be carried out batch by batch. In the exon-wise step, CODEX filters out exons that: have extreme coverage (median read depth across all samples less than 20 or greater than 4000); are extremely short (less than 20 bp); are extremely hard to map (mappability less than 0.9); or have extreme GC content (less than 20 or greater than 80). These filtering thresholds are recommended values and can be user-defined to suit different sequencing protocols.
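
The exon-wise filters described above can be sketched in base R. This is only an illustration of the thresholding logic under stated assumptions, not the CODEX implementation of qc: the function name exon_filter_sketch, its arguments, its return value, and the toy data are all hypothetical, and exon lengths are passed as a plain numeric vector rather than taken from an IRanges object.

```r
# Illustrative sketch only: NOT the CODEX implementation of qc().
# exon_filter_sketch and every name below are hypothetical.
exon_filter_sketch <- function(Y, exon_length, mapp, gc,
                               cov_thresh = c(20, 4000),
                               length_thresh = c(20, 2000),
                               mapp_thresh = 0.9,
                               gc_thresh = c(20, 80)) {
  # Median read depth of each exon (row) across all samples (columns).
  med_cov <- apply(Y, 1, median)

  # One logical flag per exon and per criterion; TRUE means the exon passes.
  pass_cov <- med_cov >= cov_thresh[1] & med_cov <= cov_thresh[2]
  pass_length <- exon_length >= length_thresh[1] &
    exon_length <= length_thresh[2]
  pass_mapp <- mapp >= mapp_thresh
  pass_gc <- gc >= gc_thresh[1] & gc <= gc_thresh[2]

  # An exon is kept only if it passes every criterion.
  keep <- pass_cov & pass_length & pass_mapp & pass_gc

  list(Y_keep = Y[keep, , drop = FALSE],
       keep = keep,
       flags = cbind(pass_cov, pass_length, pass_mapp, pass_gc))
}

# Toy data (made-up numbers): 4 exons x 3 samples.
Y_toy <- matrix(c(100, 120, 110,
                    5,   8,   6,
                  300, 280, 310,
                  150, 160, 155), nrow = 4, byrow = TRUE)

res <- exon_filter_sketch(Y_toy,
                          exon_length = c(150, 200, 10, 300),
                          mapp = c(1, 1, 1, 0.5),
                          gc = c(50, 45, 60, 55))
res$keep   # only the first toy exon passes all four filters
res$flags  # per-criterion pass/fail flags for each exon
```

In this toy input the second exon fails on coverage, the third on length, and the fourth on mappability. Keeping one flag per criterion, as in the hypothetical flags matrix, makes it easy to see why a given exon was dropped.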

See Also

getbambed, getgc, getmapp

Examples

library(CODEX)  # provides qc() and the demo objects used below
Y <- coverageObjDemo$Y
sampname <- bambedObjDemo$sampname
chr <- bambedObjDemo$chr
ref <- bambedObjDemo$ref
gc <- gcDemo
mapp <- mappDemo
cov_thresh <- c(20, 4000)
length_thresh <- c(20, 2000)
mapp_thresh <- 0.9
gc_thresh <- c(20, 80)
qcObj <- qc(Y, sampname, chr, ref, mapp, gc, cov_thresh, length_thresh, 
    mapp_thresh, gc_thresh)
Y_qc <- qcObj$Y_qc
sampname_qc <- qcObj$sampname_qc
gc_qc <- qcObj$gc_qc
mapp_qc <- qcObj$mapp_qc
ref_qc <- qcObj$ref_qc
qcmat <- qcObj$qcmat