Learn R Programming

ctDNAtools (version 0.4.0)

get_background_rate: Computes the background mismatch rate from a bam file

Description

Runs through the target regions base by base counting the mismatches. Then it divides sum(mismatches)/sum(depths) for all bases in the targets

Usage

get_background_rate(bam, targets, reference, vaf_threshold = 0.1,
  tag = "", black_list = NULL, substitution_specific = TRUE,
  min_base_quality = 20, max_depth = 1e+05, min_mapq = 30)

Arguments

bam

path to bam file

targets

a data frame with the target regions. Must have three columns: chr, start and end

reference

the reference genome in BSgenome format

vaf_threshold

the bases with higher than this VAF threshold will be ignored in the calculation (real mutations)

tag

the RG tag if the bam has more than one sample

black_list

a character vector of genomic loci of format chr_pos if substitution_specific is false, or chr_pos_ref_alt if substitution_specific is true. The background will be computed on the target regions after excluding black_list loci.

substitution_specific

logical, whether to have the loci of black_list by substitutions.

min_base_quality

minimum base quality for a read to be counted

max_depth

maximum depth above which sampling will happen

min_mapq

the minimum mapping quality for a read to be counted

Value

a list containing the general mismatch rate and substitution-specific rates

Details

Computes the background rate of the input bam file for all bases in the specified targets. Substitutions-specific rates are also calculated.

Genomic positions having non-reference allele frequency higher than vaf_threshold will be excluded (to exclude SNPs and real mutations).

If a black_list is specified, the positions in the black_list (whether substitution_specific or not) will be excluded before computing the background rate.

See Also

create_black_list test_ctDNA create_background_panel

Examples

Run this code
# NOT RUN {
## Load example data
data("targets", package = "ctDNAtools")
bamT1 <- system.file("extdata", "T1.bam", package = "ctDNAtools")
bamN1 <- system.file("extdata", "N1.bam", package = "ctDNAtools")
bamN2 <- system.file("extdata", "N2.bam", package = "ctDNAtools")
bamN3 <- system.file("extdata", "N3.bam", package = "ctDNAtools")

## Use human reference genome from BSgenome.Hsapiens.UCSC.hg19 library
suppressMessages(library(BSgenome.Hsapiens.UCSC.hg19))

## basic usage
get_background_rate(bamT1, targets, BSgenome.Hsapiens.UCSC.hg19)

## more options
get_background_rate(bamT1, targets, BSgenome.Hsapiens.UCSC.hg19,
  min_base_quality = 30, min_mapq = 40, vaf_threshold = 0.05
)

## with blacklist
bg_panel <- create_background_panel(
  bam_list = c(bamN1, bamN2, bamN3),
  targets = targets, reference = BSgenome.Hsapiens.UCSC.hg19,
  substitution_specific = TRUE
)

bl2 <- create_black_list(bg_panel,
  mean_vaf_quantile = 0.99,
  min_samples_one_read = 2, min_samples_two_reads = 1
)

get_background_rate(bamT1, targets, BSgenome.Hsapiens.UCSC.hg19,
  black_list = bl2
)
# }
# NOT RUN {
# }

Run the code above in your browser using DataLab