BIGr (version 0.6.2)

filterMADC: Filter MADC Files

Description

Filter and process MADC files to remove low-quality microhaplotypes.

Usage

filterMADC(
  madc_file,
  min.mean.reads = NULL,
  max.mean.reads = NULL,
  max.mhaps.per.loci = NULL,
  min.reads.per.site = 1,
  min.ind.with.reads = NULL,
  target.only = FALSE,
  n.summary.columns = NULL,
  output.file = NULL
)

Value

A data.frame of the filtered MADC data, or a saved CSV file when output.file is provided

Arguments

madc_file

Path to the MADC file to be filtered

min.mean.reads

Minimum mean read depth for filtering

max.mean.reads

Maximum mean read depth for filtering

max.mhaps.per.loci

Maximum number of matching mhaps per target locus. At sites that exceed the max.mhaps.per.loci threshold, only the target Ref and Alt mhaps are retained.

min.reads.per.site

Minimum number of reads per site, used together with min.ind.with.reads. This parameter is ignored when min.ind.with.reads is not set.

min.ind.with.reads

Minimum number of individuals that must have min.reads.per.site reads for a site to be kept

target.only

Logical indicating whether to filter for target loci only

n.summary.columns

(optional) Number of summary columns to remove from MADC file not including the first three. Otherwise, the columns will be automatically detected and removed.

output.file

Path to save the filtered data (if NULL, data will not be saved)
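
The interaction between min.reads.per.site and min.ind.with.reads can be illustrated with a small base R sketch. This is not the BIGr implementation: the toy table, its column names, and the choice of inclusive (>=) thresholds are all assumptions made for illustration.

```r
# Toy read-count table: rows are microhaplotypes, columns are samples.
# All names and values are hypothetical.
counts <- data.frame(
  AlleleID = c("locus1|Ref", "locus1|Alt", "locus2|Ref"),
  S1 = c(12, 0, 3),
  S2 = c(8, 0, 0),
  S3 = c(15, 2, 0),
  stringsAsFactors = FALSE
)

min.reads.per.site <- 2  # a sample counts only if it has at least 2 reads
min.ind.with.reads <- 2  # keep rows where at least 2 samples count

sample_cols <- setdiff(names(counts), "AlleleID")

# For each row, count the samples that reach the per-site read minimum
# (inclusive comparison is an assumption, not confirmed by the package docs)
n_ind <- rowSums(counts[, sample_cols] >= min.reads.per.site)

# Keep only rows supported by enough individuals
kept <- counts[n_ind >= min.ind.with.reads, ]
```

In this sketch, min.reads.per.site has no effect on its own: it only defines which samples are counted toward min.ind.with.reads, which matches the argument descriptions above.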

Details

This function can filter raw MADC files or pre-processed MADC files with fixed allele IDs. Additionally, it can filter based on mean read depth, number of mhaps per target loci, and other criteria. Optionally, users can plot summary statistics and save the filtered data to a file.
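
As a rough illustration of the two main criteria named above, the base R sketch below applies a mean read depth filter and a maximum-mhaps-per-locus filter to a toy table. It is a sketch under stated assumptions, not the package's code: the CloneID and AlleleID columns, the Ref/Alt/RefMatch/AltMatch naming pattern, the per-row mean, and the inclusive depth bounds are all hypothetical choices.

```r
# Hypothetical toy table; the layout only mimics a MADC file.
madc <- data.frame(
  CloneID  = c("A", "A", "A", "A", "B", "B"),
  AlleleID = c("A|Ref_0001", "A|Alt_0002", "A|RefMatch_0001",
               "A|AltMatch_0001", "B|Ref_0001", "B|Alt_0002"),
  S1 = c(20, 18, 4, 2, 300, 280),
  S2 = c(22, 16, 6, 0, 320, 260),
  stringsAsFactors = FALSE
)
sample_cols <- c("S1", "S2")

# 1. Mean read depth filter (mean taken per row; an assumption)
min.mean.reads <- 1
max.mean.reads <- 100
mean_reads <- rowMeans(madc[, sample_cols])
depth_ok <- mean_reads >= min.mean.reads & mean_reads <= max.mean.reads
depth_filtered <- madc[depth_ok, ]

# 2. Maximum mhaps per locus: where a locus has more rows than the
#    threshold (Ref and Alt included in the count), keep only the
#    target Ref and Alt rows for that locus.
max.mhaps.per.loci <- 3
n_per_locus <- table(madc$CloneID)
over_limit <- as.vector(n_per_locus[madc$CloneID]) > max.mhaps.per.loci
is_target <- grepl("\\|(Ref|Alt)_", madc$AlleleID)  # excludes RefMatch/AltMatch
mhap_filtered <- madc[!over_limit | is_target, ]
```

Here locus B is dropped by the depth filter because its mean depth is above the upper bound, while locus A loses its RefMatch and AltMatch rows under the mhap filter because it has four mhaps against a limit of three.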

Examples

#Example

#Example MADC
madc_file <- system.file("example_MADC_FixedAlleleID.csv", package="BIGr")

#Remove mhaps exceeding 3 per target region including the ref and alt target mhaps
filtered_df <- filterMADC(madc_file,
                         min.mean.reads = NULL,
                         max.mean.reads = NULL,
                         max.mhaps.per.loci = 3,
                         min.reads.per.site = 1,
                         min.ind.with.reads = NULL,
                         target.only = FALSE,
                         n.summary.columns = NULL,
                         output.file = NULL)


