Uniquorn (version 1.0.8)

identify_vcf_file: identify_VCF_file

Description

Identifies the cancer cell line(s) contained in a VCF file based on the pattern (start and length) of all contained mutations/variations.

Usage

identify_vcf_file(
  vcf_file,
  output_file = "",
  ref_gen = "GRCH37",
  minimum_matching_mutations = 0,
  mutational_weight_inclusion_threshold = 1.0,
  only_first_candidate = FALSE,
  write_xls = FALSE,
  output_bed_file = FALSE,
  manual_identifier_bed_file = "",
  verbose = FALSE,
  p_value = .05,
  q_value = .05,
  confidence_score = 25.0
)

Arguments

vcf_file
Input vcf file. Only one sample column allowed.
output_file
Path of the output file. If blank, autogenerated as name of input file plus '_uniquorn_ident.tab' suffix.
ref_gen
Reference genome version. All training sets are associated with a reference genome version. Default: GRCH37
minimum_matching_mutations
The minimum number of mutations that have to match between the query and a training sample for a positive prediction.
mutational_weight_inclusion_threshold
Include only mutations with a weight of at least this value. Range: 0.0 to 1.0. A weight of 1 means the mutation is unique to one cancer cell line (CL); a weight near 0 means it is found in many CL samples.
only_first_candidate
If TRUE, only the CL identifier with the highest score is predicted to be present in the sample.
write_xls
Additionally write the identification results to an xls file for easier reading.
output_bed_file
Whether BED files for IGV visualization should be created for the cancer cell lines that pass the threshold.
manual_identifier_bed_file
Manually enter a vector of CL name(s) whose BED files should be created, regardless of whether they pass the detection threshold.
verbose
Print additional information
p_value
Required p-value for identification
q_value
Required q-value for identification
confidence_score
Threshold above which a positive prediction occurs. Default: 25.0

Value

R table with the statistics of the identification result.

Details

identify_vcf_file parses the VCF file and predicts the identity of the sample.

Examples

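library(Uniquorn)

# Path to the example HT29 VCF file shipped with the package
HT29_vcf_file <- system.file("extdata/HT29.vcf.gz", package = "Uniquorn")
# Identify the cell line(s) in the VCF file using the default settings
identification <- identify_vcf_file(HT29_vcf_file)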
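
The arguments described above can also be set explicitly. The following is a minimal sketch rather than a recommended configuration: the argument names come from the Usage section, while the output path, the stricter confidence_score of 50.0 and the other values are illustrative assumptions. The call is guarded so that the snippet is skipped when the Uniquorn package is not installed.

```r
# Illustrative output location (an assumption, not a package default)
output_path <- file.path(tempdir(), "HT29_uniquorn_ident.tab")

# Run only when the Uniquorn package is available
if (requireNamespace("Uniquorn", quietly = TRUE)) {
  library(Uniquorn)

  # Example HT29 VCF file shipped with the package
  HT29_vcf_file <- system.file("extdata/HT29.vcf.gz", package = "Uniquorn")

  # Argument names are taken from the Usage section; the values are examples only
  strict_identification <- identify_vcf_file(
    vcf_file = HT29_vcf_file,
    output_file = output_path,     # write results here instead of the autogenerated path
    ref_gen = "GRCH37",            # documented default reference genome
    only_first_candidate = TRUE,   # report only the top-scoring cell line
    confidence_score = 50.0,       # stricter than the documented default of 25.0
    verbose = TRUE                 # print additional information
  )

  # Inspect the structure of the returned identification table
  str(strict_identification)
}
```

Raising confidence_score and setting only_first_candidate = TRUE should make the call more conservative than the default run shown above, which may be useful when only the single most likely cell line is of interest.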