Simulation of human populations based on single nucleotide variation.
STUDY_ACROSS_POPULATIONS(
kmer.table,
kmer.cutoff = 5,
genome.name,
k,
db = "refseq",
central.pattern = NULL,
population.size = 1e+06,
selected.genes,
add.to.existing.population = FALSE,
output.dir = "study_across_populations/",
population.snv.dt = NULL,
loop.chr = TRUE,
plot = FALSE,
fasta.path
)
An output directory containing plots.
A data.table containing the k-mer table.
Percentage of extreme k-mers to study. Defaults to 5.
UCSC genome name.
K-mer size.
Database used by UCSC to generate gene predictions: "refseq" or "gencode". Default is "refseq".
Central patterns of the k-mers. Default is NULL.
Size of population to simulate. Default is 1 million.
Set of genes to study e.g. skin cancer genes.
Add counts to counts.csv? Default is FALSE.
A directory for the outputs. Defaults to "study_across_populations/".
Population SNV table. Default is NULL.
Loop over chromosomes? Default is TRUE. If FALSE, beware of a memory spike caused by the VCF content, because the VCF contains zero counts for every population. In that case, input a pre-computed, trimmed version of population.snv.dt.
Boolean. If TRUE, plots the results. Default is FALSE.
Path to a directory of user-provided genome FASTA files or the destination to save the NCBI/UCSC downloaded reference genome files.
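The call below is a minimal sketch of how the arguments above might be put together. All values are illustrative assumptions: the genome name "hg19", the k-mer size 8, the gene symbols, the "genomes/" directory, and the single-column layout of the toy k-mer table are hypothetical and not taken from this documentation. The call is guarded so that it only runs when STUDY_ACROSS_POPULATIONS and data.table are available in the session.

```r
# Sketch only: every value below is an illustrative assumption.
args <- list(
  kmer.cutoff = 5,                      # percentage of extreme k-mers to study
  genome.name = "hg19",                 # assumed UCSC genome name
  k = 8,                                # assumed k-mer size
  db = "refseq",                        # or "gencode"
  population.size = 1e6,                # size of the simulated population
  selected.genes = c("BRAF", "NRAS"),   # hypothetical gene set; expected format is an assumption
  output.dir = "study_across_populations/",
  loop.chr = TRUE,                      # documented default; FALSE may cause a memory spike
  plot = FALSE,
  fasta.path = "genomes/"               # hypothetical FASTA directory
)

# Run only when the function and data.table are available.
if (exists("STUDY_ACROSS_POPULATIONS", mode = "function") &&
    requireNamespace("data.table", quietly = TRUE)) {
  # Hypothetical k-mer table; the real column layout is not documented here.
  args$kmer.table <- data.table::data.table(kmer = c("AAAAAAAA", "CCCCCCCC"))
  do.call(STUDY_ACROSS_POPULATIONS, args)
}
```

Building the arguments as a list and passing them through do.call keeps the settings in one place, so a second run (for example with loop.chr = FALSE and a pre-computed population.snv.dt) only needs the relevant entries changed.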