kmeRtone (version 1.0)

kmeRtone: kmeRtone all-in-one user interface

Description

This function serves as an all-in-one interface for various genomic data analyses leveraging k-mer based techniques.

Usage

kmeRtone(
  case.coor.path,
  genome.name,
  strand.sensitive,
  k,
  ctrl.rel.pos = c(80, 500),
  case.pattern,
  output.dir = "output/",
  case,
  genome,
  control,
  control.path,
  genome.path,
  rm.case.kmer.overlaps,
  single.case.len,
  merge.replicates,
  kmer.table,
  module = "score",
  rm.dup = TRUE,
  case.coor.1st.idx = 1,
  ctrl.coor.1st.idx = 1,
  coor.load.limit = 1,
  genome.load.limit = 1,
  genome.fasta.style = "UCSC",
  genome.ncbi.db = "refseq",
  use.UCSC.chr.name = FALSE,
  verbose = TRUE,
  kmer.cutoff = 5,
  selected.extremophiles,
  other.extremophiles,
  cosmic.username,
  cosmic.password,
  tumour.type.regex = NULL,
  tumour.type.exact = NULL,
  cell.type = "somatic",
  genic.elements.counts.dt,
  population.size = 1e+06,
  selected.genes,
  add.to.existing.population = FALSE,
  population.snv.dt = NULL,
  pop.plot = TRUE,
  pop.loop.chr = FALSE
)
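As a rough illustration of the signature above, a call for the default "score" module might be assembled as follows. This is a minimal sketch, not a tested recipe: the folder path is a hypothetical placeholder, and which additional arguments a given module requires is not stated on this page.

```r
# Arguments taken from the Usage signature; the values are illustrative only.
args <- list(
  case.coor.path   = "data/case_coordinates/",  # hypothetical folder of coordinate files
  genome.name      = "hg38",
  strand.sensitive = TRUE,
  k                = 8,
  ctrl.rel.pos     = c(80, 500),
  output.dir       = "output/",
  module           = "score"
)

# Run only if the package is installed and the placeholder folder exists.
if (requireNamespace("kmeRtone", quietly = TRUE) &&
    dir.exists(args$case.coor.path)) {
  result <- do.call(kmeRtone::kmeRtone, args)
}
```

Building the argument list first makes it easy to inspect or reuse the settings before the potentially long-running call is made.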

Value

The return value depends on the selected module.

Arguments

case.coor.path

Path to a folder containing chromosome-separated coordinate files or BED files. Subfolders, or multiple BED files, are assumed to be replicates.

genome.name

Name of the genome (e.g., "hg19", "hg38"). Default is "unknown".

strand.sensitive

Logical value indicating whether strand polarity matters. Default is TRUE.
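To illustrate what strand insensitivity usually means for k-mers, the sketch below collapses each k-mer with its reverse complement into one canonical form. This is a common convention shown in base R under that assumption; it is not taken from kmeRtone's internals, and the helper names are hypothetical.

```r
# Reverse complement of one or more DNA k-mers (uppercase A, C, G, T only).
reverse_complement <- function(kmer) {
  comp <- chartr("ACGT", "TGCA", kmer)
  vapply(strsplit(comp, ""),
         function(x) paste(rev(x), collapse = ""),
         character(1), USE.NAMES = FALSE)
}

# Canonical form: the alphabetically smaller of a k-mer and its reverse complement.
canonical_kmer <- function(kmer) {
  rc <- reverse_complement(kmer)
  ifelse(kmer <= rc, kmer, rc)
}

reverse_complement("AAC")
canonical_kmer(c("AAC", "GTT"))
```

With strand-sensitive data, the two k-mers would be counted separately instead.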

k

Length of k-mer to be investigated. Recommended values are 7 or 8.
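For readers new to the term, a k-mer is a subsequence of length k, and a sequence of length n contains n - k + 1 overlapping k-mers. The base R sketch below (the helper name is hypothetical) extracts them with a sliding window and shows how quickly the number of possible k-mers grows with k.

```r
# Extract all overlapping k-mers from a single sequence.
extract_kmers <- function(sequence, k) {
  n <- nchar(sequence)
  if (n < k) return(character(0))
  starts <- seq_len(n - k + 1)
  substring(sequence, starts, starts + k - 1)
}

kmers <- extract_kmers("ACGTACGT", 3)
table(kmers)   # counts per distinct 3-mer

# Number of possible DNA k-mers for the recommended values of k.
4^c(7, 8)
```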

ctrl.rel.pos

A vector of two integers specifying the range of positions, relative to the case regions, that defines the control regions. Default is c(80, 500).
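One plausible reading of this argument is that control regions lie between the first and second distance on either side of each case position. The sketch below works through that interpretation for a single position; the two-sided layout and the helper name are assumptions, not confirmed behaviour of kmeRtone.

```r
# Assumed interpretation: flanking windows on both sides of each case position.
control_windows <- function(case_pos, ctrl.rel.pos = c(80, 500)) {
  data.frame(
    side  = rep(c("upstream", "downstream"), each = length(case_pos)),
    start = c(case_pos - ctrl.rel.pos[2], case_pos + ctrl.rel.pos[1]),
    end   = c(case_pos - ctrl.rel.pos[1], case_pos + ctrl.rel.pos[2])
  )
}

windows <- control_windows(1000)
windows
```

Under this reading, a case position at 1000 yields an upstream window of 500 to 920 and a downstream window of 1080 to 1500.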

case.pattern

Regular expression pattern for identifying case regions. Default is NULL.

output.dir

Directory path for saving output files. Default is "output/".

case

Optional pre-built Coordinate object.

genome

Optional pre-built Genome object.

control

Optional pre-built control Coordinate object.

control.path

Path for pre-built control Coordinate object.

genome.path

Path to a directory of user-provided genome FASTA files.

rm.case.kmer.overlaps

Logical indicating whether to remove overlapping k-mers in case regions. Default is FALSE.

single.case.len

Integer indicating uniform length of case regions.

merge.replicates

Logical indicating whether to merge replicates. Default is TRUE.

kmer.table

Pre-calculated k-mer score table.

module

Selected kmeRtone module to run. Possible values include "score", "explore", "tune", among others. Default is "score".

rm.dup

Logical indicating whether to remove duplicate coordinates. Default is TRUE.
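The effect of removing duplicate coordinates can be pictured with a small base R example. The column names and values here are made up for illustration, and kmeRtone's own implementation may differ.

```r
# Synthetic coordinate table with one exact duplicate row.
coor <- data.frame(
  chromosome = c("chr1", "chr1", "chr1", "chr2"),
  start      = c(100, 100, 250, 100),
  end        = c(100, 100, 250, 100),
  strand     = c("+", "+", "-", "+")
)

# Keep the first occurrence of each distinct row.
unique_coor <- coor[!duplicated(coor), ]
nrow(unique_coor)
```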

case.coor.1st.idx

Integer specifying the index of the first position in the case coordinates: 0 for zero-based or 1 for one-based indexing. Default is 1.

ctrl.coor.1st.idx

Integer specifying the index of the first position in the control coordinates: 0 for zero-based or 1 for one-based indexing. Default is 1.
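The indexing format matters because BED files use zero-based, half-open intervals, while many other coordinate tables are one-based and closed. The sketch below shows the usual conversion in base R; the helper and column names are hypothetical, and how kmeRtone converts coordinates internally is not described on this page.

```r
# BED-style intervals: zero-based start, exclusive end.
bed <- data.frame(chromosome = "chr1", start = c(0, 99), end = c(10, 100))

# Convert to one-based, closed intervals when the first index is 0.
to_one_based <- function(coor, first.idx) {
  if (first.idx == 0) {
    coor$start <- coor$start + 1  # the end is unchanged: exclusive becomes inclusive
  }
  coor
}

one_based <- to_one_based(bed, first.idx = 0)
one_based
```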

coor.load.limit

Maximum number of coordinates to load. Default is 1.

genome.load.limit

Maximum number of genome sequences to load. Default is 1.

genome.fasta.style

String specifying the style of the genome FASTA. Possible values are "UCSC", "NCBI". Default is "UCSC".

genome.ncbi.db

String specifying the NCBI database to use. Possible values are "refseq", "genbank". Default is "refseq".

use.UCSC.chr.name

Logical indicating whether to use UCSC chromosome names. Default is FALSE.

verbose

Logical indicating whether to display progress messages. Default is TRUE.

kmer.cutoff

Cutoff percentage for k-mer selection in case studies. Default is 5.
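A percentage cutoff of this kind is often applied by keeping the k-mers whose scores fall in the top fraction of the distribution. The sketch below shows that idea on synthetic data; the score column `z`, the helper name, and the choice of the upper tail are all assumptions rather than documented kmeRtone behaviour.

```r
# Synthetic score table: 100 k-mers with scores 1 to 100.
kmer_scores <- data.frame(kmer = sprintf("kmer_%03d", 1:100), z = 1:100)

# Keep k-mers scoring above the (100 - cutoff)th percentile.
select_top_percent <- function(scores, cutoff = 5) {
  threshold <- quantile(scores$z, probs = 1 - cutoff / 100, names = FALSE)
  scores[scores$z > threshold, ]
}

top_kmers <- select_top_percent(kmer_scores, cutoff = 5)
nrow(top_kmers)
```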

selected.extremophiles

Vector of selected extremophile species for study.

other.extremophiles

Vector of other extremophile species for control.

cosmic.username

COSMIC username for accessing the cancer gene census.

cosmic.password

COSMIC password for accessing the cancer gene census.

tumour.type.regex

Regular expression pattern for filtering tumour types. Default is NULL.

tumour.type.exact

Exact tumour type to be included in the cancer gene census. Default is NULL.

cell.type

Cell type to be included in the cancer gene census. Default is "somatic".

genic.elements.counts.dt

Data table of susceptible k-mer counts in genic elements.

population.size

Size of the population for cross-population studies. Default is 1 million.

selected.genes

Selected genes for mutation in cross-population studies.

add.to.existing.population

Logical indicating whether to add to the existing simulated population. Default is FALSE.

population.snv.dt

Data table of single nucleotide variants used in population simulations.

pop.plot

Logical indicating whether to plot the outcome of the cross-population study. Default is TRUE.

pop.loop.chr

Logical indicating whether to loop based on chromosome name in cross-population studies. Default is FALSE.