sigminer (version 1.0.6)

sig_fit: Fit Signature Exposures with Linear Combination Decomposition

Description

The function decomposes a given mutational catalogue V into exposures H over known signatures W by solving the minimization problem min(||W*H - V||), where W and V are known and H is the unknown exposure matrix.
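
As a rough illustration of this objective, the following base R sketch builds a toy W and H, forms V, and evaluates the Frobenius norm of the residual. The helper name frobenius_residual and the toy values are assumptions made for this sketch; they are not part of sigminer.

```r
# Toy signature matrix W (3 components x 2 signatures); columns sum to 1
W <- matrix(c(1, 2, 3, 4, 5, 6), ncol = 2)
W <- apply(W, 2, function(x) x / sum(x))

# Toy exposure matrix H (2 signatures x 4 samples)
H <- matrix(c(2, 5, 3, 6, 1, 9, 1, 2), ncol = 4)

# Catalogue V reconstructed from W and H
V <- W %*% H

# Hypothetical helper: Frobenius norm of the residual W %*% H - V
frobenius_residual <- function(W, H, V) {
  norm(W %*% H - V, type = "F")
}

frobenius_residual(W, H, V)      # the true H leaves no residual
frobenius_residual(W, H + 1, V)  # a perturbed H leaves a positive residual
```

Fitting amounts to searching for the non-negative H that makes this residual as small as possible.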

Usage

sig_fit(
  catalogue_matrix,
  sig,
  sig_index = NULL,
  sig_db = "legacy",
  db_type = c("", "human-exome", "human-genome"),
  show_index = TRUE,
  method = c("QP", "LS", "SA"),
  type = c("absolute", "relative"),
  return_class = c("matrix", "data.table"),
  return_error = FALSE,
  rel_threshold = 0,
  mode = c("SBS", "DBS", "ID", "copynumber"),
  true_catalog = NULL,
  ...
)

Arguments

catalogue_matrix

a numeric matrix V with rows representing components and columns representing samples. Typically, you can take nmf_matrix from the result of sig_tally() and transpose it with t().
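
The orientation can be sketched in base R as follows. The matrix below is a hypothetical stand-in for the nmf_matrix element of a sig_tally() result, assumed here to have samples in rows and components in columns; the sample and component names are made up.

```r
# Hypothetical stand-in for nmf_matrix: rows are samples, columns are components
nmf_matrix <- matrix(
  c(10, 3, 7,
     4, 8, 2),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("samp1", "samp2"), c("comp1", "comp2", "comp3"))
)

# sig_fit() expects components in rows and samples in columns
catalogue_matrix <- t(nmf_matrix)
dim(catalogue_matrix)
```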

sig

a Signature object obtained from either sig_extract or sig_auto_extract, or a raw signature matrix with rows representing components (motifs) and columns representing signatures.

sig_index

a vector of signature indices. Use "ALL" for all signatures.

sig_db

can be 'legacy' (for COSMIC v2 'SBS'), 'SBS', 'DBS', 'ID' and 'TSB' (for SBS transcriptional strand bias signatures). Default 'legacy'.

db_type

only used when sig_db is enabled. "" for keeping default, "human-exome" for transforming to exome frequency of component, and "human-genome" for transforming to whole genome frequency of component. Currently only works for 'SBS'.

show_index

if TRUE, show valid indices.

method

method to solve the minimization problem. 'LS' for least squares; 'QP' for quadratic programming; 'SA' for simulated annealing.

type

'absolute' for signature exposure and 'relative' for signature relative exposure.
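
The relationship between the two types can be sketched in base R, assuming that relative exposure means each sample's exposures divided by that sample's total. The matrix values and names are illustrative only.

```r
# Toy absolute exposures: 2 signatures x 4 samples
H_abs <- matrix(
  c(2, 5, 3, 6, 1, 9, 1, 2),
  ncol = 4,
  dimnames = list(c("sig1", "sig2"), paste0("samp", 1:4))
)

# Relative exposures: divide every column (sample) by its total
H_rel <- sweep(H_abs, 2, colSums(H_abs), "/")

H_rel
colSums(H_rel)  # each sample's relative exposures sum to 1
```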

return_class

string, 'matrix' or 'data.table'.

return_error

if TRUE, also return method error (Frobenius norm). NOTE: it is better to obtain the error when the type is 'absolute', because the error is affected by relative exposure accuracy.

rel_threshold

numeric vector; a relative exposure lower than this value will be set to 0. Note that this differs slightly from the parameter of the same name in get_sig_exposure.
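
The thresholding idea can be illustrated with a minimal base R sketch. It only shows values below the cutoff being zeroed; whether and how sig_fit rescales the remaining exposures afterwards is not stated here, so the sketch does not rescale. All names and values are made up for illustration.

```r
# Toy relative exposures: 2 signatures x 2 samples
H_rel <- matrix(
  c(0.05, 0.95, 0.40, 0.60),
  ncol = 2,
  dimnames = list(c("sig1", "sig2"), c("samp1", "samp2"))
)

# Illustrative cutoff, standing in for the rel_threshold argument
rel_threshold <- 0.1

# Zero out relative exposures below the cutoff
H_thr <- H_rel
H_thr[H_thr < rel_threshold] <- 0
H_thr
```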

mode

signature type for plotting; currently supports 'copynumber', 'SBS', 'DBS' and 'ID'.

true_catalog

used internally by sig_fit_bootstrap; users should not set it.

...

control parameters passed to the control argument of the GenSA function when method is 'SA'.

Value

The exposure result in either matrix or data.table format. If return_error is set to TRUE, a list is returned.

Details

The method 'LS' is a modification of the LCD function from the YAPSA package. The methods 'QP' and 'SA' are modified from the SignatureEstimation package. See References for details.
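
To give a feel for the least-squares idea, here is a simplified base R sketch. It is not the YAPSA LCD function or the quadratic programming code used by sig_fit: it solves an unconstrained least-squares problem with qr.solve() and then sets any negative exposures to 0, which is a cruder stand-in for a properly constrained fit. The toy W, H and V are assumptions made for this sketch.

```r
# Toy signatures W (3 components x 2 signatures); columns sum to 1
W <- matrix(c(1, 2, 3, 4, 5, 6), ncol = 2)
W <- apply(W, 2, function(x) x / sum(x))

# Known exposures H (2 signatures x 4 samples) and the resulting catalogue V
H <- matrix(c(2, 5, 3, 6, 1, 9, 1, 2), ncol = 4)
V <- W %*% H

# Unconstrained least-squares solution of W %*% H_fit ~ V, one column per sample
H_fit <- qr.solve(W, V)

# Crude non-negativity step (simplification): clip negative exposures to 0
H_fit[H_fit < 0] <- 0

H_fit  # compare with H
```

Because V was generated exactly from a non-negative H, the unconstrained solution should agree with H up to numerical precision; with noisy real catalogues a constrained solver such as the 'QP' method is the more appropriate choice.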

References

Daniel Huebschmann, Zuguang Gu and Matthias Schlesner (2019). YAPSA: Yet Another Package for Signature Analysis. R package version 1.12.0.

Huang X, Wojtowicz D, Przytycka TM. Detecting presence of mutational signatures in cancer with confidence. Bioinformatics. 2018;34(2):330–337. doi:10.1093/bioinformatics/btx604

See Also

sig_extract, sig_auto_extract, sig_fit_bootstrap, sig_fit_bootstrap_batch

Examples

# NOT RUN {
library(sigminer)

# Toy signature matrix: 3 components x 2 signatures
W <- matrix(c(1, 2, 3, 4, 5, 6), ncol = 2)
colnames(W) <- c("sig1", "sig2")
W <- apply(W, 2, function(x) x / sum(x))

H <- matrix(c(2, 5, 3, 6, 1, 9, 1, 2), ncol = 4)
colnames(H) <- paste0("samp", 1:4)

V <- W %*% H
V

if (requireNamespace("quadprog", quietly = TRUE)) {
  H_infer <- sig_fit(V, W, method = "QP")
  H_infer
  H

  H_dt <- sig_fit(V, W, method = "QP", return_class = "data.table")
  H_dt

  ## Show results
  show_sig_fit(H_infer)
  show_sig_fit(H_dt)

  ## Get clusters/groups
  H_dt_rel <- sig_fit(V, W, return_class = "data.table", type = "relative")
  z <- get_groups(H_dt_rel, method = "k-means")
  show_groups(z)
}

if (requireNamespace("GenSA", quietly = TRUE)) {
  H_infer <- sig_fit(V, W, method = "SA")
  H_infer
  H

  H_dt <- sig_fit(V, W, method = "SA", return_class = "data.table")
  H_dt

  ## Modify arguments to method
  sig_fit(V, W, method = "SA", maxit = 10, temperature = 100)

  ## Show results
  show_sig_fit(H_infer)
  show_sig_fit(H_dt)
}
# }