whichSignatures: Which signatures are present

Description

Determines how much of each signature is present in the given sample.

Usage

whichSignatures(tumor.ref = NA, sample.id, signatures.ref = signatures.nature2013, associated = c(), signatures.limit = NA, signature.cutoff = 0.06, contexts.needed = FALSE, tri.counts.method = "default")

Arguments

tumor.ref

Either a data frame or location of input text file, where rows are samples, columns are trinucleotide contexts

sample.id

Name of the sample -- must be a row name of tumor.ref. Optional if tumor.ref contains a single sample

signatures.ref

Either a data frame or location of signature text file, where rows are signatures, columns are trinucleotide contexts

associated

Vector of associated signatures. If given, will narrow the signatures tested to only the ones listed.

signatures.limit

Number of signatures to limit the search to

signature.cutoff

Discard any signature contributions with a weight less than this amount

contexts.needed

FALSE if tumor.ref is a context file, TRUE if it contains only mutation counts

tri.counts.method

Set to either:

'default' -- no further normalization
'exome' -- normalized by number of times each trinucleotide context is observed in the exome
'genome' -- normalized by number of times each trinucleotide context is observed in the genome
'exome2genome' -- multiplied by the ratio of that trinucleotide's occurrence in the genome to its occurrence in the exome
'genome2exome' -- multiplied by the ratio of that trinucleotide's occurrence in the exome to its occurrence in the genome
data frame containing user-defined scaling factors -- count data for each trinucleotide context is multiplied by the corresponding value given in the data frame
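
To illustrate the signature.cutoff argument described above, here is a minimal base R sketch of discarding small contributions. The signature names and weight values are made up for illustration, and setting discarded weights to zero is an assumption about how "discard" is realized, not the package's confirmed internals.

```r
# Hypothetical weights for four signatures (illustrative values only)
weights <- c(Signature.1 = 0.55, Signature.2 = 0.03,
             Signature.5 = 0.38, Signature.13 = 0.04)
signature.cutoff <- 0.06

# Assumed behaviour: contributions below the cutoff are set to zero
kept <- weights
kept[kept < signature.cutoff] <- 0

print(kept)
```

With these values, only Signature.1 and Signature.5 keep a non-zero weight.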

Value

A list containing the weight of each signature, the product of those weights multiplied by the signatures, the difference between the tumor sample and that product, the trinucleotide context distribution of the tumor sample as given, and the unknown weight.
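
The relationship between these quantities can be sketched in base R with toy data. The matrices below are hypothetical (two signatures, three contexts), the variable names are illustrative, and treating the unknown weight as one minus the sum of the assigned weights is an assumption rather than something stated on this page.

```r
# Hypothetical reference: 2 signatures (rows) x 3 contexts (columns); rows sum to 1
signatures <- matrix(c(0.5, 0.3, 0.2,
                       0.1, 0.1, 0.8),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(c("SigA", "SigB"),
                                     c("ctx1", "ctx2", "ctx3")))

# Hypothetical tumor profile (fractions per context) and fitted weights
tumor <- matrix(c(0.30, 0.20, 0.50), nrow = 1,
                dimnames = list("sample1", colnames(signatures)))
weights <- matrix(c(0.5, 0.4), nrow = 1,
                  dimnames = list("sample1", rownames(signatures)))

product    <- weights %*% signatures  # profile reconstructed from the weights
difference <- tumor - product         # part of the tumor profile left unexplained
unknown    <- 1 - sum(weights)        # assumed: weight not assigned to any signature
```

A small difference in every context suggests the chosen signatures reconstruct the sample well; a large unknown weight suggests they do not account for much of it.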

Normalization

If the input data frame contains only the counts of the mutations observed in each context, the data frame must be normalized. In that case, set `contexts.needed` to TRUE. The `tri.counts.method` parameter determines any additional normalization performed. Any user-provided data frame should match the format of `tri.counts.exome` and `tri.counts.genome`. The normalization method chosen should match how the input signatures were normalized. For exome data, the 'exome2genome' method is appropriate for the signatures included in this package. For whole genome data, use the 'default' method to obtain consistent results.
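
As a rough illustration of the 'exome2genome' scaling, the base R sketch below uses made-up counts and made-up trinucleotide occurrence numbers for three contexts. It assumes the scaled counts are afterwards converted to fractions that sum to 1; it is not the package's internal code.

```r
# Hypothetical mutation counts for one sample over three contexts
counts <- c(ctx1 = 10, ctx2 = 30, ctx3 = 60)

# Hypothetical number of times each context occurs in the exome and the genome
tri.exome  <- c(ctx1 = 100,  ctx2 = 200,  ctx3 = 400)
tri.genome <- c(ctx1 = 1000, ctx2 = 4000, ctx3 = 4000)

# 'default': convert counts to fractions with no further scaling
frac.default <- counts / sum(counts)

# 'exome2genome': scale each count by the genome/exome occurrence ratio,
# then convert to fractions (assumed renormalization step)
scaled <- counts * tri.genome / tri.exome
frac.exome2genome <- scaled / sum(scaled)

print(rbind(frac.default, frac.exome2genome))
```

Here ctx2 gains relative weight under 'exome2genome' because it is comparatively rarer in the exome than in the genome.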

Examples


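# Assumes the deconstructSigs package, which provides whichSignatures and the example data
library(deconstructSigs)

test <- whichSignatures(tumor.ref = randomly.generated.tumors,
                        sample.id = "2",
                        contexts.needed = FALSE)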
