get_mutation_tables

(dataframe)
A table of annotated mutations containing the columns 'Tumor_Sample_Barcode', 'Hugo_Symbol', and 'Variant_Classification'.

split (double)
A vector of three positive values with names 'train', 'val' and 'test'. Specifies the proportions into which to split the dataset.

sample_list (character)
Optional parameter specifying the set of samples to include in the mutation matrices.

gene_list (character)
Optional parameter specifying the set of genes to include in the mutation matrices.

acceptable_genes (character)
Optional parameter specifying a set of acceptable genes, for example those which are in an Ensembl database.

for_biomarker (character)
Used for defining a dictionary of mutations. See the function get_mutation_dictionary() for details.

include_synonymous (logical)
Optional parameter specifying whether to include synonymous mutations in the mutation matrices.

dictionary (character)
Optional parameter directly specifying the mutation dictionary to use. See the function get_mutation_dictionary() for details.

seed_id (numeric)
Input value for the function set.seed().
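
To illustrate how the split proportions and the seed could interact, here is a minimal base R sketch. The helper split_samples() is hypothetical and is not part of the package; the default proportions, the rounding rule and the sample names are assumptions chosen for illustration, and the package's own splitting logic may differ.

```r
# Hypothetical helper (not from the package): assign sample IDs to
# 'train', 'val' and 'test' according to named proportions.
split_samples <- function(samples,
                          split = c(train = 0.7, val = 0.15, test = 0.15),
                          seed_id = 1234) {
  stopifnot(all(split > 0),
            setequal(names(split), c("train", "val", "test")))
  split <- split / sum(split)   # normalise so the proportions sum to 1
  set.seed(seed_id)             # fix the random shuffle for reproducibility
  shuffled <- sample(samples)   # random permutation of the sample IDs
  n <- length(shuffled)
  n_train <- round(n * split[["train"]])
  n_val <- round(n * split[["val"]])
  n_test <- n - n_train - n_val # the remainder goes to the test set
  list(
    train = shuffled[seq_len(n_train)],
    val   = shuffled[n_train + seq_len(n_val)],
    test  = shuffled[n_train + n_val + seq_len(n_test)]
  )
}

# Made-up sample names, for illustration only.
all_samples <- paste0("SAMPLE_", 1:20)
parts <- split_samples(all_samples)
lengths(parts)
```

Calling the helper twice with the same seed_id returns the same partition, which is the purpose of passing a seed.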

This function allows for i) separation of a mutation dataset into training, validation and testing components, and ii) conversion from annotated mutation format to sparse mutation matrices, as described in the function get_table_from_maf().
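
The conversion from an annotated table to a sparse matrix can be sketched as follows, using the Matrix package. This is an illustrative assumption rather than the implementation in get_table_from_maf(): the toy data are made up, and the layout (one row per sample, one column per gene, entries counting mutations) is chosen for simplicity. The package's own matrices may be organised differently, for example by mutation type.

```r
library(Matrix)

# Toy annotated mutation table with the three documented columns.
maf <- data.frame(
  Tumor_Sample_Barcode = c("S1", "S1", "S2", "S3", "S3"),
  Hugo_Symbol = c("TP53", "KRAS", "TP53", "EGFR", "EGFR"),
  Variant_Classification = c("Missense_Mutation", "Silent",
                             "Nonsense_Mutation", "Missense_Mutation",
                             "Silent"),
  stringsAsFactors = FALSE
)

samples <- sort(unique(maf$Tumor_Sample_Barcode))
genes <- sort(unique(maf$Hugo_Symbol))

# Each row of the table contributes 1 to its (sample, gene) cell.
# sparseMatrix() adds up the values of repeated (i, j) pairs, so every
# entry ends up as a mutation count.
counts <- sparseMatrix(
  i = match(maf$Tumor_Sample_Barcode, samples),
  j = match(maf$Hugo_Symbol, genes),
  x = rep(1, nrow(maf)),
  dims = c(length(samples), length(genes)),
  dimnames = list(samples, genes)
)

counts
```

Only the non-zero cells are stored, which matters for real mutation data where most sample and gene pairs carry no mutation.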

Implementation of the methodology proposed in 'Data-driven design of targeted gene panels for estimating immunotherapy biomarkers', Bradley and Cannings (2021) <arXiv:2102.04296>. This package allows the user to fit generative models of mutation from an annotated mutation dataset, and then to produce tunable linear estimators of exome-wide biomarkers. It also contains functions to simulate mutation annotation format (MAF) data, as well as to analyse the output and performance of models.

get_mutation_tables: Produce Training, Validation and Test Matrices

Description

Usage

Arguments

Value

Examples