A function to fit a generative model to a mutation dataset. It requires a gene_lengths dataframe (for examples of the correct format, see the pre-loaded datasets example_maf_data$gene_lengths and ensembl_gene_lengths) and a mutation dataset. The mutation dataset is best supplied through the 'table' argument, constructed via the function get_mutation_tables().
fit_gen_model(
gene_lengths,
matrix = NULL,
sample_list = NULL,
gene_list = NULL,
mut_types_list = NULL,
col_names = NULL,
table = NULL,
nlambda = 100,
n_folds = 10,
maxit = 1e+09,
seed_id = 1234,
progress = FALSE,
alt_model_type = NULL
)
gene_lengths: (dataframe) A table with two columns: Hugo_Symbol and max_cds, providing the lengths of the genes to be modelled.
matrix: (Matrix::sparseMatrix) A mutation matrix, such as produced by the function get_table_from_maf().
sample_list: (character) The set of samples to be modelled.
gene_list: (character) The set of genes to be modelled.
mut_types_list: (character) The set of mutation types to be modelled.
col_names: (character) The column names of the 'matrix' parameter.
table: (list) Optional parameter combining matrix, sample_list, gene_list, mut_types_list, col_names, as is produced by the function get_tables().
nlambda: (numeric) The length of the vector of penalty weights, passed to the function glmnet::glmnet().
n_folds: (numeric) The number of cross-validation folds to employ.
maxit: (numeric) Technical parameter passed to the function glmnet::glmnet().
seed_id: (numeric) Input value for the function set.seed().
progress: (logical) Show progress bars and text.
alt_model_type: (character) Used to call an alternative generative model type such as "US" (no sample-dependent parameters) or "UI" (no gene/variant-type interactions).
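As a rough illustration of the gene_lengths format described above, the following sketch builds a small dataframe with the two required columns by hand. The gene symbols and lengths are made up for the example and do not come from the package's datasets.

```r
# Hypothetical gene symbols and coding-sequence lengths, for illustration only.
gene_lengths <- data.frame(
  Hugo_Symbol = c("GENE_A", "GENE_B", "GENE_C"),
  max_cds = c(1200, 3450, 810),
  stringsAsFactors = FALSE
)

# Basic checks on the documented format: two named columns,
# numeric lengths, and one row per gene.
stopifnot(identical(colnames(gene_lengths), c("Hugo_Symbol", "max_cds")))
stopifnot(is.numeric(gene_lengths$max_cds))
stopifnot(!anyDuplicated(gene_lengths$Hugo_Symbol))

print(gene_lengths)
```

In practice the pre-loaded example_maf_data$gene_lengths or ensembl_gene_lengths would normally be used instead of a hand-built table.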
A list comprising four objects:
An object 'fit', a fitted glmnet model.
A table 'dev', giving average deviances for each regularisation penalty factor and cross-validation fold.
An integer 's_min', the index of the regularisation penalty minimising cross-validation deviance.
A list 'names', containing the sample, gene, and mutation type information of the training data.
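To make the relationship between 'dev' and 's_min' concrete, here is a minimal sketch on a toy deviance table. It assumes rows index penalty values and columns index folds. That orientation, the toy numbers, and the averaging step are assumptions for illustration, not the package's confirmed internals.

```r
# Toy deviance table: 4 penalty values (rows) by 3 folds (columns).
# Orientation and values are assumptions for illustration only.
dev <- outer(c(5, 3, 2, 4), c(1.0, 1.1, 0.9))
dimnames(dev) <- list(paste0("s", 1:4), paste0("fold", 1:3))

# Average the deviance across folds for each penalty value.
mean_dev <- rowMeans(dev)

# Index of the penalty value with the lowest average deviance.
s_min <- unname(which.min(mean_dev))
print(s_min)
```

With a real result, an index of this kind could be used to look up the corresponding penalty in the fitted model, since glmnet objects store their penalty sequence in the 'lambda' component.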
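# Fit the generative model to the pre-loaded example training tables.
example_gen_model <- fit_gen_model(example_maf_data$gene_lengths, table = example_tables$train)
# List the components of the returned object.
print(names(example_gen_model))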