GSSTDA (version 0.1.3)

gene_selection: gene_selection

Description

Private function for gene selection.

Usage

gene_selection(
  full_data,
  survival_time,
  survival_event,
  control_tag_cases,
  gen_select_type,
  num_gen_select
)

Value

A geneSelection_object. It contains:

  • the full_data without NaN values (data)

  • the cox_all_matrix (a matrix with the results of fitting proportional hazards models: the regression coefficients, the odds ratios, the standard errors of each coefficient, the Z values (coef/se_coef) and the p-values for each Z value)

  • a vector with the names of the selected genes

  • the matrix of disease components with only the rows of the selected genes (genes_disease_component)

  • and the vector of the values of the filter function.
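
To make the layout of cox_all_matrix concrete, the following is a minimal sketch of fitting one univariate Cox model per gene with the survival package and stacking the coefficient rows. It is not the package's internal code: the simulated data, the object names (expr, cox_matrix) and the column names other than coef and se_coef are assumptions made for illustration.

```r
library(survival)  # recommended package shipped with standard R installations

set.seed(42)
n_genes <- 4
n_patients <- 30

# Hypothetical expression matrix: genes in rows, patients in columns.
expr <- matrix(rnorm(n_genes * n_patients), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)), NULL))
time  <- rexp(n_patients, rate = 0.1)   # simulated follow-up times
event <- rbinom(n_patients, 1, 0.7)     # 1 = died, 0 = censored

# Fit one univariate Cox model per gene and keep its coefficient row.
cox_rows <- lapply(seq_len(n_genes), function(i) {
  d <- data.frame(time = time, event = event, x = expr[i, ])
  fit <- coxph(Surv(time, event) ~ x, data = d)
  summary(fit)$coefficients[1, ]
})

cox_matrix <- do.call(rbind, cox_rows)
rownames(cox_matrix) <- rownames(expr)
# Assumed column names; summary.coxph reports coef, exp(coef), se(coef), z, p.
colnames(cox_matrix) <- c("coef", "exp_coef", "se_coef", "z", "p_value")

print(cox_matrix)
```

Each row holds the five quantities listed above for one gene, with the Z value equal to coef divided by se_coef.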

Arguments

full_data

Input matrix whose columns correspond to the patients and rows to the genes.

survival_time

Numerical vector of the same length as the number of columns of full_data. Patients must be in the same order as in full_data. For patients with a tumour sample, give the time between disease diagnosis and death, or until the end of follow-up if the patient has not died. Healthy patients must have an NA value.

survival_event

Numerical vector of the same length as the number of columns of full_data. Patients must be in the same order as in full_data. For patients with a tumour sample, indicate whether the patient has died (1) or not (0); only these two values are valid. Healthy patients must have an NA value.

control_tag_cases

Character vector of the same length as the number of columns of full_data. Patients must be in the same order as in full_data. It indicates, for each patient, whether the sample is healthy or tumourous: one value marks healthy patients and another marks tumour samples, so the vector may contain only two distinct values. The user will then be asked which of the two values marks healthy patients.
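
As an illustration of the constraints on these arguments, here is a small sketch that builds toy inputs and checks them. All values are made up; the tag "NT" for healthy samples is taken from the example at the end of this page, while "T" for tumour samples and the variable name case_tag_toy are assumptions.

```r
set.seed(1)
n_genes <- 5
n_patients <- 6

# Genes in rows, patients in columns.
full_data <- matrix(rnorm(n_genes * n_patients), nrow = n_genes,
                    dimnames = list(paste0("gene", seq_len(n_genes)),
                                    paste0("patient", seq_len(n_patients))))

# One tag per patient, in column order: "NT" = healthy, "T" = tumour (assumed).
case_tag_toy <- c("NT", "NT", "T", "T", "T", "T")

# Healthy patients carry NA in both survival vectors.
survival_time  <- c(NA, NA, 12.5, 30.0, 8.2, 45.0)
survival_event <- c(NA, NA, 1, 0, 1, 0)

# Checks mirroring the argument descriptions.
stopifnot(length(survival_time)  == ncol(full_data),
          length(survival_event) == ncol(full_data),
          length(case_tag_toy)   == ncol(full_data))
stopifnot(length(unique(case_tag_toy)) == 2)
stopifnot(all(is.na(survival_time[case_tag_toy == "NT"])),
          all(is.na(survival_event[case_tag_toy == "NT"])))
stopifnot(all(survival_event[case_tag_toy == "T"] %in% c(0, 1)))
```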

gen_select_type

Option that controls how the genes to be used in the mapper are selected. Choose "Abs" to select the genes with the highest absolute value, or "Top_Bot" to select half of the genes from those with the highest value (positive value, i.e. worst survival prognosis) and the other half from those with the lowest value (negative value, i.e. best prognosis). "Top_Bot" is the default option.
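
The difference between the two options can be sketched on a toy vector of per-gene scores. This is only an illustration of the selection rules as described above, not the package's implementation; the scores, the gene names and the assumption that num_gen_select is even are all hypothetical.

```r
# Hypothetical per-gene scores: positive = worse prognosis, negative = better.
scores <- c(g1 = 2.5, g2 = -3.1, g3 = 0.4, g4 = -0.2, g5 = 1.8, g6 = 2.1)
num_gen_select <- 4  # assumed even so it splits into two halves

# "Abs": the genes with the largest absolute score.
abs_sel <- names(sort(abs(scores), decreasing = TRUE))[seq_len(num_gen_select)]

# "Top_Bot": half with the highest scores, half with the lowest scores.
half <- num_gen_select %/% 2
top <- names(sort(scores, decreasing = TRUE))[seq_len(half)]
bot <- names(sort(scores, decreasing = FALSE))[seq_len(half)]
top_bot_sel <- c(top, bot)

print(abs_sel)      # "g2" "g1" "g6" "g5"
print(top_bot_sel)  # "g1" "g6" "g2" "g4"
```

With these scores, "Abs" keeps g5 because its magnitude is large, whereas "Top_Bot" keeps g4 because it is the second-lowest score even though its magnitude is small.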

num_gen_select

Number of genes to select for use in the mapper.

Examples

gen_select_type <- "Top_Bot"
percent_gen_select <- 10
control_tag_cases <- which(case_tag == "NT")
geneSelection_obj <- gene_selection(full_data, survival_time, survival_event,
                                    control_tag_cases,
                                    gen_select_type = "top_bot", num_gen_select = 10)