Disease-Specific Genomic Analysis (DGSA). This analysis, developed by Nicolau et al., computes the "disease component" of an expression matrix: linear models are used to remove the part of the data that is considered normal or healthy, keeping only the component that is due to the disease. It is intended to precede other techniques such as classification or clustering. For more information, see "Disease-specific genomic analysis: identifying the signature of pathologic biology" (doi: 10.1093/bioinformatics/btm033).
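The idea of removing the healthy part of the data with a linear model can be illustrated with a minimal base R sketch. This is not the package's implementation: the data are simulated, all variable names are hypothetical, the fit has no intercept, and the published method may include further steps (such as denoising the normal data) that are omitted here.

```r
# Simulated toy data (assumption): 50 genes, 6 healthy samples, 4 tumour samples.
set.seed(1)
n_genes <- 50
normal  <- matrix(rnorm(n_genes * 6), nrow = n_genes)
tumour  <- matrix(rnorm(n_genes * 4), nrow = n_genes)

# Least-squares fit of every tumour column onto the span of the normal columns.
normal_qr <- qr(normal)
normal_component  <- qr.fitted(normal_qr, tumour)  # part explained by healthy tissue
disease_component <- qr.resid(normal_qr, tumour)   # residual: deviation from the normal space

# The two parts add back up to the original tumour data.
stopifnot(isTRUE(all.equal(normal_component + disease_component, tumour)))
```

In this sketch the disease component is simply the residual of each tumour sample after projection onto the normal samples, so it is orthogonal to every healthy sample by construction.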
DGSA(full_data, survival_time, survival_event, case_tag, na.rm = TRUE)
A DGSA object. It contains: the full_data without NA values, the
case_control vector without NA values, the label designated for healthy
samples (control_tag), the matrix with the normal space (the linear space
generated from normal tissue samples) and the matrix of the disease
components (the transformed full_data matrix from which the normal
component has been removed).
full_data: Input matrix whose columns correspond to the patients and rows to the genes.
survival_time: Numeric vector of the same length as the number of columns of full_data. Patients must be in the same order as in full_data. For patients with a tumour sample, give the time between disease diagnosis and death or, if the patient has not died, the end of follow-up. Healthy patients must have an NA value.
survival_event: Numeric vector of the same length as the number of columns of full_data. Patients must be in the same order as in full_data. For patients with a tumour sample, indicate whether the patient has died (1) or not (0). Only these values are valid; healthy patients must have an NA value.
case_tag: Character vector of the same length as the number of columns of full_data. Patients must be in the same order as in full_data. It must indicate, for each patient, whether he/she is healthy or not. One value should be used for healthy patients and another for patients whose sample is tumourous. The user will then be asked which value denotes healthy patients. Only two distinct values are allowed in the vector.
na.rm: Logical. If TRUE (the default), NA rows are omitted. If FALSE, an error occurs in case of NA rows.
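As a rough illustration of the input shapes described above, the following sketch builds toy inputs and checks them against the stated constraints. The data, the tags "NT" and "T", and the time values (arbitrary units) are assumptions made for the example; the checks are not part of the package.

```r
# Hypothetical toy inputs: 5 genes x 6 patients (3 healthy "NT", 3 tumour "T").
set.seed(42)
full_data <- matrix(rnorm(5 * 6), nrow = 5,
                    dimnames = list(paste0("gene", 1:5), paste0("patient", 1:6)))
case_tag       <- c("NT", "NT", "NT", "T", "T", "T")
survival_time  <- c(NA, NA, NA, 34, 12, 58)  # NA for healthy patients
survival_event <- c(NA, NA, NA, 1, 1, 0)     # 1 = died, 0 = alive at end of follow-up

# Checks mirroring the argument descriptions.
n <- ncol(full_data)
stopifnot(
  length(survival_time) == n,
  length(survival_event) == n,
  length(case_tag) == n,
  length(unique(case_tag)) == 2,
  all(survival_event %in% c(0, 1, NA)),
  all(is.na(survival_time[case_tag == "NT"])),
  all(is.na(survival_event[case_tag == "NT"]))
)

# Rows with missing values, i.e. the rows that na.rm = TRUE is described as omitting.
rows_with_na <- which(!stats::complete.cases(full_data))
```

Running such checks before the call makes ordering or length mismatches between the matrix columns and the three vectors visible early.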
DGSA_obj <- DGSA(full_data, survival_time, survival_event, case_tag)