A parallel version of the function TcGSA.LR, to be used on a
cluster of computing processors. This function computes the likelihood
ratios for the gene sets under scrutiny, as well as estimates of gene
dynamics inside those gene sets, through mixed models.
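The idea of a likelihood ratio between a null and an alternative model can be sketched in base R. This is an illustration only: ordinary linear models stand in for the mixed models that TcGSA fits, and the simulated data and the names `time_point`, `expression`, `fit_h0`, `fit_h1` and `lr_stat` are assumptions made for the example, not part of the package.

```r
# Illustration only: ordinary linear models stand in for the mixed models.
set.seed(1)
time_point <- rep(1:4, times = 5)            # hypothetical: 4 time points, 5 subjects
expression <- 0.5 * time_point + rnorm(20)   # simulated expression with a linear time trend

fit_h0 <- lm(expression ~ 1)            # null hypothesis: no time trend
fit_h1 <- lm(expression ~ time_point)   # alternative hypothesis: linear time trend

# Likelihood ratio statistic: twice the gain in log-likelihood from H0 to H1
lr_stat <- 2 * (as.numeric(logLik(fit_h1)) - as.numeric(logLik(fit_h0)))
lr_stat
```

Because the two models are nested, the statistic is non-negative; larger values indicate stronger evidence for a time trend.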
TcGSA.LR.parallel(Ncpus, type_connec, expr, gmt, design, subject_name = "Patient_ID", time_name = "TimePoint", crossedRandom = FALSE, covariates_fixed = "", time_covariates = "", time_func = "linear", group_name = "", separateSubjects = FALSE, minGSsize = 10, maxGSsize = 500, monitorfile = "")
type_connec: the type of connection between the processors, among "SOCK", "PVM", "MPI", and
"NWS". See also makeCluster.
gmt: a gmt object. See GSA.read.gmt and the definition on
www.broadinstitute.org.
design: the design matrix, containing the variables named by subject_name, time_name, and,
if applicable, covariates_fixed and time_covariates. Its dimensions are $p \times m$,
and its rows are in the same order as the columns of expr.
subject_name: the name of the variable from design that contains the information on
the repetition units used in the mixed model, such as the patient identifiers.
Default is 'Patient_ID'. See Details.
time_name: the name of the variable from design that contains
the information on the time replicates (the time points at which gene
expression was measured). Default is 'TimePoint'. See Details.
crossedRandom: default is FALSE. See Details.
covariates_fixed: the variables from the design
matrix that should appear as fixed effects in the model. See Details.
Default is "", which corresponds to no covariates in the model.
time_covariates: the name of the variable from design that contains
the information on the time replicates (the time points at which gene
expression was measured). Default is "". See Details.
time_func: either "linear",
"cubic", "splines", an expression specified by the user, or the column name of
a factor variable from design. If specified by the user,
it must be an expression using only names of variables from the design matrix
and only the three following operators: +, *, /.
The "splines" form corresponds to natural cubic B-splines
(see also ns). If there are only a few time points,
a "linear" form should be sufficient. Otherwise, the "cubic" form is
more parsimonious than the "splines" form, and should be sufficiently flexible.
If the column name of a factor variable from design is supplied,
then time is considered as discrete in the analysis.
If the user specifies a formula using column names from design, both factor and numeric
variables can be used.
group_name: a variable from the design matrix. It indicates which treatment group each sample
belongs to. Default is "", which means that there is only one
treatment group. See Details.
separateSubjects: default is FALSE. See Details.
minGSsize: the minimum gene set size. Default is 10 genes.
maxGSsize: the maximum gene set size. Default is 500 genes.
monitorfile: default is "", which means no monitoring. See Details.
TcGSA.LR.parallel returns a tcgsa object, which is a list with
the following elements:
LR: the likelihood ratio between the model under the
null hypothesis and the model under the alternative hypothesis.
CVG_H0: convergence status of the model under the null hypothesis.
CVG_H1: convergence status of the model under the alternative
hypothesis.
time_func: a character string passing along the value of the
time_func argument used in the call.
GeneSets_gmt: a gmt object passing along the value of the
gmt argument used in the call.
group.var: a factor passing along the group_name variable
from the design matrix.
separateSubjects: a logical flag passing along the value of the
separateSubjects argument used in the call.
Estimations: a list of 3-dimensional arrays. Each element of the
list (i.e. each array) holds the estimated gene expression
dynamics for one of the gene sets under scrutiny (obtained from mixed
models). The first dimension of those arrays is the genes included in the
gene set concerned, the second dimension is the Patient_ID, and the
third dimension is the TimePoint. The values inside those arrays are
estimated gene expressions.
time_DF: the degrees of freedom of the natural splines functions.
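The layout of the Estimations element described above (genes by Patient_ID by TimePoint) can be explored with standard array indexing. The sketch below builds a mock array with that layout rather than calling the package; the gene, patient and gene set names (`geneA`, `P1`, `GS1`, ...) and the object `mock_estimations` are hypothetical, and whether the real list is named by gene set is an assumption.

```r
# Mock data only: mimics the documented layout (genes x Patient_ID x TimePoint).
genes    <- c("geneA", "geneB", "geneC")   # hypothetical gene names
patients <- c("P1", "P2")                  # hypothetical Patient_ID values
times    <- c("0", "1", "2", "3")          # hypothetical TimePoint values

one_gene_set <- array(
  seq_len(length(genes) * length(patients) * length(times)),
  dim = c(length(genes), length(patients), length(times)),
  dimnames = list(genes, patients, times)
)
mock_estimations <- list(GS1 = one_gene_set)   # one array per gene set

# Estimated trajectory of one gene for one patient across all time points
traj <- mock_estimations[["GS1"]]["geneB", "P1", ]

# Average over genes: one patient-by-time-point matrix for the gene set
mean_by_patient_time <- apply(mock_estimations[["GS1"]], c(2, 3), mean)
```

Indexing by the first dimension selects genes, by the second selects subjects, and by the third selects time points, so slices of this kind are a natural starting point for plotting estimated dynamics.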
If separateSubjects is TRUE, instead of identifying gene sets that
have a significant trend over time (possibly with probe heterogeneity of
this trend), TcGSA identifies gene sets that have significantly
different trends over time depending on the patient.
If the monitorfile argument is a character string naming a file to
write into and that file does not exist yet, it will be created.
A line is written each time one of the gene sets under
scrutiny has been analysed (i.e. the two mixed models have been fitted, see
TcGSA.LR) by one of the parallelized processors.
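The monitoring mechanism described above, one line appended per analysed gene set, can be mimicked in base R as follows. This is a sketch of the general pattern, not the package's internal code; the file path, the gene set names and the text written on each line are assumptions.

```r
# Sketch only: not the package's internal code.
monitorfile <- tempfile(fileext = ".txt")   # hypothetical monitoring file; created on first write
gene_sets <- c("GS1", "GS2", "GS3")         # hypothetical gene set names

for (gs in gene_sets) {
  # ... the two mixed models for this gene set would be fitted here ...
  cat("done:", gs, "\n", file = monitorfile, append = TRUE)
}

# Progress can be checked by counting the lines written so far
n_done <- length(readLines(monitorfile))
n_done
```

Counting the lines of the file from a separate R session gives a rough measure of how many gene sets have been processed while the parallel job is still running.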
See also summary.TcGSA, plot.TcGSA
data(data_simu_TcGSA)
tcgsa_sim_1grp <- TcGSA.LR(expr=expr_1grp, gmt=gmt_sim, design=design,
                           subject_name="Patient_ID", time_name="TimePoint",
                           time_func="linear", crossedRandom=FALSE)
## Not run:
# require(doParallel)
# tcgsa_sim_1grp <- TcGSA.LR.parallel(Ncpus = 2, type_connec = 'SOCK',
#                                     expr=expr_1grp, gmt=gmt_sim, design=design,
#                                     subject_name="Patient_ID", time_name="TimePoint",
#                                     time_func="linear", crossedRandom=FALSE,
#                                     separateSubjects=TRUE)
# ## End(Not run)
tcgsa_sim_1grp
summary(tcgsa_sim_1grp)