Given a plmDE object containing preprocessed/normalized measures of the expression of a set of genes under different conditions, as well as related values of quantitatively measured covariates of interest, fitGAPLM tests each gene for differential expression under a model specified by the user. The test is based on the significance of a full model fit to the expression data when compared with the fit of a reduced model (F statistic). The variables of interest should be present in the full model and absent from the reduced model. This method is very flexible and can fit count data (e.g. expression measures from high-throughput sequencing) as well as microarray data. Using fitGAPLM, the user can model the gene expression measures by any mixture of additive functions of the numerical variables and linear terms of the available factor information. Each of these functions is approximated through a B-spline fit with the intercept of the spline constrained to zero for identifiability. Although fitGAPLM takes a daunting number of inputs, many of them are already set to sensible defaults. Even so, models of this complexity must be well thought out, and each parameter requires careful consideration.
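The full-versus-reduced comparison described above can be illustrated for a single gene with base R. The following is a minimal sketch, not the fitGAPLM implementation: it assumes simulated data, uses `splines::bs` with `intercept = FALSE` (a basis that evaluates to zero at the lower boundary knot) as a stand-in for the constrained B-spline, and compares nested `lm` fits with `anova`. All variable names are hypothetical.

```r
library(splines)  # bs() builds a B-spline basis

set.seed(1)
n <- 100
group    <- factor(rep(c("Control", "Diseased"), each = n / 2))
weight   <- abs(rnorm(n, 50, 20))
severity <- c(rep(0, n / 2), abs(rnorm(n / 2, 100, 20)))

# simulated expression of one gene: linear in weight, nonlinear in severity
expression <- 1 + 0.01 * weight + sin(severity / 40) + rnorm(n, sd = 0.3)

# reduced model: a cubic spline in the nuisance covariate only
reduced <- lm(expression ~ bs(weight, degree = 3))

# full model: adds the group indicator and a cubic spline in severity
full <- lm(expression ~ group + bs(weight, degree = 3) + bs(severity, degree = 3))

# F test of the terms present in the full model but absent from the reduced one
comparison <- anova(reduced, full)
pValue <- comparison[["Pr(>F)"]][2]
```

A small `pValue` indicates that the group indicator and the severity spline jointly improve the fit for this gene.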
fitGAPLM(dataObject, generalizedLM = FALSE, family = poisson(link = log),
  NegativeBinomialUnknownDispersion = FALSE, test = "LRT", weights = NULL,
  offset = NULL, pValueAdjustment = "fdr", significanceLevel = 0.05,
  indicators.fullModel = as.character(unique(dataObject$sampleInfo[,2])[-1]),
  continuousCovariates.fullModel = NULL,
  groups.fullModel = as.character(unique(dataObject$sampleInfo[,2])[-1]),
  groupFunction.fullModel = rep("AdditiveSpline", length(groups.fullModel)),
  fitSplineFromData.fullModel = TRUE,
  splineDegrees.fullModel = rep(3, length(groups.fullModel)),
  splineKnots.fullModel = rep(0, length(groups.reducedModel)),
  compareToReducedModel = FALSE,
  indicators.reducedModel = as.character(unique(dataObject$sampleInfo[,2])[-1]),
  continuousCovariates.reducedModel = NULL,
  groups.reducedModel = as.character(unique(dataObject$sampleInfo[,2])[-1]),
  groupFunction.reducedModel = rep("AdditiveSpline", length(groups.reducedModel)),
  fitSplineFromData.reducedModel = TRUE,
  splineDegrees.reducedModel = rep(3, length(groups.reducedModel)),
  splineKnots.reducedModel = rep(0, length(groups.reducedModel)),
  splineKnotSpread = "quantile")

dataObject: An object of type plmDE containing the gene expression and sample information.
generalizedLM: If TRUE, a link function is introduced to generalize the linear model. Use for gene-level count data.
family: The error distribution and link function, as in glm. For gene-level count data, the negative binomial (see negative.binomial) is recommended to account for overdispersion.
NegativeBinomialUnknownDispersion: If TRUE, then glm.nb from the MASS package is called, which includes routines for fitting the GLM and estimating the dispersion parameter.
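test: The test used to compare the models; see stat.anova for details.
weights: NULL or a numeric vector.
offset: Optionally, offset terms may be included in the model.
pValueAdjustment: The method used to adjust the p-values for multiple testing; see p.adjust.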
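indicators.fullModel: The groups for which indicator terms are included in the full model, named as in dataObject. Under the default setting, the indicators will consist of all groups except for the first one (used as the baseline for comparison).
continuousCovariates.fullModel: The continuous covariates to include in the full model, named as in dataObject.
groups.fullModel: The groups for which a function is fitted relating the continuousCovariates to their expression levels in dataObject.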
groupFunction.fullModel: A vector of the same length as groups.fullModel, consisting of strings matching "AdditiveSpline", "AdditiveLinear", "CommonSpline", or "CommonLinear". If "AdditiveSpline" is chosen, then a B-spline basis is fitted to the continuousCovariate values of the corresponding group in groups.fullModel to estimate a function that represents the effect of this group's continuousCovariate values on their measured expression levels. This function implicitly includes an indicator term, so it evaluates to 0 for the measurements of continuousCovariate from other groups, and its overall effects are assumed to be additive with respect to the other parameters being estimated. If "AdditiveLinear" is selected, then this function is taken to be the identity function (no spline basis fit) times a parameter to be fit by the model. To estimate one function that accounts for the same effect across multiple groups, all of those groups must be listed in groups.fullModel and their corresponding entries in groupFunction.fullModel must be set to "CommonSpline". Likewise, to assume a linear effect across multiple groups, they must all be listed in groups.fullModel and the corresponding entries of groupFunction.fullModel must read "CommonLinear".
fitSplineFromData.fullModel: Should the parameters of each spline be chosen from the data using fitBspline?
splineDegrees.fullModel: If fitSplineFromData.fullModel has not been selected, then the user may specify, as a vector, the degree of each B-spline basis that is fitted to the groups.
splineKnots.fullModel: If fitSplineFromData.fullModel has not been selected, then the user may also specify, as a vector, the number of knots to include in each corresponding basis.
compareToReducedModel: If TRUE, then the user must specify a reduced model that the full model should be tested against. Otherwise, all terms (besides the intercept) of the full model are simultaneously tested for significance.
splineKnotSpread: How the knots are spread over the covariate values; see the fitBspline method.
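For count data, the roles of family, offset, test and NegativeBinomialUnknownDispersion can be illustrated on a single gene with `glm` and `MASS::glm.nb` directly. This is a hedged sketch under assumed, simulated data and does not reproduce how fitGAPLM handles these arguments internally; `librarySize` and the other variable names are hypothetical.

```r
library(MASS)  # provides glm.nb for negative binomial GLMs

set.seed(2)
n <- 100
group <- factor(rep(c("Control", "Diseased"), each = n / 2))
librarySize <- runif(n, 0.5, 2)  # hypothetical per-sample scaling factor

# overdispersed counts for one gene; the mean doubles in the diseased group
meanCount <- 50 * librarySize * ifelse(group == "Diseased", 2, 1)
counts <- rnbinom(n, mu = meanCount, size = 5)

# Poisson GLM with a log link; the log scaling factor enters as an offset
fullPoisson    <- glm(counts ~ group, family = poisson(link = log),
                      offset = log(librarySize))
reducedPoisson <- glm(counts ~ 1, family = poisson(link = log),
                      offset = log(librarySize))
lrtPoisson <- anova(reducedPoisson, fullPoisson, test = "LRT")
pPoisson <- lrtPoisson[["Pr(>Chi)"]][2]

# negative binomial GLM; the dispersion (theta) is estimated from the data
fullNB    <- glm.nb(counts ~ group + offset(log(librarySize)))
reducedNB <- glm.nb(counts ~ 1 + offset(log(librarySize)))

# likelihood ratio test computed by hand from the two log-likelihoods
lrStatistic <- as.numeric(2 * (logLik(fullNB) - logLik(reducedNB)))
pNB <- pchisq(lrStatistic, df = 1, lower.tail = FALSE)
```

Because the simulated counts are overdispersed, the Poisson fit understates the variance, which is the situation in which the negative binomial family is recommended above.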
Returns an object of type DEresults containing various information about the analysis.
See also limmaPLM for analysis of microarray data.
See also fitBspline for the default spline-fitting heuristic.
## load the package (assumed here to be installed under the name plmDE):
library(plmDE)

## create an object of type plmDE containing the groups
## "Control" and "Diseased" and measures of weight and severity:
ExpressionData = as.data.frame(matrix(abs(rnorm(10000, 1, 1.5)), ncol = 100))
names(ExpressionData) = paste("Sample", 1:100)
Genes = paste("Gene", 1:100)
DataInfo = data.frame(sample = names(ExpressionData),
  group = c(rep("Control", 50), rep("Diseased", 50)),
  weight = abs(rnorm(100, 50, 20)),
  severity = c(rep(0, 50), abs(rnorm(50, 100, 20))))
plmDEobject = plmDEmodel(Genes, ExpressionData, DataInfo)

## test whether severity and the indicator variable
## for disease are simultaneously significant:
test = fitGAPLM(plmDEobject,
  continuousCovariates.fullModel = c("weight", "severity"),
  compareToReducedModel = TRUE,
  indicators.reducedModel = NULL,
  continuousCovariates.reducedModel = "weight")
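For readers without plmDE at hand, the gene-wise testing and p-value adjustment steps can be sketched in base R. This is only an illustration under stated assumptions: it uses linear terms rather than splines for brevity, simulates data of the same shape as the example above, and is not a substitute for fitGAPLM. The 0.05 cutoff and the "fdr" method mirror the documented defaults of significanceLevel and pValueAdjustment.

```r
set.seed(3)
nGenes <- 100
nSamples <- 100

# simulated expression matrix: one row per gene, one column per sample
expressionMatrix <- matrix(abs(rnorm(nGenes * nSamples, 1, 1.5)), nrow = nGenes)
group    <- factor(rep(c("Control", "Diseased"), each = nSamples / 2))
weight   <- abs(rnorm(nSamples, 50, 20))
severity <- c(rep(0, nSamples / 2), abs(rnorm(nSamples / 2, 100, 20)))

# for every gene, compare a full and a reduced linear model with an F test
pValues <- apply(expressionMatrix, 1, function(y) {
  reduced <- lm(y ~ weight)
  full    <- lm(y ~ group + weight + severity)
  anova(reduced, full)[["Pr(>F)"]][2]
})

# adjust for multiple testing and flag genes below the significance level
adjustedPValues <- p.adjust(pValues, method = "fdr")
significantGenes <- which(adjustedPValues < 0.05)
```

Since the simulated expression values are unrelated to the covariates, few or no genes are expected to be flagged here.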