lmFit(object, design=NULL, ndups=1, spacing=1, block=NULL, correlation, weights=NULL, method="ls", ...)
getEAWP
is acceptable.spacing=1
for consecutive rows.NULL
if ndups>2
.ncol
of the expression matrix, or a numeric vector of gene weights with length equal to nrow
of the expression matrix."ls"
for least squares or "robust"
for robust regressionlm.series
, gls.series
or mrlm
MArrayLM
object containing the result of the fits.The rownames of object
are preserved in the fit object and can be retrieved by rownames(fit)
where fit
is output from lmFit
.
The column names of design
are preserved as column names and can be retrieved by colnames(fit)
.
lmscFit
.)
The coefficients of the fitted models describe the differences between the RNA sources hybridized to the arrays.
The probe-wise fitted model results are stored in a compact form suitable for further processing by other functions in the limma package.The function allows for missing values and accepts quantitative weights through the weights
argument.
It also supports two different correlation structures.
If block
is not NULL
then different arrays are assumed to be correlated.
If block
is NULL
and ndups
is greater than one then replicate spots on the same array are assumed to be correlated.
It is not possible at this time to fit models with both a block structure and a duplicate-spot correlation structure simultaneously.
If object
is a matrix then it should contain log-ratios or log-expression data with rows corresponding to probes and columns to arrays.
(A numeric vector is treated the same as a matrix with one column.)
For objects of other classes, a matrix of expression values is taken from the appropriate component or slot of the object.
If object
is of class MAList
or marrayNorm
, then the matrix of log-ratios (M-values) is extracted.
If object
is of class ExpressionSet
, then the expression matrix is extracted.
(This may contain log-expression or log-ratio values, depending on the platform.)
If object
is of class PLMset
then the matrix of chip coefficients chip.coefs
is extracted.
The arguments design
, ndups
, spacing
and weights
will be extracted from the data object
if available and do not normally need to set explicitly in the call.
On the other hand, if any of these are set in the function call then they will over-ride the slots or components in the data object
.
If object
is an PLMset
, then weights are computed as 1/pmax(object@se.chip.coefs, 1e-05)^2
.
If object
is an ExpressionSet
object, then weights are not computed.
If the argument block
is used, then it is assumed that ndups=1
.
The correlation
argument has a default value of 0.75
, but in normal use this default value should not be relied on and the correlation value should be estimated using the function duplicateCorrelation
.
The default value is likely to be too high in particular if used with the block
argument.
The actual linear model computations are done by passing the data to one the lower-level functions lm.series
, gls.series
or mrlm
.
The function mrlm
is used if method="robust"
.
If method="ls"
, then gls.series
is used if a correlation structure has been specified, i.e., if ndups>1
or block
is non-null and correlation
is different from zero.
If method="ls"
and there is no correlation structure, lm.series
is used.
lmFit
uses getEAWP
to extract expression values, gene annotation and so from the data object
.An overview of linear model functions in limma is given by 06.LinearModels.
# Simulate gene expression data for 100 probes and 6 microarrays
# Microarray are in two groups
# First two probes are differentially expressed in second group
# Std deviations vary between genes with prior df=4
sd <- 0.3*sqrt(4/rchisq(100,df=4))
y <- matrix(rnorm(100*6,sd=sd),100,6)
rownames(y) <- paste("Gene",1:100)
y[1:2,4:6] <- y[1:2,4:6] + 2
design <- cbind(Grp1=1,Grp2vs1=c(0,0,0,1,1,1))
options(digits=3)
# Ordinary fit
fit <- lmFit(y,design)
fit <- eBayes(fit)
topTable(fit,coef=2)
dim(fit)
colnames(fit)
rownames(fit)[1:10]
names(fit)
# Fold-change thresholding
fit2 <- treat(fit,lfc=0.1)
topTreat(fit2,coef=2)
# Volcano plot
volcanoplot(fit,coef=2,highlight=2)
# Mean-difference plot
plotMD(fit,column=2)
# Q-Q plot of moderated t-statistics
qqt(fit$t[,2],df=fit$df.residual+fit$df.prior)
abline(0,1)
# Various ways of writing results to file
## Not run: write.fit(fit,file="exampleresults.txt")
## Not run: write.table(fit,file="exampleresults2.txt")
# Fit with correlated arrays
# Suppose each pair of arrays is a block
block <- c(1,1,2,2,3,3)
dupcor <- duplicateCorrelation(y,design,block=block)
dupcor$consensus.correlation
fit3 <- lmFit(y,design,block=block,correlation=dupcor$consensus)
# Fit with duplicate probes
# Suppose two side-by-side duplicates of each gene
rownames(y) <- paste("Gene",rep(1:50,each=2))
dupcor <- duplicateCorrelation(y,design,ndups=2)
dupcor$consensus.correlation
fit4 <- lmFit(y,design,ndups=2,correlation=dupcor$consensus)
dim(fit4)
fit4 <- eBayes(fit4)
topTable(fit4,coef=2)
Run the code above in your browser using DataLab