Learn R Programming

CORM (version 1.0.2)

fit.CLMM.2: Clustering of Linear Mixed Models Method

Description

Fit a CLMM model for replicated time course data allowing for two sets of time points among biological or technical replicates. Missing value are allowed.

Usage

fit.CLMM.2(data.y1,data.x1,data.z1,data.y2,data.x2,data.z2,n.clst, n.run = 1)

Arguments

data.y1
a three dimensional array of gene expression data, data.y1[j, i, t] for gene j, sample i, and time point t. Missing values should be indicated by "NA". And at least one case not missing in each pair of observations.
data.x1
a three dimensional array of fixed effects (common for all genes), data.x1[i, t, p] for sample i, time point t, and covariate p.
data.z1
a three dimensional array of random effects (common for all genes), data.z1[i, t, q] for sample i, time point t, and covariate p.
data.y2
a three dimensional array of gene expression data, data.y2[j, i, t] for gene j, sample i, and time point t. Missing values should be indicated by "NA". And at least one case not missing in each pair of observations.
data.x2
a three dimensional array of fixed effects (common for all genes), data.x2[i, t, p] for sample i, time point t, and covariate p.
data.z2
a three dimensional array of random effects (common for all genes), data.z2[i, t, q] for sample i, time point t, and covariate p.
n.clst
an integer, number of clusters.
n.run
an integer used to get the starting value for the EM algorithm.

Value

u.hat
a matrix containing the cluster membership for the genes.
b.hat
an array containing the estimated random effects.
theta.hat
a list comprised of five components: zeta.hat, pi.hat, D.hat, sigma2.hat and llh. They are described as below:
zeta.hat
a matrix of the estimated fixed effects with one row for each cluster.
pi.hat
a vector of the relative frequency for each cluster.
D.hat
the estimated random effects variances for the clusters.
sigma2.hat
the estimated measurement error variances for the clusters.
llh
log likelihood for the model.

Details

This function implements the Clustering of Linear Mixed Models Method of Qin and Self (2006).

References

  • Li-Xuan Qin and Steven G. Self (2006). The clustering of regression models method with applications in gene expression data. Biometrics 62, 526-533.
  • Li-Xuan Qin, Linda Breeden and Steven G. Self (2014). Finding gene clusters for a replicated time course study. BMC Res Notes 7:60.

See Also

fit.CLM, fit.CLMM, fit.CLMM.2

Examples

Run this code
## Not run: 
# #test data
#  data(YeastCellCycle2)
#  data.y1 <- YeastCellCycle2$normalizedData.WT
#  data.x1 <- YeastCellCycle2$designMatrix.WT
#  data.y2 <- YeastCellCycle2$normalizedData.SM
#  data.x2 <- YeastCellCycle2$designMatrix.SM
#  n.clst  <- 8
#  fit1	<- fit.CLMM.2(data.y1=data.y1,data.x1=data.x1,data.z1=data.x1,
#                       data.y2=data.y2,data.x2=data.x2,data.z2=data.x2,
# 					  n.clst=n.clst)
#  fit1.u  <- apply(fit1$u.hat, MARGIN=1, FUN=order, decreasing=TRUE)[1,]
#  zeta.fitted <- fit1$theta.hat$zeta.hat
#  profile.WT  <- YeastCellCycle2$designMatrix.WT[1,,] %*% t(zeta.fitted)
#  profile.SM  <- YeastCellCycle2$designMatrix.SM[1,,] %*% t(zeta.fitted)
# # remove bad time points for WTs and SMs
#  n.time  <- 25
#  time.WT <- (1:n.time)[-22]
#  time.SM <- (1:n.time)[-c(6,9,12)]
#  n.rpl.WT<- length(time.WT)
#  n.rpl.SM<- length(time.SM)
#  n.gene.short<-dim(YeastCellCycle2$normalizedData.WT)[1]
# # gene-specific profile: observed profiles averaged over replicates
#  data.WT.mean  <- matrix(0, nrow=n.gene.short, ncol=n.rpl.WT)
#  data.SM.mean	<- matrix(0, nrow=n.gene.short, ncol=n.rpl.SM)
#  for(j in 1:n.gene.short){
#   data.WT.mean[j,] <- (YeastCellCycle2$normalizedData.WT[j,1,]+
#                        YeastCellCycle2$normalizedData.WT[j,2,])/2
#   data.SM.mean[j,] <- (YeastCellCycle2$normalizedData.SM[j,1,]+
#                        YeastCellCycle2$normalizedData.SM[j,2,])/2
#  }
# # plot observed profiles by cluster -- wild type
#  col.panel=rainbow(8)
#  par(mai=c(0.3,0.25,0.2,0.05),mfrow=c(3,3))
#  for(k in 1:n.clst){
#   plot(5*(time.WT-1), profile.WT[,k], type="l", col=col.panel[k], ylim=c(-2,2),
#        xlab="Time", ylab="Expression Value", main=paste("WT: cluster",k))
#   i.plot <- (1:n.gene.short)[fit1.u==k]
#   for(j in i.plot) lines(5*(time.WT-1), data.WT.mean[j,], lty=1)
#   lines(5*(time.WT-1), profile.WT[,k], col=col.panel[k], lwd=2)
#   text(125, -1.9, pos=2, paste(length(i.plot)," genes"))
# }
# # plot observed profiles by cluster -- single mutant
#  par(mai=c(0.3,0.25,0.2,0.05),mfrow=c(3,3))
#  for(k in 1:n.clst){
#   plot(5*(time.SM-1), profile.SM[,k], type="l", col=col.panel[k], ylim=c(-2,2),
#        xlab="Time", ylab="Expression Value", main=paste("SM: cluster",k))
#   i.plot <- (1:n.gene.short)[fit1.u==k]
#   for(j in i.plot) lines(5*(time.SM-1), data.SM.mean[j,], lty=1)
#   lines(5*(time.SM-1), profile.SM[,k], col=col.panel[k], lwd=2)
#   text(125, -1.9, pos=2, paste(length(i.plot)," genes"))
# }
# # plot fitted profiles by cluster
#  par(mai=c(0.3,0.25,0.2,0.05),mfrow=c(3,3))
#  for(k in 1:n.clst){ 
#   plot(5*(time.WT-1), profile.WT[,k], type="l", ylim=c(-2,2), 
#        xlab="Time", ylab="Expression Value", lwd=2)
#   title(paste("Cluster", k))
#   lines(5*(time.SM-1), profile.SM[,k], lty=3, lwd=2)
#   if(k==1) legend(60, 2, c("WT", "SM"), lty=1:2, cex=0.8)
#  }
# ## End(Not run)

Run the code above in your browser using DataLab