
fabia (version 2.18.0)

extractBic: Extraction of Biclusters

Description

extractBic: R implementation of extractBic.

Usage

extractBic(fact,thresZ=0.5,thresL=NULL)

Arguments

fact
object of the class Factorization.
thresZ
threshold for sample belonging to bicluster; default 0.5.
thresL
threshold for loading belonging to bicluster (if not given it is estimated).

Value

  • bic: extracted biclusters.
  • numn: indexes for the extracted biclusters.
  • bicopp: extracted opposite biclusters.
  • numnopp: indexes for the extracted opposite biclusters.
  • X: scaled and centered data matrix.
  • np: number of biclusters.

Concept

  • biclustering
  • sparse coding
  • sparse matrix factorization
  • non-negative matrix factorization

Details

Essentially the model is the sum of outer products of vectors: $$X = \sum_{i=1}^{p} \lambda_i z_i^T + U$$ where the number of summands $p$ is the number of biclusters. The matrix factorization is $$X = L Z + U$$

Here $\lambda_i$ are from $R^n$, $z_i$ from $R^l$, $L$ from $R^{n \times p}$, $Z$ from $R^{p \times l}$, and $X$, $U$ from $R^{n \times l}$.
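
The equivalence of the two forms can be checked numerically. The following is a minimal base-R sketch on simulated matrices; it does not call the fabia package, and the sizes and the noise level are arbitrary values chosen for illustration.

```r
set.seed(1)
n <- 6   # number of observations (rows of X)
l <- 4   # number of samples (columns of X)
p <- 2   # number of biclusters

L <- matrix(rnorm(n * p), nrow = n, ncol = p)            # columns are the lambda_i
Z <- matrix(rnorm(p * l), nrow = p, ncol = l)            # rows are the z_i
U <- matrix(rnorm(n * l, sd = 0.1), nrow = n, ncol = l)  # additive noise

# Form 1: sum of p outer products lambda_i z_i^T, plus noise
X_sum <- U
for (i in seq_len(p)) {
  X_sum <- X_sum + outer(L[, i], Z[i, ])
}

# Form 2: matrix factorization L Z + U
X_mat <- L %*% Z + U

# Both forms agree up to floating point tolerance
print(all.equal(X_sum, X_mat))
```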

$U$ is Gaussian noise with a diagonal covariance matrix whose entries are given by Psi. $Z$ is locally approximated by a Gaussian with inverse variance given by lapla. Using these values we can compute, for each $j$, the variance of $z_i$ given $x_j$. Here

$$x_j = L z_j + u_j$$
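
For orientation, a linear Gaussian model of this form has a closed-form conditional covariance. The expression below is a sketch under stated assumptions, not a statement of what the package computes: $\Psi$ denotes the diagonal noise covariance built from Psi, and $\Xi_j^{-1}$ is notation introduced here for the diagonal matrix of inverse variances taken from lapla for sample $j$.

$$\mathrm{Cov}(z_j \mid x_j) = \left( L^T \Psi^{-1} L + \Xi_j^{-1} \right)^{-1}$$

Under these assumptions, the variance of component $i$ given $x_j$ is the $i$-th diagonal entry of this matrix.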

This variance can be used to determine the information content of a bicluster. The $\lambda_i$ and $z_i$ are used to extract the bicluster $i$, where a threshold determines which observations and which samples belong to the bicluster.

In bic the biclusters are extracted according to the largest absolute values of the component $i$, i.e. the largest values of $\lambda_i$ and the largest values of $z_i$. The factors $z_i$ are normalized to variance 1.
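
The thresholding idea can be illustrated in base R. This is a simplified sketch, not the package's code: the vectors, the names, and the value of thresL are made up for illustration (in extractBic, thresL is estimated when it is not supplied), and the factor is rescaled with sd() so that its sample variance is 1.

```r
# Hypothetical loading vector lambda_i (one value per observation)
lambda_i <- c(gene1 = 3.1, gene2 = -0.2, gene3 = 2.7, gene4 = 0.1,
              gene5 = 0.0, gene6 = -2.9, gene7 = 0.3, gene8 = 0.2)

# Hypothetical factor vector z_i (one value per sample)
z_i <- c(sample1 = 2.0, sample2 = 0.1, sample3 = -0.3,
         sample4 = 1.8, sample5 = -1.9, sample6 = 0.2)

# Rescale the factor to (sample) variance 1
z_i <- z_i / sd(z_i)

thresZ <- 0.5   # default sample threshold from the Usage section
thresL <- 1.0   # assumed value, for illustration only

# Keep entries whose absolute value exceeds the threshold,
# sorted by decreasing absolute value
obs <- lambda_i[abs(lambda_i) > thresL]
obs <- obs[order(abs(obs), decreasing = TRUE)]

smp <- z_i[abs(z_i) > thresZ]
smp <- smp[order(abs(smp), decreasing = TRUE)]

print(names(obs))
print(names(smp))
```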

The components of bic are binp, bixv, bixn, biypv, and biypn.

  • binp gives the size of the bicluster: the number of observations and the number of samples.
  • bixv gives the values of the extracted observations that have absolute values above a threshold. They are sorted.
  • bixn gives the names of the extracted observations (e.g. gene names).
  • biypv gives the values of the extracted samples that have absolute values above a threshold. They are sorted.
  • biypn gives the names of the extracted samples (e.g. sample names).

In bicopp the opposite clusters to the biclusters are given. Opposite means that the negative pattern is present. The components of the opposite clusters bicopp are binn, bixv, bixn, biynv, and biynn.

  • binn gives the size of the opposite bicluster: the number of observations and the number of samples.
  • bixv gives the values of the extracted observations that have absolute values above a threshold. They are sorted.
  • bixn gives the names of the extracted observations (e.g. gene names).
  • biynv gives the values of the opposite extracted samples that have absolute values above a threshold. They are sorted.
  • biynn gives the names of the opposite extracted samples (e.g. sample names).

The samples are therefore divided into two groups: one group shows large positive values and the other group shows negative values with large absolute values. In other words, an observation pattern can be switched on or switched off relative to the average value.
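
The split into a bicluster and its opposite can be sketched in base R. The factor values and sample names below are hypothetical, and the sign-based rule is an illustrative reading of the description above rather than the package's exact implementation.

```r
# Hypothetical factor values for one bicluster, already scaled
z_i <- c(s1 = 1.6, s2 = -0.1, s3 = -1.4, s4 = 0.2, s5 = 1.1, s6 = -1.2)

thresZ <- 0.5   # default sample threshold from the Usage section

# Samples where the pattern is switched on: large positive values
pos <- sort(z_i[z_i > thresZ], decreasing = TRUE)

# Samples where the pattern is switched off: negative values with
# large absolute values, most negative first
neg <- sort(z_i[z_i < -thresZ])

print(names(pos))
print(names(neg))
```

Samples whose values lie between -thresZ and thresZ (here s2 and s4) fall into neither group.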

numn gives the indices of bic with components: numng = bixn and numnp = biypn. numnopp gives the indices of bicopp with components: numng = bixn and numnn = biynn.

Implementation in R.

See Also

fabia, fabias, fabiap, fabi, fabiasp, mfsc, nmfdiv, nmfeu, nmfsc, extractPlot, extractBic, plotBicluster, Factorization, projFuncPos, projFunc, estimateMode, makeFabiaData, makeFabiaDataBlocks, makeFabiaDataPos, makeFabiaDataBlocksPos, matrixImagePlot, fabiaDemo, fabiaVersion

Examples

library(fabia)

#---------------
# TEST
#---------------

dat <- makeFabiaDataBlocks(n = 100,l= 50,p = 3,f1 = 5,f2 = 5,
  of1 = 5,of2 = 10,sd_noise = 3.0,sd_z_noise = 0.2,mean_z = 2.0,
  sd_z = 1.0,sd_l_noise = 0.2,mean_l = 3.0,sd_l = 1.0)

X <- dat[[1]]
Y <- dat[[2]]


resEx <- fabia(X,3,0.01,20)

rEx <- extractBic(resEx)

rEx$bic[1,]
rEx$bic[2,]
rEx$bic[3,]


#---------------
# DEMO1
#---------------

dat <- makeFabiaDataBlocks(n = 1000,l= 100,p = 10,f1 = 5,f2 = 5,
  of1 = 5,of2 = 10,sd_noise = 3.0,sd_z_noise = 0.2,mean_z = 2.0,
  sd_z = 1.0,sd_l_noise = 0.2,mean_l = 3.0,sd_l = 1.0)

X <- dat[[1]]
Y <- dat[[2]]


resToy <- fabia(X,13,0.01,200)

rToy <- extractBic(resToy)

avini(resToy)

rToy$bic[1,]
rToy$bic[2,]
rToy$bic[3,]

#---------------
# DEMO2
#---------------


avail <- require(fabiaData)

if (!avail) {
    message("")
    message("")
    message("#####################################################")
    message("Package 'fabiaData' is not available: please install.")
    message("#####################################################")
} else {

data(Breast_A)

X <- as.matrix(XBreast)

resBreast <- fabia(X,5,0.1,200)

rBreast <- extractBic(resBreast)

avini(resBreast)

rBreast$bic[1,]
rBreast$bic[2,]
rBreast$bic[3,]
}
