CSanalysis,matrix,matrix,CSsmfa-method: "CSsmfa"

Description

Performs an interactive CS analysis with sMFA (Sparse Multiple Factor Analysis). This analysis should be run with multiple query compounds. Depending on lambda, either the spca or the arrayspc algorithm is used.

Usage

# S4 method for matrix,matrix,CSsmfa
CSanalysis(querMat, refMat, type = "CSsmfa",
  K = 15, para, lambda = 1e-06, sparse.dim = 2, sparse = "penalty",
  max.iter = 200, eps.conv = 0.001, which = c(2, 3, 4, 5),
  component.plot = NULL, CSrank.queryplot = FALSE, column.interest = NULL,
  row.interest = NULL, profile.type = "gene", color.columns = NULL,
  gene.highlight = NULL, gene.thresP = 1, gene.thresN = -1,
  thresP.col = "blue", thresN.col = "red", grouploadings.labels = NULL,
  grouploadings.cutoff = NULL, legend.names = NULL, legend.cols = NULL,
  legend.pos = "topright", labels = TRUE, result.available = NULL,
  result.available.update = FALSE, plot.type = "device",
  basefilename = NULL)

Arguments

querMat

Query matrix (Rows = genes and columns = compounds)

refMat

Reference matrix
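
As a minimal sketch of the expected input layout, the following simulates a query and a reference matrix with genes in rows and compounds in columns. The dimensions, the random data and all row and column names are hypothetical, and it is an assumption here that both matrices share the same genes in the same row order.

```r
set.seed(1)
n_genes <- 100

# Hypothetical query matrix: 100 genes x 6 query compounds
querMat <- matrix(rnorm(n_genes * 6), nrow = n_genes,
                  dimnames = list(paste0("gene", seq_len(n_genes)),
                                  paste0("query", 1:6)))

# Hypothetical reference matrix: same genes x 40 reference compounds
refMat <- matrix(rnorm(n_genes * 40), nrow = n_genes,
                 dimnames = list(rownames(querMat),
                                 paste0("ref", 1:40)))

# Assumed requirement: both matrices describe the same genes
stopifnot(identical(rownames(querMat), rownames(refMat)))
```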

type

"CSsmfa"

K

sMFA Parameters: Number of components.

para

sMFA Parameters: A vector of length K. All elements should be positive. If sparse="varnum", the elements should be integers.

lambda

sMFA Parameters: Quadratic penalty parameter. Default value is 1e-6. If the dimension targeted for sparsity is larger than the other dimension (p > n), it is advised to set lambda to Inf, which uses the arrayspc algorithm optimized for this case. In the other case (p < n), a zero or positive lambda is sufficient and the regular spca algorithm is used.

sparse.dim

sMFA Parameters: Which dimension should be sparse? 1: Rows, 2: Columns (default) (Note: For Connectivity Scores it is advised to apply sparsity on the compounds/columns)
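
The advice on lambda and sparse.dim can be sketched as a small helper. choose_lambda is a hypothetical function, not part of the package, and it assumes that p is the size of the dimension selected by sparse.dim in the combined (query + reference) data and n is the size of the other dimension.

```r
# Hypothetical helper: suggest a lambda value from the data dimensions.
# n_rows = number of genes, n_cols = number of compounds (query + reference).
choose_lambda <- function(n_rows, n_cols, sparse.dim = 2, default = 1e-06) {
  p <- if (sparse.dim == 1) n_rows else n_cols  # dimension made sparse
  n <- if (sparse.dim == 1) n_cols else n_rows  # the other dimension
  if (p > n) Inf else default                   # Inf selects arrayspc
}

lambda_small <- choose_lambda(n_rows = 100, n_cols = 46)   # p < n
lambda_large <- choose_lambda(n_rows = 100, n_cols = 500)  # p > n
```

With these inputs the first call keeps the default penalty, while the second returns Inf.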

sparse

sMFA Parameters (lambda < Inf only): If sparse="penalty", para is a vector of 1-norm penalty parameters. If sparse="varnum", para defines the number of sparse loadings to be obtained.
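
A short sketch of how para might be built for the two sparse settings. The values 0.5 and 10 are purely illustrative and are not recommended settings.

```r
K <- 15

# sparse = "penalty": one positive 1-norm penalty per component
para_penalty <- rep(0.5, K)

# sparse = "varnum": one positive integer per component
para_varnum <- rep(10L, K)

# Both vectors must have length K and contain only positive elements
stopifnot(length(para_penalty) == K, all(para_penalty > 0))
stopifnot(length(para_varnum) == K, is.integer(para_varnum),
          all(para_varnum > 0))
```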

max.iter

sMFA Parameters: Maximum number of iterations.

eps.conv

sMFA Parameters: Convergence criterion.

which

Choose one or more plots to draw:

1. Information Content for Bicluster (Only available for "CSfabia")
2. Loadings for query compounds
3. Loadings for Component (Factor/Bicluster) component.plot
4. Gene Scores for Component (Factor/Bicluster) component.plot
5. Connectivity Ranking Scores for Component component.plot
6. Component component.plot VS Other Component: Loadings & Genes
7. Profile plot (see profile.type)
8. Group Loadings Plots for all components (see grouploadings.labels)

component.plot

Which components (Factor/Bicluster) should be investigated? Can be a vector of multiple components (e.g. c(1,3,5)). If NULL, you can choose components of interest interactively from the query loadings plot.

CSrank.queryplot

Logical value deciding if the CS Rank Scores (which=5) should also be plotted per query (instead of only the weighted mean).

column.interest

Numeric vector of indices of reference columns which should be in the profiles plots (which=7). If NULL, you can interactively select compounds on the Compound Loadings plot (which=3).

row.interest

Numeric vector of gene indices to be plotted in gene profiles plot (which=7, profile.type="gene"). If NULL, you can interactively select them in the gene scores plot (which=4).

profile.type

Type of which=7 plot:

"gene": Gene profiles plot of selected genes in row.interest with the query compounds and those selected in column.interest ordered first on the x axis. The other compounds are ordered in decreasing CScore.
"cmpd": Compound profiles plot of query and selected compounds (column.interest) and only those genes on the x-axis which beat the thresholds (gene.thresP, gene.thresN)

color.columns

Vector of colors for the query and reference columns (compounds). If NULL, blue will be used for query and black for reference. Use this option to highlight query columns and reference columns of interest.

gene.highlight

Single numeric vector or list of maximum 5 numeric vectors. This highlights genes of interest in the gene scores plot (which=4) in up to 5 different colors. (e.g. You can use this to highlight genes you know to be differentially expressed)
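
One way to build such a list from gene names is sketched below. The gene names and the two gene sets are hypothetical, and it is assumed that the indices refer to row positions in the data.

```r
# Hypothetical gene names, standing in for rownames(querMat)
gene_names <- paste0("gene", 1:100)

# Hypothetical sets of genes known to be up- or down-regulated
known_up   <- c("gene3", "gene17", "gene42")
known_down <- c("gene8", "gene90")

# Translate names to row indices; each list element gets its own color
gene.highlight <- list(match(known_up, gene_names),
                       match(known_down, gene_names))

stopifnot(is.list(gene.highlight), length(gene.highlight) <= 5)
```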

gene.thresP

Threshold for genes with a high score (which=4).

gene.thresN

Threshold for genes with a low score (which=4).

thresP.col

Color of genes above gene.thresP.

thresN.col

Color of genes below gene.thresN.

grouploadings.labels

This parameter is used for the Group Loadings Plots (which=8). In general this plot will contain the loadings of all factors, grouped and colored by the labels given in this parameter.

If grouploadings.labels!=NULL: Provide a vector for all samples (query + ref) containing labels on which the plot will be based on.
If grouploadings.labels=NULL: If no labels are provided when choosing which=8, automatic labels ("Top Samples of Component 1, 2....") will be created. These labels are given to the top grouploadings.cutoff number of samples based on the absolute values of the loadings.

Plot which=8 can be used for 2 different purposes. The first is to check if your provided labels coincide with the structure discovered in the analysis. The second is to find new interesting structures (of samples) which strongly appear in one or multiple components. A subsequent step could be to take some strong samples/compounds of these components and use them as a new query set in a new CS analysis to check its validity or to find newly connected compounds.

Please note that even when grouploadings.labels!=NULL, the labels based on the absolute loadings of all the factors (the top grouploadings.cutoff) will always be generated and saved in samplefactorlabels in the extra slot of the CSresult object. These can later be used with the CSlabelscompare function to compare them with your true labels.

grouploadings.cutoff

Parameter used in plot which=8. See grouploadings.labels=NULL for more information. If this parameter is not provided, it will be automatically set to 10% of the total number of loadings.
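
A minimal sketch of a label vector for which=8, assuming 6 query and 40 reference compounds with query columns placed first. The class names are hypothetical. The automatic cutoff is approximated here as 10% of the number of samples, and the rounding rule is an assumption, not taken from the package.

```r
n_query <- 6
n_ref   <- 40

# One label per sample, query columns first, then reference columns
grouploadings.labels <- c(rep("Query", n_query),
                          rep(c("ClassA", "ClassB"), each = n_ref / 2))
stopifnot(length(grouploadings.labels) == n_query + n_ref)

# Approximate default cutoff: 10% of all loadings (rounding is assumed)
approx_cutoff <- round(0.10 * (n_query + n_ref))
```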

legend.names

Option to draw a legend for, for example, colored columns in the Compound Loadings plot (which=3). If NULL, only "References" will be in the legend.

legend.cols

Colors to be used in legends. If NULL, only blue for "Queries" is used.

legend.pos

Position of the legend in all requested plots, can be "topright", "topleft", "bottomleft", "bottomright", "bottom", "top", "left", "right", "center".

labels

Boolean value (default=TRUE) to use row and/or column text labels in the score plots (which=c(3,4,5,6)).

result.available

You can provide an object previously returned by CSanalysis in order to only draw graphs, without recomputing the scores.

result.available.update

Logical value. If TRUE, the CS and GS will be overwritten depending on the new component.plot choice. This would also delete the p-values if permutation.object was available.

plot.type

How should the plots be output? "pdf" to save them in pdf files, "device" to draw them in a graphics device (default), "sweave" to use them in a Sweave or knitr file.

basefilename

Directory including filename of the graphs if saved in pdf files

Value

An object of the S4 Class CSresult-class.
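
Examples

The following is a hedged sketch of a non-interactive call, not an official example. It assumes the method is provided by the CSFA package, wraps the call in a hypothetical helper run_smfa, and uses illustrative values for K, para and the output file name. component.plot is set explicitly so that no interactive selection is needed.

```r
# Hypothetical wrapper around CSanalysis for a non-interactive sMFA run.
# Assumes the CSFA package is installed; argument values are illustrative.
run_smfa <- function(querMat, refMat, K = 5) {
  if (!requireNamespace("CSFA", quietly = TRUE)) {
    stop("The CSFA package is required for this sketch.")
  }
  CSFA::CSanalysis(
    querMat, refMat, type = "CSsmfa",
    K = K,
    para = rep(1, K),          # hypothetical 1-norm penalties, one per component
    sparse = "penalty",
    which = c(2, 3),           # query loadings and component loadings plots
    component.plot = 1,        # fixed choice avoids interactive selection
    plot.type = "pdf",
    basefilename = file.path(tempdir(), "smfa")
  )
}
```

Calling run_smfa(querMat, refMat) with two gene-by-compound matrices would then be expected to return a CSresult object and write the requested plots to pdf files.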