GrammR (version 1.1.0)

GrammRServ: Graphical Representation without a GUI

Description

A non-GUI method for constructing graphical representations of metagenomic count data. This function is recommended for large data sets and can be run as a background job when a user interface is not available.

Usage

GrammRServ(Data = NULL, Cluster = NULL, DataType = "Counts", DistType = "Kendall's tau-distance", PhyTree = NULL, GunifType = NULL, GunifWeight = 0, Dim = c(2, 3, 4), LpNorm = c(1), Penalty = 0.5, MinClust = 2)

Arguments

Data
Data matrix containing one of the following:
  • (1) metagenomic counts, with the rows of the matrix representing the objects to be clustered (either samples or taxa).
  • (2) a measure of dissimilarity between samples or taxa.

Cluster
(Optional) A vector whose length equals the number of rows of Data. Its values give the known cluster membership of the samples, determined from their attributes.
DataType
A character variable giving the type of values in Data. It takes values in c("Counts", "Distance").
DistType
Measure of dissimilarity between samples used to calculate the distance matrix. It takes values in c("Kendall's tau-distance", "UniFrac") and is used only when DataType is "Counts". The default value is "Kendall's tau-distance".
PhyTree
A phylogenetic tree of class phylo used for calculating the UniFrac distance. Provide it only when DistType is set to "UniFrac".
GunifType
The type of UniFrac distance to calculate using the GUniFrac package. It takes values in c("Unweighted", "Variance Adjusted", "Generalized").
GunifWeight
The weight parameter used in the calculation of the Generalized UniFrac distance. The parameter takes values between 0 and 1. For more details, see Chen et al. (2012).
Dim
Dimensions of the multidimensional scaling models to be constructed. Default value is c(2, 3, 4).
LpNorm
A vector-valued variable that determines the norm(s) used in the multidimensional scaling model calculation. The default value (equal to 1) corresponds to the $l_1$-MDS model. Principal coordinate analysis (PCoA) is performed when the value is set to 2.
Penalty
A numeric value between 0 and 1 used as the penalty for ties in the calculation of Kendall's $\tau$-distance. Default value is 0.5.
MinClust
Minimum number of clusters to be used in the PAM method for estimating the optimal number of clusters. Default value is 2.
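
To make the role of Penalty more concrete, the following is a minimal sketch of a Kendall's tau-distance with a tie penalty, written in base R. It is not taken from the GrammR source: the helper name kendall_tau_dist, the toy count matrix, and the exact treatment of ties (a pair tied in exactly one of the two profiles contributes the penalty, a pair tied in both contributes nothing) are assumptions made for illustration, and the package may handle ties differently.

```r
# Hypothetical helper, NOT part of GrammR: Kendall's tau-distance between
# two count profiles, with a penalty in [0, 1] for pairs tied in one profile.
kendall_tau_dist <- function(x, y, penalty = 0.5) {
  stopifnot(length(x) == length(y), penalty >= 0, penalty <= 1)
  # Sign of every pairwise difference: +1, -1, or 0 for a tie.
  sx <- sign(outer(x, x, "-"))
  sy <- sign(outer(y, y, "-"))
  # Keep each unordered pair (i < j) exactly once.
  keep <- upper.tri(sx)
  sx <- sx[keep]
  sy <- sy[keep]
  discordant <- sum(sx * sy < 0)           # strictly opposite ordering
  one_tied   <- sum(xor(sx == 0, sy == 0)) # tied in exactly one profile
  discordant + penalty * one_tied
}

# Toy count matrix (illustrative values only); rows are the objects compared.
counts <- rbind(a = c(5, 0, 3, 2),
                b = c(4, 1, 3, 3),
                c = c(0, 6, 1, 1))

# Fill a symmetric dissimilarity matrix, one row pair at a time.
n <- nrow(counts)
D <- matrix(0, n, n, dimnames = list(rownames(counts), rownames(counts)))
for (i in seq_len(n - 1)) {
  for (j in (i + 1):n) {
    D[i, j] <- D[j, i] <- kendall_tau_dist(counts[i, ], counts[j, ], penalty = 0.5)
  }
}
print(D)
```

Under these assumptions, a penalty of 0 ignores ties entirely, while a penalty of 1 treats a one-sided tie as a full disagreement.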

Value

A separate directory is created in the current working directory for each model, one for every combination of the specified dimensions and $l_p$ norms.
  1. Directories for the two-dimensional models contain the average silhouette plot, true estimated model, model showing estimated clusters and (optional) model showing true clusters.
  2. Directories for models of dimension greater than two contain the average silhouette plot and subdirectories for the true model, estimated clusters model and (optional) model showing true clusters.
For all models, a text file containing the estimated cluster membership is saved in the subdirectory corresponding to the model for future validation.
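
The outputs described above can be approximated by hand on a small scale. The sketch below is an illustration under stated assumptions, not the GrammR implementation: it enumerates the Dim and LpNorm combinations, then handles only the PCoA case (LpNorm = 2) using cmdscale() from stats and pam() from the cluster package. The simulated data, the Euclidean stand-in distance, the upper limit of 6 clusters, and all variable names are hypothetical.

```r
library(cluster)  # provides pam(); a recommended package shipped with R

set.seed(1)
# Simulated data (illustrative only): two well-separated groups of 10 points.
x <- rbind(matrix(rnorm(40, mean = 0), ncol = 4),
           matrix(rnorm(40, mean = 6), ncol = 4))
d <- dist(x)  # Euclidean stand-in for a Kendall's tau or UniFrac distance

# One model per combination of dimension and norm.
models <- expand.grid(Dim = c(2, 3, 4), LpNorm = c(1, 2))
print(nrow(models))

# PCoA in two dimensions via classical multidimensional scaling.
coords <- cmdscale(d, k = 2)

# Average silhouette width for each candidate number of clusters,
# starting from a minimum of 2 (the role played by MinClust).
min_clust <- 2
ks <- min_clust:6  # upper limit chosen arbitrarily for this sketch
avg_sil <- sapply(ks, function(k) pam(coords, k)$silinfo$avg.width)

# Pick the number of clusters with the largest average silhouette width
# and extract the estimated cluster membership.
best_k <- ks[which.max(avg_sil)]
membership <- pam(coords, best_k)$clustering
print(best_k)
```

Plotting avg_sil against ks gives a curve analogous to the average silhouette plot, and membership corresponds to the kind of estimated cluster assignment that is written to the text file.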

References

Chen, J., et al. (2012) Associating microbiome composition with environmental covariates using generalized UniFrac distances, Bioinformatics, 28(16).

See Also

GrammRGUI

Examples

library(GrammR)
data(metagencounts)
# Build l_1-MDS (LpNorm = 1) and PCoA (LpNorm = 2) models in 2, 3 and 4 dimensions.
GrammRServ(Data = metagencounts$Counts, Cluster = metagencounts$CommMemshp,
           DataType = "Counts", DistType = "Kendall's tau-distance",
           Dim = c(2, 3, 4), LpNorm = c(1, 2), Penalty = 0.5, MinClust = 2)