rKOMICS (version 1.3)

msc.matrix: Build cluster matrix

Description

The msc.matrix function reads the output of clustering analyses (UC files) for the specified minimum percent identity (MPI) values and organizes the data into one matrix per MPI value. Each matrix represents the presence or absence of Minicircle Sequence Classes (MSCs) in each sample. Having the data in this form simplifies downstream analyses and visualizations, because no manual data manipulation or reformatting is needed.

Usage

msc.matrix(files, samples, groups)

Value

a list that contains one cluster matrix per percent identity. Each matrix represents the presence or absence of MSCs in each sample. In the cluster matrix, a value of 0 indicates that the MSC is not present in the sample, while a value higher than 0 indicates that the MSC is found at least once in the sample.
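The encoding described above can be illustrated on a toy matrix. This is only a sketch: the matrix `m`, the MSC names, and the sample names are hypothetical stand-ins for one element of the returned list, and the layout (MSCs in rows, samples in columns) is assumed from the rowSums/colSums comments in the Examples section.

```r
# Toy stand-in for one cluster matrix; all names and values are hypothetical.
# Assumed layout: one row per MSC, one column per sample.
m <- matrix(c(2, 0, 1,
              0, 0, 3,
              1, 1, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("MSC1", "MSC2", "MSC3"),
                            c("sampleA", "sampleB", "sampleC")))

# Collapse values to strict presence/absence (1 = present, 0 = absent).
presence <- (m > 0) * 1

# Number of samples in which each MSC occurs.
samples_per_msc <- rowSums(presence)

# Number of distinct MSCs found in each sample.
mscs_per_sample <- colSums(presence)
```

Converting to 0/1 first makes the row and column sums count occurrences rather than add up values larger than 1.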

Arguments

files

a character vector containing the names of the UC files generated by the VSEARCH tool. Each file holds the output of the clustering analysis for one minimum percent identity (MPI), such as all.minicircles.circ.id70.uc, all.minicircles.circ.id80.uc, and so on. File names must end with 'idxx.uc', where xx is the percent identity (for example id95.uc), for this function to work properly.
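Before calling the function, it can help to check that every file name follows the 'idxx.uc' convention. The sketch below uses only base R; the regular expression and the extracted labels are an assumption about what a conforming name looks like, not a description of how msc.matrix parses names internally.

```r
# Example file names following the documented convention.
files <- c("all.minicircles.circ.id70.uc",
           "all.minicircles.circ.id80.uc",
           "all.minicircles.circ.id95.uc")

# TRUE for names that end in "id" + digits + ".uc" (assumed pattern).
ok <- grepl("id[0-9]+\\.uc$", files)

# Stop early if any file name does not follow the convention.
if (!all(ok)) {
  stop("Unexpected UC file name(s): ", paste(files[!ok], collapse = ", "))
}

# Extract the identity label (e.g. "id95") from each conforming name.
ids <- sub(".*(id[0-9]+)\\.uc$", "\\1", files)
```

The labels in `ids` have the same form as the list names used in the Examples section, such as "id95".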

samples

a character vector containing the sample names.

groups

a vector of the same length as the samples, specifying the groups (e.g., species) to which the samples belong.
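Because samples and groups are matched by position, a quick consistency check before the call can catch mismatched inputs. This is a minimal sketch with hypothetical sample and species names; msc.matrix itself may or may not perform similar checks.

```r
# Hypothetical inputs: four samples assigned to two species.
samples <- c("sampleA", "sampleB", "sampleC", "sampleD")
groups  <- c("species1", "species1", "species2", "species2")

# groups must supply exactly one entry per sample.
stopifnot(length(samples) == length(groups))

# Duplicated sample names would make the matrix columns ambiguous.
stopifnot(!anyDuplicated(samples))

# Number of samples per group, useful as a quick sanity check.
group_sizes <- table(groups)
```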

Examples

data(exData)

### run function
# \donttest{
matrices <- msc.matrix(files = system.file("extdata", exData$ucs, package="rKOMICS"), 
                      samples = exData$samples, 
                      groups = exData$species)
# }

### or: 
data(matrices)

### show matrix with id 95%
matrices[["id95"]]
rowSums(matrices[["id95"]]) # --> frequency of each MSC across all samples
colSums(matrices[["id95"]]) # --> number of MSCs per sample