findTrios uses K-means clustering to identify groups (clusters) of reactions, and the gap statistic of Tibshirani, Walther and Hastie, as implemented in the cluster package, to identify the best k. The input matrix for the clustering algorithm is composed of the KS statistics of the KOs flanking the KO of interest.
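The clustering step described above can be sketched as follows. This is a minimal illustration, not the internals of findTrios: the toy matrix flank, its column names, and the choice of K.max, B and the "Tibs2001SEmax" selection rule are all assumptions made for the example. Only clusGap() and maxSE() from the cluster package and kmeans() from stats are real functions.

```r
library(cluster)  # provides clusGap() and maxSE()

set.seed(1)

# Hypothetical input: one row per reaction, one column per flanking KO,
# each cell holding that KO's KS statistic (toy values, not real data).
flank <- rbind(
  cbind(runif(20, 0.05, 0.25), runif(20, 0.05, 0.25)),
  cbind(runif(20, 0.70, 0.95), runif(20, 0.70, 0.95))
)
colnames(flank) <- c("ks_upstream", "ks_downstream")

# Gap statistic for k = 1..5, with k-means as the clustering function.
# Extra arguments (nstart) are passed through to kmeans().
gap <- clusGap(flank, FUNcluster = kmeans, K.max = 5, B = 50,
               verbose = FALSE, nstart = 20)

# Choose k from the gap values and their simulation standard errors.
# The selection rule used here is an assumption for this sketch.
tab <- gap$Tab
best_k <- maxSE(tab[, "gap"], tab[, "SE.sim"], method = "Tibs2001SEmax")

# Final clustering with the selected number of clusters.
fit <- kmeans(flank, centers = best_k, nstart = 20)
```

fit$cluster then holds one cluster label per reaction, which is the kind of assignment the returned data.frame reports.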
findTrios(KOI, ks, toPrint = TRUE, outputFile, plotDir)
KOI: KOs of interest, i.e. highly expressed KOs (expression above a selected level) whose D statistics are above the desired threshold
ks: precalculated data.frame output by ksCal, containing the KS statistic of each KO's gene distribution compared with the whole sample's empirical gene distribution
toPrint: logical; whether to print the classification plots
outputFile: the text file to write the results of findTrios to
plotDir: the directory in which to save the diagnostic plots
A data.frame of all reactions, their corresponding clusters, the KS statistic of each KO, and the selected cluster
The KS statistic is calculated by comparing each KO's gene/contig expression distribution against the 'null', i.e. the empirical distribution of all genes in all contigs.
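As an illustration of this comparison, the two-sample Kolmogorov-Smirnov test in base R returns a D statistic, the largest vertical distance between two empirical distribution functions. The sketch below uses simulated expression values; the vectors all_genes and ko_genes are hypothetical, and ksCal may compute its statistic differently.

```r
set.seed(42)

# Hypothetical expression values: every gene in the sample (the 'null')
# and the genes annotated to a single KO. Simulated, not real data.
all_genes <- rlnorm(5000, meanlog = 2, sdlog = 1)
ko_genes  <- rlnorm(40,   meanlog = 3, sdlog = 0.5)

# Two-sample KS test: D is the maximum distance between the two
# empirical cumulative distribution functions, so it lies in [0, 1].
res    <- ks.test(ko_genes, all_genes)
d_stat <- unname(res$statistic)
```

A D statistic near 0 means the KO's genes are distributed much like the sample as a whole; a value near 1 means the KO's genes occupy a narrow, distinct part of the expression range.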
findTrios also identifies reactions whose flanking KOs have high gene diversity, indicated by a low KS statistic (<= 0.5).
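The low-KS filter can be pictured with a small table. The data.frame trios and its column names are hypothetical and do not reflect the actual findTrios output. Requiring both flanking KOs to meet the threshold is also an assumption of this sketch.

```r
# Hypothetical table: one row per reaction, with the KS statistic of
# the KO before and after the KO of interest (illustrative values).
trios <- data.frame(
  reaction  = c("rxn1", "rxn2", "rxn3"),
  ks_before = c(0.20, 0.80, 0.45),
  ks_after  = c(0.30, 0.40, 0.50)
)

threshold <- 0.5  # low KS, taken here as high gene diversity

# Keep reactions where both flanking KOs are at or below the threshold.
diverse <- trios[trios$ks_before <= threshold &
                 trios$ks_after  <= threshold, ]
```

Here rxn2 is dropped because one of its flanking KOs has a KS statistic of 0.80.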