phyclust (version 0.1-34)

getcut.fun: Tzeng's Method: Finding the Best Number of Clusters

Description

For SNP sequences only, Tzeng's method (2005) uses an evolutionary approach to group haplotypes based on a deterministic transformation of haplotype frequency. This function finds the best number of clusters based on Shannon information content.

Usage

getcut.fun(pp.org, nn, plot = 0)

Value

Returns the best guess of the number of clusters.

Arguments

pp.org

frequency of haplotypes, sorted in decreasing order.

nn

number of haplotypes.

plot

flag for illustrating the result in a plot (default 0).

Author

Jung-Ying Tzeng.

Maintainer: Wei-Chen Chen wccsnow@gmail.com

Details

pp.org is summarized from X in haplo.post.prob, and nn is equal to the number of rows of X.
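
As a rough illustration of these two inputs, the sketch below derives a decreasingly sorted frequency vector and a row count from a small made-up SNP matrix. The matrix X.toy and the helper toy.hap.freq are hypothetical and only mimic the description above; they are not the code used inside haplo.post.prob.

```r
# Hypothetical toy SNP matrix: one row per sequence, one column per site.
X.toy <- matrix(c(0, 1, 0,
                  0, 1, 0,
                  1, 1, 0,
                  0, 1, 0,
                  1, 0, 1,
                  1, 1, 0),
                ncol = 3, byrow = TRUE)

# Hypothetical helper (not part of phyclust): collapse each row into a
# haplotype string, tabulate the strings, and convert counts to frequencies.
toy.hap.freq <- function(X) {
  hap <- apply(X, 1, paste, collapse = "")
  freq <- table(hap) / nrow(X)
  sort(freq, decreasing = TRUE)
}

pp.toy <- toy.hap.freq(X.toy)   # haplotype frequencies, largest first
nn.toy <- nrow(X.toy)           # number of rows of X
```

Objects shaped like pp.toy and nn.toy correspond to the pp.org and nn arguments; in practice they come from haplo.post.prob, as in the Examples section.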

This function is called by haplo.post.prob to determine the best guess of the number of clusters. See Tzeng (2005) and Shannon (1948) for details.
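
For orientation, Shannon's (1948) entropy of a frequency vector can be computed as in the sketch below. It shows only the entropy formula itself; the function name shannon.entropy is hypothetical, and the exact criterion that getcut.fun builds on it is described in Tzeng (2005) and is not reproduced here.

```r
# Hypothetical helper (not part of phyclust): Shannon entropy, in bits,
# of a vector of frequencies that sums to 1.
shannon.entropy <- function(p) {
  p <- p[p > 0]          # treat 0 * log(0) as 0
  -sum(p * log2(p))
}

shannon.entropy(c(0.5, 0.25, 0.25))   # 1.5 bits
```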

References

Phylogenetic Clustering Website: https://snoweye.github.io/phyclust/

Tzeng, J.Y. (2005) “Evolutionary-Based Grouping of Haplotypes in Association Analysis”, Genetic Epidemiology, 28, 220-231. https://www4.stat.ncsu.edu/~jytzeng/software.php

Shannon, C.E. (1948) “A Mathematical Theory of Communication”, Bell System Technical Journal, 27, 379-423, 623-656.

See Also

haplo.post.prob.

Examples

if (FALSE) {
library(phyclust, quietly = TRUE)

data.path <- paste(.libPaths()[1], "/phyclust/data/crohn.phy", sep = "")
my.snp <- read.phylip(data.path, code.type = "SNP")
ret <- haplo.post.prob(my.snp$org, ploidy = 1)
getcut.fun(sort(ret$haplo$hap.prob, decreasing = TRUE),
           nn = my.snp$nseq, plot = 1)
}
