phyclust (version 0.1-24)

getcut.fun: Tzeng's Method: Finding the Best Number of Clusters

Description

For SNP sequences only, Tzeng's method (2005) uses an evolutionary approach to group haplotypes based on a deterministic transformation of haplotype frequencies. This function finds the best number of clusters based on Shannon information content.

Usage

getcut.fun(pp.org, nn, plot = 0)

Arguments

pp.org

frequency of haplotypes, sorted in decreasing order.

nn

number of haplotypes.

plot

whether to illustrate the result in a plot (default 0; the example below uses 1).

Value

Returns the best guess of the number of clusters.

Details

pp.org is summarized from X in haplo.post.prob, and nn is equal to the number of rows of X.

This function is called by haplo.post.prob to determine the best guess of the number of clusters. See Tzeng (2005) and Shannon (1948) for details.
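
As a rough illustration of the quantity behind Shannon information content, the sketch below computes the Shannon entropy (in bits) of a vector of haplotype frequencies. It is not the criterion implemented in getcut.fun; see Tzeng (2005) for that. The helper shannon.entropy and the frequencies in pp.toy are hypothetical and made up for this example.

```r
# Hypothetical helper (not part of phyclust): Shannon entropy, in bits,
# of a vector of haplotype frequencies.
shannon.entropy <- function(pp) {
  pp <- pp[pp > 0]      # by convention, 0 * log2(0) is treated as 0
  pp <- pp / sum(pp)    # renormalize so the frequencies sum to 1
  -sum(pp * log2(pp))
}

# Made-up frequencies, sorted in decreasing order as pp.org is expected to be.
pp.toy <- sort(c(0.05, 0.50, 0.25, 0.20), decreasing = TRUE)
shannon.entropy(pp.toy)

# Equal frequencies over 4 haplotypes give the maximum, log2(4) = 2 bits.
shannon.entropy(rep(0.25, 4))
```

A distribution dominated by a few frequent haplotypes has lower entropy than an even one, which is the kind of information an entropy-based cut can exploit.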

References

Phylogenetic Clustering Website: https://snoweye.github.io/phyclust/

Tzeng, J.Y. (2005) “Evolutionary-Based Grouping of Haplotypes in Association Analysis”, Genetic Epidemiology, 28, 220-231. http://www4.stat.ncsu.edu/~jytzeng/software.php

Shannon, C.E. (1948) “A mathematical theory of communication”, Bell System Tech J, 27, 379-423, 623-656.

See Also

haplo.post.prob.

Examples

# NOT RUN {
library(phyclust, quietly = TRUE)

# Locate the crohn.phy example data in the installed package.
data.path <- system.file("data", "crohn.phy", package = "phyclust")
my.snp <- read.phylip(data.path, code.type = "SNP")
# Compute haplotype probabilities, then pass them sorted in decreasing order.
ret <- haplo.post.prob(my.snp$org, ploidy = 1)
getcut.fun(sort(ret$haplo$hap.prob, decreasing = TRUE),
           nn = my.snp$nseq, plot = 1)
# }
