get.entropy: Compute the Sequence Entropy for an Alignment

Description

Computes the Shannon entropy of each column of a sequence alignment. Characters such as gap symbols can be excluded from the computation. The joint entropy of column pairs is computed with get.entropy2p().

Usage

get.entropy(aln, bool = FALSE, gapchar = "NOGAPCHAR",
            verbose = FALSE)
get.entropy2p(aln, bool = FALSE, gapchar = "NOGAPCHAR",
              verbose = FALSE)

Arguments

aln

alignment as a character matrix, with one sequence per row and one alignment position per column

bool

logical; if TRUE, gap characters (see gapchar) are ignored when computing the entropy of each column of the alignment

gapchar

character vector containing the unique set of characters representing gaps in the amino acid sequence

verbose

logical; if TRUE, output messages are printed

Value

get.entropy() returns a vector containing the entropy of each alignment column. get.entropy2p() returns a matrix whose lower triangle contains the joint entropies of the column pairs.

Details

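The Shannon (1948) entropy of an alignment column $X$ is computed as follows:

$$H(X)=-\sum_{x\in X} p(x)\cdot\log_2(p(x))$$

The joint entropy is computed for every possible column pair:

$$H(X,Y)=-\sum_{x\in X}\sum_{y\in Y} p(x,y)\cdot\log_2(p(x,y))$$

where $X$ and $Y$ are two columns of the alignment, $p(x)$ is the relative frequency of character $x$ in column $X$, and $p(x,y)$ is the relative frequency of the character pair $(x,y)$ occurring in the same row of columns $X$ and $Y$. Because the logarithm is taken to base 2, entropies are expressed in bits.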
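The formulas above can be illustrated with a minimal base-R sketch. It is not the package's implementation: the helper names column_entropy() and joint_entropy() and the drop argument are hypothetical, and the handling of gaps and the exact layout of the returned matrix in get.entropy() and get.entropy2p() may differ.

```r
# Hypothetical helper: Shannon entropy (in bits) of one alignment column.
# `drop` is an assumed argument listing characters to exclude, e.g. gaps.
column_entropy <- function(x, drop = character(0)) {
  x <- x[!(x %in% drop)]
  if (length(x) == 0) return(0)
  p <- as.numeric(table(x)) / length(x)  # relative frequencies p(x)
  -sum(p * log2(p))
}

# Hypothetical helper: joint entropy of two columns. Each row contributes
# one character pair (x, y); pasting the pair into a single label lets the
# single-column function count the pair frequencies p(x, y).
joint_entropy <- function(x, y) {
  column_entropy(paste(x, y, sep = "|"))
}

# Toy alignment: 3 sequences (rows) by 5 positions (columns)
aln <- matrix(c("M", "H", "X", "P", "V",
                "-", "H", "X", "L", "V",
                "M", "L", "X", "P", "V"), 3, byrow = TRUE)

# Entropy of every column
h <- apply(aln, 2, column_entropy)

# Joint entropies stored in the lower triangle; other cells are left NA
n <- ncol(aln)
h2 <- matrix(NA_real_, n, n)
for (i in 2:n) {
  for (j in 1:(i - 1)) {
    h2[i, j] <- joint_entropy(aln[, i], aln[, j])
  }
}
```

In this sketch, a column with a 2:1 split of characters such as column 2 (H, H, L) has entropy log2(3) - 2/3, roughly 0.918 bits, while a fully conserved column such as column 3 has entropy 0. Columns 1 and 2 together show three distinct row pairs, so their joint entropy is log2(3), roughly 1.585 bits.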

References

Shannon, C. E. (1948) A mathematical theory of communication. The Bell System Technical Journal 27, 379--423.

Examples

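# Toy alignment: 3 sequences (rows) by 5 positions (columns)
aln <- matrix(c("M", "H", "X", "P", "V",
                "-", "H", "X", "L", "V",
                "M", "L", "X", "P", "V"), 3, byrow = TRUE)
h1 <- get.entropy(aln, bool = TRUE, gapchar = "-")  # per-column entropy, ignoring "-"
h2 <- get.entropy(aln)                              # per-column entropy with default settings
h3 <- get.entropy2p(aln)                            # joint entropies of column pairs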