
BioPhysConnectoR (version 1.6-10)

get.entropy: Compute the Sequence Entropy for an Alignment

Description

Computes the Shannon entropy of each column of a sequence alignment. Characters such as gap symbols can optionally be omitted from the computation. The joint entropy of column pairs is computed with get.entropy2p().

Usage

get.entropy(aln, bool = FALSE, gapchar = "NOGAPCHAR",
            verbose = FALSE)

get.entropy2p(aln, bool = FALSE, gapchar = "NOGAPCHAR", verbose = FALSE)

Arguments

aln
alignment matrix
bool
logical, if TRUE gaps are ignored when computing the entropy of each column of the alignment
gapchar
character vector containing the unique set of characters representing gaps in the amino acid sequence
verbose
logical, TRUE for getting output messages

Value

get.entropy() returns a vector containing the entropy of each column. get.entropy2p() returns a matrix containing the joint entropies in the lower triangle.

Details

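The Shannon (1948) entropy of an alignment column $X$ is computed as follows:

$$H(X)=-\sum_{x\in X} p(x)\cdot\log_2(p(x))$$

The joint entropy is computed for every possible column pair:

$$H(X,Y)=-\sum_{x\in X}\sum_{y\in Y} p(x,y)\cdot\log_2(p(x,y))$$

where $X$ and $Y$ are two columns of the alignment, $p(x)$ is the relative frequency of character $x$ in column $X$, and $p(x,y)$ is the relative frequency of the character pair $(x,y)$ across the rows of columns $X$ and $Y$.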
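The two formulas can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper names column_entropy() and joint_entropy() are hypothetical, and the handling of omitted characters and of the diagonal of the joint matrix are assumptions made for illustration only.

```r
# Hypothetical helpers, not part of BioPhysConnectoR.

# Entropy of one alignment column; characters listed in `omit` are dropped first.
column_entropy <- function(col, omit = character(0)) {
  col <- col[!(col %in% omit)]
  if (length(col) == 0) return(0)          # assumption: empty column has entropy 0
  p <- table(col) / length(col)            # relative frequencies p(x)
  -sum(p * log2(p))
}

# Joint entropy of two columns, based on the row-wise character pairs.
joint_entropy <- function(x, y) {
  p <- table(paste(x, y)) / length(x)      # relative frequencies p(x, y)
  -sum(p * log2(p))
}

aln <- matrix(c("M", "H", "X", "P", "V",
                "-", "H", "X", "L", "V",
                "M", "L", "X", "P", "V"), 3, byrow = TRUE)

# Per-column entropies, counting "-" as an ordinary character.
h <- apply(aln, 2, column_entropy)

# Joint entropies for all column pairs, stored in the lower triangle
# (the diagonal is filled here as well, which is a choice of this sketch).
n <- ncol(aln)
H2 <- matrix(0, n, n)
for (i in seq_len(n)) {
  for (j in seq_len(i)) {
    H2[i, j] <- joint_entropy(aln[, i], aln[, j])
  }
}
```

In this sketch, a column with two identical characters and one differing character, such as the first one, has entropy $\log_2 3 - 2/3 \approx 0.918$, while a fully conserved column has entropy 0. Passing omit = "-" to column_entropy() for the first column leaves only "M", "M", so its entropy drops to 0.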

References

Shannon (1948) The Bell System Technical Journal 27, 379–423.

See Also

get.mie

Examples

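library(BioPhysConnectoR)

# 3 x 5 alignment matrix, filled row by row
aln <- matrix(c("M", "H", "X", "P", "V",
                "-", "H", "X", "L", "V",
                "M", "L", "X", "P", "V"), 3, byrow = TRUE)
# Column entropies, ignoring the gap character "-"
h1 <- get.entropy(aln, bool = TRUE, gapchar = "-")
# Column entropies with the default settings (no characters omitted)
h2 <- get.entropy(aln)
# Joint entropies for all column pairs
h3 <- get.entropy2p(aln)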
