The seqient function returns the Shannon entropy of each sequence in seqdata. The entropy of a sequence is computed using the formula
$$h(\pi_1,\ldots,\pi_s)=-\sum_{i=1}^{s}\pi_i\log \pi_i$$
where $s$ is the size of the alphabet and $\pi_i$ the proportion of occurrences of the $i$th state in the considered sequence. The logarithm is the natural logarithm, i.e., the logarithm in base $e$. The entropy can be interpreted as the `uncertainty' of predicting the states in a given sequence. If all states in the sequence are the same, the entropy is equal to 0. The maximum entropy for a sequence of length 12 with an alphabet of 4 states is $\log 4 \approx 1.386294$ and is attained when each of the four states appears 3 times.
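To make the formula concrete, here is a minimal sketch (not the seqient implementation itself) that computes the entropy of a single sequence by hand; the helper name shannon is made up for illustration.

shannon <- function(seq) {
  p <- table(seq) / length(seq)  ## proportions of the states observed in seq
  -sum(p * log(p))               ## Shannon entropy, natural logarithm
}
## Each of the 4 states appearing 3 times in a length-12 sequence
## attains the maximum log(4) = 1.386294.
shannon(rep(c("A", "B", "C", "D"), times = 3))  ## 1.386294
shannon(rep("A", 12))                           ## all states identical: 0

Note that table only counts the states actually present in the sequence, so the sum never includes a $0 \log 0$ term.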
Normalization can be requested with the norm=TRUE option, in which case the returned value is the entropy divided by the entropy of the alphabet, i.e., $\log s$. The latter is an upper bound for the entropy of sequences made from this alphabet. It equals the maximal entropy actually attainable by a sequence exactly when the sequence length is a multiple of the alphabet size. The value of the normalized entropy is independent of the chosen logarithm base.
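As a short usage sketch, assuming the TraMineR package is loaded and using toy sequences made up for illustration:

library(TraMineR)
## Three length-12 toy sequences over the alphabet {A, B, C, D}.
raw <- rbind(rep("A", 12),
             rep(c("A", "B", "C", "D"), times = 3),
             c(rep("A", 6), rep("B", 6)))
seqdata <- seqdef(raw)
seqient(seqdata, norm = FALSE)  ## absolute entropies: 0, log(4), log(2)
seqient(seqdata, norm = TRUE)   ## divided by log(4): 0, 1, 0.5

With norm=TRUE the three sequences score 0, 1, and 0.5 whatever logarithm base is used, since the base cancels in the ratio.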