dif: Difference score

Description

Measures the difference score between two aligned amino acid or nucleotide sequences.

Usage

dif(seq1, seq2, gap = FALSE, aa.strict = FALSE)

Arguments

seq1

a character vector representing a first sequence.

seq2

a character vector representing a second sequence.

gap

a boolean indicating whether the gap character should be taken as a supplementary symbol (TRUE) or not (FALSE). Default is FALSE.

aa.strict

a boolean indicating whether only strict amino acids should be taken into account (TRUE) or not (FALSE). Default is FALSE.

Value

A single numeric value representing the difference score.

Details

The difference score between two aligned sequences is given by the proportion of sites that differs and is equivalent to $1 - {PID}$ (percent identity). dif is given by the number of aligned positions (sites) whose symbols differ, divided by the number of aligned positions. dif is equivalent to the p distance defined by Nei and Zhang (2006). In dif, positions with at least one gap can be excluded (gap = FALSE). When gaps are taken as a supplementary symbol (gap = TRUE), sites with gaps in both sequences are excluded.

From Nei and Zhang (2006), the p distance, which is the proportion of sites that differ between two sequences, is estimated by:

$${p} = \frac{n_d}{n},$$

where n is the number of sites and $n_d$ is the number of sites with different symbols.

The difference score ranges from 0, for identical sequences, to 1, for completely different sequences.

References

May AC (2004) Percent sequence identity: the need to be explicit. Structure 12:737-738.

Nei M and Zhang J (2006) Evolutionary Distance: Estimation. Encyclopedia of Life Sciences.

Nei M and Kumar S (2000) Molecular Evolution and Phylogenetics. Oxford University Press, New York.

Examples

Run this code

# NOT RUN {
# calculating the difference score between the sequences 
# of CLTR1_HUMAN and CLTR2_HUMAN:
aln <- import.fasta(system.file("msa/human_gpcr.fa", package = "bios2mds"))
dif <- dif(aln$CLTR1_HUMAN, aln$CLTR2_HUMAN)
dif
# }

Run the code above in your browser using DataLab