The difference score between two aligned sequences is the proportion of sites that differ. It is equivalent to \(1 - {PID}\), where PID is the percent identity expressed as a fraction.
dif is given by the number of aligned positions (sites) whose symbols differ, divided by the number of aligned positions. dif is equivalent to the p distance defined by Nei and Zhang (2006).
In dif, positions with at least one gap can be excluded (gap = FALSE). When gaps are taken as a supplementary symbol (gap = TRUE), sites with gaps in both sequences are excluded.
From Nei and Zhang (2006), the p distance, which is the proportion of sites that differ between
two sequences, is estimated by:
$${p} = \frac{n_d}{n},$$
where \(n\) is the number of sites compared and \(n_d\) is the number of sites with different symbols.
The difference score ranges from 0, for identical sequences, to 1, for completely
different sequences.
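
The definition above can be illustrated with a minimal Python sketch. It is not the package's implementation: the function name difference_score, the gap character "-", the case-sensitive comparison, and the error raised when no comparable site remains are all assumptions made for illustration. Only the two gap modes follow the description above.

```python
def difference_score(seq1: str, seq2: str, gap: bool = False,
                     gap_char: str = "-") -> float:
    """Proportion of compared sites whose symbols differ (p = n_d / n).

    Hypothetical helper for illustration; gap_char and the error
    handling are assumptions, not taken from the documented function.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned (equal length)")

    n = 0    # number of sites compared
    n_d = 0  # number of compared sites with different symbols
    for a, b in zip(seq1, seq2):
        a_gap = a == gap_char
        b_gap = b == gap_char
        if a_gap and b_gap:
            # Gap in both sequences: excluded in either mode.
            continue
        if not gap and (a_gap or b_gap):
            # gap = FALSE: skip sites with at least one gap.
            continue
        n += 1
        if a != b:
            # With gap = TRUE, a gap facing a residue counts as a difference.
            n_d += 1

    if n == 0:
        # Assumption: treat "no comparable site" as an error.
        raise ValueError("no comparable sites")
    return n_d / n


s1 = "ACG-TA-"
s2 = "ACTGTT-"
score_no_gap = difference_score(s1, s2, gap=False)   # 2 differences over 5 sites
score_with_gap = difference_score(s1, s2, gap=True)  # 3 differences over 6 sites
print(score_no_gap, score_with_gap)
```

In this example the last site has a gap in both sequences and is dropped in both modes. The fourth site has a gap in one sequence only, so it is skipped when gaps are excluded and counted as a difference when gaps are treated as a supplementary symbol.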