Plot same writer and different writer reference similarity scores from a
validation set. The similarity scores range from 0 to 1, inclusive. The
interval from 0 to 1 is split into n_bins bins. The proportion of scores in
each bin is calculated and plotted. Optionally, a vertical dotted line may be
plotted at an observed similarity score.
A dataframe of scores calculated with get_ref_scores()
obs_score
Optional. A similarity score calculated with calculate_slr()
n_bins
The number of bins
Details
The methods used in this package typically produce many times more different
writer scores than same writer scores. For example, ref_scores contains
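79,600 different writer scores but only 200 same writer scores. Histograms,
which show the frequency (raw count) of scores in each bin, don't handle this
class imbalance well because the same writer counts are dwarfed by the
different writer counts. Instead, the rate of scores is plotted, that is, the
proportion of scores that fall in each bin.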
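
The difference between frequency and rate can be sketched in base R. This is a minimal illustration, not the package's implementation: the scores are simulated with rbeta() rather than taken from ref_scores, the helper bin_rate() is hypothetical, and the sketch assumes the rate is computed separately for each group of scores.

```r
# Illustrative data only: simulated scores, not output from the package.
set.seed(1)
same_writer <- rbeta(200, 5, 2)      # 200 simulated same writer scores
diff_writer <- rbeta(79600, 2, 5)    # 79,600 simulated different writer scores

n_bins <- 20
breaks <- seq(0, 1, length.out = n_bins + 1)

# Hypothetical helper: count the scores in each bin, then divide by the
# number of scores in the group to turn counts into rates.
bin_rate <- function(x, breaks) {
  counts <- table(cut(x, breaks = breaks, include.lowest = TRUE))
  as.numeric(counts) / length(x)
}

same_rate <- bin_rate(same_writer, breaks)
diff_rate <- bin_rate(diff_writer, breaks)

# The group sizes differ by a factor of 398, but each set of rates sums
# to 1, so the two groups can be compared on the same scale.
sum(same_rate)
sum(diff_rate)
```

With raw counts, the tallest different writer bin would hold thousands of scores while no same writer bin could exceed 200. Dividing by each group's size removes that difference in scale.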