The function requires that the factors have exactly the same levels.
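As a minimal sketch of the level requirement, the following base R snippet checks whether two factors share the same levels and, if not, re-declares both over one explicit set. The vectors `pred` and `ref` and the level names are made up for illustration.

```r
# Hypothetical predictions and reference labels; the predictions happen to
# contain only one of the two classes, so the default levels differ.
pred <- factor(c("yes", "yes", "yes"))
ref  <- factor(c("yes", "no", "yes"))

levels_match_before <- identical(levels(pred), levels(ref))

# Re-declare both factors over the same, explicitly ordered set of levels.
lvls <- c("yes", "no")
pred <- factor(pred, levels = lvls)
ref  <- factor(ref,  levels = lvls)

levels_match_after <- identical(levels(pred), levels(ref))

# The cross-tabulation is now a full 2x2 table, including the empty row.
tab <- table(Predicted = pred, Reference = ref)
```

Setting the levels explicitly also fixes their order, which matters when the first level is treated as the "event".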
For two-class problems, the sensitivity, specificity, positive predictive
value and negative predictive value are calculated using the positive
argument. Also reported are the prevalence of the "event" (computed from
the data unless passed in as an argument), the detection rate (the rate of
true events also predicted to be events) and the detection prevalence (the
prevalence of predicted events).
Suppose a 2x2 table with notation

              Reference
Predicted     Event     No Event
Event         A         B
No Event      C         D

The formulas used here are:
$$Sensitivity = A/(A+C)$$
$$Specificity = D/(B+D)$$
$$Prevalence = (A+C)/(A+B+C+D)$$
$$PPV = (sensitivity * prevalence)/((sensitivity * prevalence) + ((1 - specificity) * (1 - prevalence)))$$
$$NPV = (specificity * (1 - prevalence))/(((1 - sensitivity) * prevalence) + (specificity * (1 - prevalence)))$$
$$Detection Rate = A/(A+B+C+D)$$
$$Detection Prevalence = (A+B)/(A+B+C+D)$$
$$Balanced Accuracy = (sensitivity + specificity)/2$$
$$Precision = A/(A+B)$$
$$Recall = A/(A+C)$$
$$F1 = (1 + beta^2) * precision * recall/((beta^2 * precision) + recall)$$
where beta = 1 for this function.
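The formulas can be checked by hand with a few lines of base R. This is only a sketch: the cell counts `A`, `B`, `C`, `D` are hypothetical and the variable names are chosen for illustration, not taken from the function's source.

```r
# Hypothetical cell counts in the A/B/C/D notation of the 2x2 table.
A <- 40; B <- 10; C <- 5; D <- 45
n <- A + B + C + D

sensitivity <- A / (A + C)
specificity <- D / (B + D)
prevalence  <- (A + C) / n   # computed from the data in this sketch

ppv <- (sensitivity * prevalence) /
  ((sensitivity * prevalence) + ((1 - specificity) * (1 - prevalence)))
npv <- (specificity * (1 - prevalence)) /
  (((1 - sensitivity) * prevalence) + (specificity * (1 - prevalence)))

detection_rate       <- A / n
detection_prevalence <- (A + B) / n
balanced_accuracy    <- (sensitivity + specificity) / 2

precision <- A / (A + B)
recall    <- A / (A + C)
beta <- 1
f1 <- (1 + beta^2) * precision * recall / ((beta^2 * precision) + recall)
```

When the prevalence is taken from the table itself, PPV reduces to A/(A+B) and NPV to D/(C+D); the prevalence-based form only gives a different answer when another prevalence is supplied.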
See the references for discussions of the first five formulas.
For more than two classes, these results are calculated comparing each
factor level to the remaining levels (i.e. a "one versus all" approach).
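The "one versus all" reduction can be sketched in base R as follows. The three-class table and the helper `one_vs_all` are hypothetical, and rows are assumed to hold predictions and columns the reference classes.

```r
# Hypothetical three-class table: rows are predictions, columns are reference.
tab <- matrix(c(10,  2, 1,
                 3, 12, 2,
                 1,  1, 8),
              nrow = 3, byrow = TRUE,
              dimnames = list(Predicted = c("a", "b", "c"),
                              Reference = c("a", "b", "c")))

# Collapse the table to A/B/C/D for one class versus all remaining classes.
one_vs_all <- function(tab, cls) {
  A <- tab[cls, cls]             # predicted cls, truly cls
  B <- sum(tab[cls, ]) - A       # predicted cls, truly another class
  C <- sum(tab[, cls]) - A       # predicted another class, truly cls
  D <- sum(tab) - A - B - C      # neither predicted as cls nor truly cls
  c(sensitivity = A / (A + C), specificity = D / (B + D))
}

# One row of statistics per class.
per_class <- t(sapply(rownames(tab), one_vs_all, tab = tab))
```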
The overall accuracy and unweighted Kappa statistic are calculated. A
p-value from McNemar's test is also computed using mcnemar.test (which can
produce NA values with sparse tables).
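A rough base R sketch of these two quantities is given below. The Kappa computation uses the standard unweighted Cohen's Kappa formula, which is assumed here rather than confirmed from the function's source; the tables are hypothetical.

```r
# Hypothetical 2x2 table: rows are predictions, columns are reference.
tab <- matrix(c(40, 10,
                 5, 45),
              nrow = 2, byrow = TRUE,
              dimnames = list(Predicted = c("Event", "No Event"),
                              Reference = c("Event", "No Event")))
n <- sum(tab)

# Unweighted Kappa: observed agreement corrected for chance agreement.
observed   <- sum(diag(tab)) / n
expected   <- sum(rowSums(tab) * colSums(tab)) / n^2
kappa_stat <- (observed - expected) / (1 - expected)

# McNemar's test compares the two off-diagonal (disagreement) cells.
mcnemar_p <- mcnemar.test(tab)$p.value

# A sparse table with no disagreements leaves the test statistic undefined,
# so the p-value comes back missing.
sparse_tab <- matrix(c(5, 0,
                       0, 5), nrow = 2, byrow = TRUE)
sparse_p <- suppressWarnings(mcnemar.test(sparse_tab)$p.value)
```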
The overall accuracy rate is computed along with a 95 percent confidence
interval for this rate (using binom.test) and a one-sided test to see if
the accuracy is better than the "no information rate," which is taken to
be the largest class percentage in the data.
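The accuracy interval and the comparison with the no information rate might be reproduced along these lines. This is a sketch under assumptions: the table is hypothetical, and the "largest class percentage" is taken from the reference (column) totals.

```r
# Hypothetical 2x2 table: rows are predictions, columns are reference.
tab <- matrix(c(40, 10,
                 5, 45),
              nrow = 2, byrow = TRUE,
              dimnames = list(Predicted = c("Event", "No Event"),
                              Reference = c("Event", "No Event")))
n         <- sum(tab)
n_correct <- sum(diag(tab))
accuracy  <- n_correct / n

# Exact binomial 95 percent confidence interval for the accuracy.
acc_ci <- binom.test(n_correct, n, conf.level = 0.95)$conf.int

# No information rate: the largest class proportion among the reference
# labels (assumed here to be the column totals).
nir <- max(colSums(tab)) / n

# One-sided test of H0: accuracy <= NIR against H1: accuracy > NIR.
p_acc_gt_nir <- binom.test(n_correct, n, p = nir,
                           alternative = "greater")$p.value
```

A small p-value here indicates that the model does better than always predicting the most frequent class.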