A subset of labels used in the catalogue of the DNB along with their
frequencies of occurrence. The label_ids match those in the
dnb_gold_standard and dnb_test_predictions datasets.
dnb_label_distribution
A data frame with 7,772 rows and 3 columns:
label_id: DNB identifier of a concept in the GND subject vocabulary.
label_freq: Number of occurrences of the specified label in the overall catalogue.
n_docs: Overall number of documents in the ground truth dataset.
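As a rough illustration of how a table with these three columns might be inspected, the sketch below builds a small stand-in data frame and ranks its labels by frequency. The identifiers and counts are invented for the example and are not values from dnb_label_distribution. Repeating the same n_docs value on every row is also an assumption.

```r
# Toy stand-in with the same three columns as dnb_label_distribution.
# All values are invented for illustration only.
label_distribution <- data.frame(
  label_id   = c("id-a", "id-b", "id-c"),
  label_freq = c(120L, 15L, 3L),
  n_docs     = c(1000L, 1000L, 1000L),  # assumed constant across rows
  stringsAsFactors = FALSE
)

# Order labels from most to least frequent.
ranked <- label_distribution[order(-label_distribution$label_freq), ]

# Share of all label occurrences in this table accounted for by each label.
ranked$share <- ranked$label_freq / sum(ranked$label_freq)

print(ranked)
```

The same steps should carry over to the real data frame once it is loaded, since they rely only on the label_freq column.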