udpipe (version 0.3)

udpipe_accuracy: Evaluate the accuracy of your UDPipe model on holdout data

Description

Compute precision, recall and F1 measures for words / sentences / upos / xpos / features annotation, as well as UAS and LAS dependency scores, on holdout data in CoNLL-U format.
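As a rough illustration of what these measures mean (not of how UDPipe computes them internally), the sketch below works through precision, recall and F1 for a set of predicted token spans, and UAS / LAS for a toy dependency parse. All data and the helper name prf are made up for this example.

```r
# Toy illustration only: hypothetical gold and predicted token spans,
# written as "start-end" character offsets.
gold_tokens   <- c("0-3", "4-9", "10-12", "13-18")
system_tokens <- c("0-3", "4-9", "10-18")

# prf() is a hypothetical helper, not part of the udpipe package.
prf <- function(system, gold) {
  correct   <- length(intersect(system, gold))   # spans found in both sets
  precision <- correct / length(system)          # share of predictions that are right
  recall    <- correct / length(gold)            # share of gold items that were found
  f1        <- 2 * precision * recall / (precision + recall)
  c(precision = precision, recall = recall, f1 = f1)
}
token_scores <- prf(system_tokens, gold_tokens)
token_scores

# Toy dependency parse over four words: head index and relation per word.
gold   <- data.frame(head = c(2, 0, 4, 2),
                     deprel = c("nsubj", "root", "det", "obj"),
                     stringsAsFactors = FALSE)
parsed <- data.frame(head = c(2, 0, 4, 2),
                     deprel = c("nsubj", "root", "amod", "obj"),
                     stringsAsFactors = FALSE)

# UAS: share of words that received the correct head.
uas <- mean(parsed$head == gold$head)
# LAS: share of words with both the correct head and the correct relation label.
las <- mean(parsed$head == gold$head & parsed$deprel == gold$deprel)
c(UAS = uas, LAS = las)
```

Here every head is correct but one label is wrong, so the labelled score is lower than the unlabelled one.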

Usage

udpipe_accuracy(object, file_conllu, tokenizer = c("default", "none"),
  tagger = c("default", "none"), parser = c("default", "none"))

Arguments

object

an object of class udpipe_model as returned by udpipe_load_model

file_conllu

the full path to a file on disk containing holdout data in CoNLL-U format

tokenizer

a character string of length 1, either 'default' (evaluate the tokenizer of the model with its default settings) or 'none' (do not evaluate the tokenizer)

tagger

a character string of length 1, either 'default' (evaluate the tagger of the model with its default settings) or 'none' (do not evaluate the tagger)

parser

a character string of length 1, either 'default' (evaluate the parser of the model with its default settings) or 'none' (do not evaluate the parser)
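The Usage signature lists both allowed values as the default of each of these arguments, which is the usual R idiom resolved with match.arg. The sketch below shows how that idiom behaves. The function resolve_choice is a hypothetical stand-in, and whether udpipe_accuracy resolves its arguments exactly this way internally is an assumption.

```r
# resolve_choice() is a hypothetical stand-in, not a udpipe function.
resolve_choice <- function(tokenizer = c("default", "none")) {
  # match.arg() picks the first listed value when the argument is omitted
  # and raises an error for anything outside the listed values.
  match.arg(tokenizer)
}

omitted  <- resolve_choice()        # "default"
explicit <- resolve_choice("none")  # "none"

# Any other value is rejected with an error.
rejected <- inherits(try(resolve_choice("other"), silent = TRUE), "try-error")
```

Under this idiom, leaving an argument out is the same as passing 'default'.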

Value

a list with the following elements

  • accuracy: A character vector with accuracy metrics.

  • error: A character string with possible errors when calculating the accuracy metrics
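Since the result carries both the metrics and an error message, it can be worth checking the error element before reading accuracy. The sketch below is one way to do that. The helper accuracy_or_stop and the mock result lists are hypothetical, and it assumes that an empty error string means the evaluation succeeded, which this page does not state.

```r
# accuracy_or_stop() is a hypothetical helper, not part of udpipe.
# Assumption: `error` is an empty string (or absent) when evaluation succeeded.
accuracy_or_stop <- function(metrics) {
  err <- metrics$error
  if (length(err) > 0 && any(nzchar(err))) {
    stop("udpipe_accuracy reported: ", paste(err, collapse = "; "))
  }
  metrics$accuracy
}

# Mock results standing in for the output of udpipe_accuracy().
ok  <- list(accuracy = c("metric line 1", "metric line 2"), error = "")
bad <- list(accuracy = character(0), error = "cannot open file")

lines_ok <- accuracy_or_stop(ok)   # returns the two mock metric lines
failed   <- inherits(try(accuracy_or_stop(bad), silent = TRUE), "try-error")
```

With a real model, the same helper would be called as accuracy_or_stop(udpipe_accuracy(object, file_conllu)).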

References

https://ufal.mff.cuni.cz/udpipe, http://universaldependencies.org/format.html

See Also

udpipe_load_model

Examples

# NOT RUN {
## download the Dutch LassySmall model and load it
x <- udpipe_download_model(language = "dutch-lassysmall")
ud_dutch <- udpipe_load_model(x$file_model)

## dummy holdout data in CoNLL-U format which is shipped with the package
file_conllu <- system.file(package = "udpipe", "dummydata", "traindata.conllu")

## evaluate tokenizer, tagger and parser (all arguments at 'default')
metrics <- udpipe_accuracy(ud_dutch, file_conllu)
metrics$accuracy
## evaluate the tagger and the parser, not the tokenizer
metrics <- udpipe_accuracy(ud_dutch, file_conllu, 
                           tokenizer = "none", tagger = "default", parser = "default")
metrics$accuracy
## evaluate the parser only
metrics <- udpipe_accuracy(ud_dutch, file_conllu, 
                           tokenizer = "none", tagger = "none", parser = "default")
metrics$accuracy
## evaluate the tokenizer only
metrics <- udpipe_accuracy(ud_dutch, file_conllu, 
                           tokenizer = "default", tagger = "none", parser = "none")
metrics$accuracy

## cleanup for CRAN only - you probably want to keep your model if you have downloaded it
file.remove(x$file_model)
# }
