
nametagger (version 0.1.7)

write_nametagger: Save a tokenised dataset as nametagger train data

Description

Save a tokenised dataset as nametagger train data

Usage

write_nametagger(x, file = tempfile(fileext = ".txt", pattern = "nametagger_"))

Value

Invisibly returns an object of class nametagger_traindata, which is a list with the following elements:

  • data: a character vector of text in the nametagger format

  • file: the path to the file where the data is saved

Arguments

x

a tokenised data.frame with the columns doc_id, sentence_id and token, containing 1 row per token.
In addition, it can have the columns lemma and pos, holding the lemma and the part-of-speech tag of each token.

file

the path to the file where the training data will be saved
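
As a minimal sketch of the input layout described for x, the following base R code builds a small tokenised data.frame by hand and checks that the documented columns are present. The document id, sentences, lemmas and tags are made-up illustration values, not data from the package.

```r
# Hypothetical toy input: two sentences from one document, one row per token.
x <- data.frame(
  doc_id      = "doc1",
  sentence_id = c(1L, 1L, 1L, 2L, 2L),
  token       = c("Anna", "visited", "Ghent", "She", "left"),
  # Optional columns: lemma and part-of-speech tag of each token.
  lemma       = c("Anna", "visit", "Ghent", "she", "leave"),
  pos         = c("PROPN", "VERB", "PROPN", "PRON", "VERB"),
  stringsAsFactors = FALSE
)

# Check that the columns named in the Arguments section are all present.
required <- c("doc_id", "sentence_id", "token")
has_required <- all(required %in% colnames(x))

# Count tokens per sentence as a quick sanity check on the layout.
tokens_per_sentence <- table(x$sentence_id)
```

The sketch covers only the columns listed above. Comparing it against str(europeananews) shows which further columns the bundled example data carries, which is worth doing before passing a hand-built data.frame to write_nametagger.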

Examples

data(europeananews)
x <- subset(europeananews, doc_id %in% "enp_NL.kb.bio")
x <- head(x, n = 250)

# Write to a temporary file; use a path such as "traindata.txt" to keep the output
path <- tempfile("traindata_", fileext = ".txt")
bio  <- write_nametagger(x, file = path)
str(bio)

# Clean up the file written above
file.remove(path)
