textcat(x, p = ECIMCI_profiles, method = "CT")
as.character
.textcat_profile_db
). For each given text, its n-gram profile is computed using the options
in the reference profile db. Then, the distance between the profile
and the reference profiles is computed, and the text is categorized
into the category of the closest profile (if this is not unique, NA is obtained).
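The steps described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper names `ngram_profile`, `out_of_place` and `categorize`, the n-gram sizes, the profile size, and the toy reference texts are all assumptions made for illustration, and the distance used is a simple rank-based "out-of-place" measure in the spirit of Cavnar and Trenkle.

```r
# Hypothetical helper: character n-gram profile of a single text,
# i.e. its n-grams ordered by decreasing frequency (sizes are assumptions).
ngram_profile <- function(text, n = 1:3, size = 300L) {
    chars <- strsplit(tolower(text), "")[[1L]]
    grams <- unlist(lapply(n, function(k) {
        if (length(chars) < k) return(character())
        starts <- seq_len(length(chars) - k + 1L)
        vapply(starts,
               function(i) paste(chars[i:(i + k - 1L)], collapse = ""),
               character(1))
    }))
    counts <- sort(table(grams), decreasing = TRUE)
    head(names(counts), size)
}

# Hypothetical rank-based distance: sum of rank differences, with a fixed
# penalty for n-grams that do not occur in the reference profile.
out_of_place <- function(prof, ref) {
    pos <- match(prof, ref)
    penalty <- length(ref)
    as.numeric(sum(ifelse(is.na(pos), penalty, abs(pos - seq_along(prof)))))
}

# Assign each text to the closest reference profile; NA if the minimum
# distance is attained by more than one category.
categorize <- function(texts, refs) {
    vapply(texts, function(text) {
        prof <- ngram_profile(text)
        d <- vapply(refs, function(ref) out_of_place(prof, ref), numeric(1))
        best <- names(d)[d == min(d)]
        if (length(best) == 1L) best else NA_character_
    }, character(1), USE.NAMES = FALSE)
}

# Toy reference profiles built from one short text per category.
en <- "this is an english sentence about a dog in the garden"
de <- "das ist ein deutscher satz ueber einen hund im garten"
refs <- list(english = ngram_profile(en), german = ngram_profile(de))

categorize(c(en, de), refs)
```

Real reference profiles are built from much larger samples per category; with profiles as small as these, the result for unseen text is not reliable.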
Unless the profile db uses bytes rather than characters, the texts in x should be encoded in UTF-8.
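If the input is not already in UTF-8, it can be converted with base R before categorization. The sketch below assumes the text arrives as Latin-1; the sample string and variable names are illustrative only.

```r
# A Latin-1 string, written with an explicit byte escape so the example
# does not depend on the encoding of the source file.
x <- "Das ist ein sch\xf6ner Satz."
Encoding(x) <- "latin1"

# Convert to UTF-8 using the declared encoding.
x_utf8 <- enc2utf8(x)

Encoding(x_utf8)
validUTF8(x_utf8)
```

When the source encoding is known but not declared on the strings, `iconv(x, from = "latin1", to = "UTF-8")` is an alternative.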
library(textcat)
## Categorize one English and one German sentence.
textcat(c("This is an english sentence.",
          "Das ist ein deutscher satz."))