Many, if not all, large language models are biased toward English
terms and sentence constructions. This function performs a quick check with
cld2 over every element of a character vector
and returns whether it is mostly (75%) English.
Usage
check_lang(strings, detailed = FALSE)
Value
logical. If TRUE, the language of the string is mostly English.
If detailed is TRUE, a list is returned instead for the full document.
Arguments
strings
character. Vector of strings containing document sentences.
detailed
logical. If TRUE, the full cld2 report is returned as well.
Examples
# English
check_lang("Species Macrothele calpeiana is found in Alentejo.")
# Portuguese
check_lang("A espécie Macrothele calpeiana é encontrada no Alentejo.")
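The kind of check described above can be approximated with a few lines of R. The following is only a minimal sketch, not the package's actual implementation: the helper name mostly_english, its threshold and detect parameters, and the choice of an inclusive (>=) comparison at 0.75 are all assumptions made for illustration. By default it relies on cld2::detect_language(), which is assumed here to return one ISO language code (such as "en") per element and NA where detection is unreliable.

```r
# Hypothetical helper for illustration; not part of the documented package.
# `detect` is any function mapping a character vector to language codes.
# The default, cld2::detect_language, is only evaluated if no other
# detector is supplied.
mostly_english <- function(strings, threshold = 0.75,
                           detect = cld2::detect_language) {
  # Nothing to classify: return NA rather than guessing.
  if (length(strings) == 0) {
    return(NA)
  }
  # One language code per element (assumed ISO codes such as "en").
  codes <- detect(strings)
  # Share of elements detected as English; unreliable (NA) detections
  # are counted as non-English.
  share <- mean(!is.na(codes) & codes == "en")
  # Assumption: "mostly" means at least `threshold` of the elements.
  share >= threshold
}
```

Calling mostly_english(sentences) on a character vector of document sentences returns a single logical. Passing the detector as an argument keeps the thresholding logic separate from cld2, so it can be exercised with a stub detector.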