franc (version 1.1.2)

franc: Detect the language of a string

Description

Detect the language of a string

Usage

franc(text, min_speakers = 1e+06, whitelist = NULL, blacklist = NULL,
  min_length = 10, max_length = 2048)

Arguments

text

A string constant. It should be at least min_length characters long (10 by default). Only the first max_length characters (2048 by default) are used, to keep detection reasonably fast.

min_speakers

Only languages with at least this many speakers are checked. The default is one million. Set it to zero to include all languages known to franc. See also speakers.

whitelist

List of three-letter language codes to check against.

blacklist

List of three-letter language codes not to check against.

min_length

Minimum number of characters required in the text.

max_length

Maximum number of characters used from the text. By default only the first 2048 characters are used.

Value

A three-letter ISO 639-3 language code for the detected language of the text. "und" is returned if the input is too short.

See Also

franc_all for scores against many languages; speakers.

Examples

## afr
franc("Alle menslike wesens word vry")

## nno
franc("Alle mennesker er født frie og")

## Too short, und
franc("the")

## You can change what's too short (default: 10), sco
franc("the", min_length = 3)
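
## A sketch, not part of the original examples: narrowing the candidate
## languages with whitelist / blacklist. Passing the codes as character
## vectors is an assumption. No expected results are shown, because they
## depend on franc's language data.
franc("Alle menslike wesens word vry", whitelist = c("afr", "nld"))
franc("Alle menslike wesens word vry", blacklist = "afr")

## Scores against many languages (see franc_all); assumes it takes the
## text as its first argument, like franc
head(franc_all("Alle menslike wesens word vry"))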