Create a list of tokens
Usage

    as_tokens(
      tbl,
      token_field = "token",
      pos_field = get_dict_features()[1],
      nm = NULL
    )
Arguments

tbl
    A tibble of tokens, as returned by tokenize().

token_field
    <data-masked> Column containing tokens.

pos_field
    <data-masked> Column containing features that will be kept as the names of tokens. Pass NULL to this argument if you don't need them.

nm
    Names of the returned list. If NULL, the "doc_id" field of tbl is used instead.

Value

A named list of tokens.
Examples

    if (FALSE) {
      tokenize(
        data.frame(
          doc_id = seq_along(5:8),
          text = ginga[5:8]
        )
      ) |>
        prettify(col_select = "POS1") |>
        as_tokens()
    }
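A minimal sketch of how the pieces fit together, assuming the gibasa package and a configured MeCab dictionary are available (the input text and the shape of the printed result are illustrative, not verified output):

    # Hypothetical illustration: as_tokens() collapses a prettified token
    # data frame into a plain named list, one element per document.
    library(gibasa)  # assumed package providing tokenize()/prettify()/as_tokens()

    res <- tokenize(
      data.frame(doc_id = "a", text = "銀河鉄道の夜")
    ) |>
      prettify(col_select = "POS1") |>
      as_tokens(pos_field = "POS1")

    # Expected shape (illustrative): a list named by doc_id, where each
    # element is a character vector of surface forms whose names come from
    # the POS1 feature column.
    str(res)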