
Create a list of tokens
Usage:

as_tokens(
  tbl,
  token_field = "token",
  pos_field = get_dict_features()[1],
  nm = NULL
)
Value:

A named list of tokens.
Arguments:

tbl: A tibble of tokens returned by tokenize().

token_field: <data-masked> Column containing tokens.

pos_field: Column containing features that will be kept as the names of tokens. Pass NULL if you do not need them.

nm: Names of the returned list. If NULL, the "doc_id" field of tbl is used instead.
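
To make the expected input concrete, here is a minimal sketch that builds a tibble by hand instead of calling tokenize() (which needs a working MeCab setup at run time). The column names mirror the defaults above; the tokens and POS1 tags are invented purely for illustration:

library(gibasa)
library(tibble)

# Hand-built stand-in for tokenize() output; tokenize()
# normally returns doc_id as a factor, so we do too.
toks <- tibble(
  doc_id = factor(c(1, 1, 2)),
  token = c("walk", "fast", "run"),
  POS1 = c("VERB", "ADV", "VERB")
)

# One list element per doc_id; each element is a character
# vector of tokens named by the pos_field column (POS1 here).
as_tokens(toks, token_field = "token", pos_field = "POS1")

# nm overrides the doc_id-based names of the returned list;
# its length is assumed to match the number of documents.
as_tokens(toks, token_field = "token", pos_field = "POS1", nm = c("a", "b"))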
Examples:

if (FALSE) {
  tokenize(
    data.frame(
      doc_id = seq_along(5:8),
      text = ginga[5:8]
    )
  ) |>
    prettify(col_select = "POS1") |>
    as_tokens()
}
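
The example is wrapped in if (FALSE) because tokenize() requires a MeCab dictionary at run time. Reusing the hand-built toks tibble from the sketch above, dropping the feature names is a matter of passing NULL, per the pos_field documentation (again a sketch, not verified output):

# Unnamed token vectors: pos_field = NULL skips attaching
# feature names to the tokens in each list element.
as_tokens(toks, token_field = "token", pos_field = NULL)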