finnsurveytext (version 2.0.0)

fst_format: Annotate open-ended survey responses in Finnish into CoNLL-U format

Description

Creates a dataframe in CoNLL-U format from a dataframe containing Finnish text, using the [udpipe] package and a Finnish language model. Any additional columns that are included, such as `weights` or columns specified through `add_cols`, are retained in the output.

Usage

fst_format(data, question, id, model = "ftb", weights = NULL, add_cols = NULL)

Value

Dataframe of annotated text in CoNLL-U format plus any additional columns.
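
To illustrate the general shape of such a return value, here is a minimal base-R sketch with made-up data. The `annotated` and `survey` dataframes and all their values are hypothetical, and the `merge` step is only an assumption about how per-response columns could end up alongside token-level annotation; it is not taken from the package's implementation. The column names `doc_id`, `token`, `lemma` and `upos` follow the convention of udpipe output.

```r
# Toy stand-in for token-level annotation: one row per token, keyed by doc_id.
# (Hypothetical values; a real annotation has more CoNLL-U columns.)
annotated <- data.frame(
  doc_id = c("1", "1", "2"),
  token  = c("kiva", "koulu", "kaverit"),
  lemma  = c("kiva", "koulu", "kaveri"),
  upos   = c("ADJ", "NOUN", "NOUN"),
  stringsAsFactors = FALSE
)

# Toy survey data: one row per response, with a weight and a covariate.
survey <- data.frame(
  fsd_id = c("1", "2"),
  paino  = c(0.8, 1.2),
  gender = c("F", "M"),
  stringsAsFactors = FALSE
)

# Attach the per-response columns to every token row of that response.
result <- merge(annotated, survey,
                by.x = "doc_id", by.y = "fsd_id", all.x = TRUE)

print(result)
```

Each token row carries the weight and covariate of the response it came from, which is what later per-group or weighted summaries would rely on.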

Arguments

data

A dataframe of survey responses which contains an open-ended question.

question

The column in the dataframe which contains the responses to the open-ended question.

id

The column in the dataframe which contains the ids for the responses.

model

A language model available for [udpipe]. `"ftb"` (default) or `"tdt"` are recognised as shorthand for "finnish-ftb" and "finnish-tdt". The full list is available in the [udpipe] documentation.

weights

Optional, the column of the dataframe which contains the respective weights for each response.

add_cols

Optional, a column (or columns) from the dataframe containing other information you'd like to retain (for instance, covariate columns for splitting the data for comparison plots).
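
The argument descriptions above suggest two small normalisation steps: expanding the `model` shorthand and accepting `add_cols` either as a character vector or as a single comma-separated string (as in the examples below). The following base-R sketch shows one way this could work. Both `resolve_model` and `parse_add_cols` are hypothetical helper names introduced here for illustration; they are not functions exported by finnsurveytext, and the package's own handling may differ.

```r
# Hypothetical helper: expand the shorthand model names to the full
# udpipe model names; anything else is passed through unchanged.
resolve_model <- function(model = "ftb") {
  shorthand <- c(ftb = "finnish-ftb", tdt = "finnish-tdt")
  if (model %in% names(shorthand)) shorthand[[model]] else model
}

# Hypothetical helper: turn add_cols into a clean character vector,
# whether it was given as c("a", "b") or as the single string "a, b".
parse_add_cols <- function(add_cols = NULL) {
  if (is.null(add_cols)) return(NULL)
  trimws(unlist(strsplit(add_cols, ",", fixed = TRUE)))
}

print(resolve_model("tdt"))
print(resolve_model("swedish-talbanken"))
print(parse_add_cols("gender, major_region"))
print(parse_add_cols(c("gender", "major_region")))
```

Passing unrecognised model names through unchanged keeps the full udpipe model list usable without the helper needing to know about every language.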

Examples

# \donttest{
i <- "fsd_id"
fst_format(data = child, question = "q7", id = i)
fst_format(data = child, question = "q7", id = i, model = "tdt")
fst_format(data = child, question = "q7", id = i, weights = "paino")
cols <- c("gender", "major_region", "daycare_before_school")
fst_format(child, question = "q7", id = i, add_cols = cols)
fst_format(child, question = "q7", id = i, add_cols = "gender, major_region")
fst_format(child, question = 'q7', id = i, model = 'swedish-talbanken')
unlink("finnish-ftb-ud-2.5-191206.udpipe")
unlink("finnish-tdt-ud-2.5-191206.udpipe")
unlink("swedish-talbanken-ud-2.5-191206.udpipe")
# }
