Add item response data in long or wide format
add_booklet(db, x, booklet_id, auto_add_unknown_rules = FALSE)
add_response_data(db, data, auto_add_unknown_rules = FALSE, missing_value = "NA")
db: A handle to the database, i.e. the output of start_new_project or open_project
x: A data frame containing the responses and, optionally, person_properties. The data.frame should have one row per respondent and the column names should correspond to the item_id's in the rules or the names of the person_properties. See details.
booklet_id: A (short) string identifying the test form (booklet)
auto_add_unknown_rules: If FALSE (the default), an error will be generated if one or more responses do not appear in the scoring rules. If TRUE, unknown responses will be assumed to have a score of 0.
data: Response data in normalized (long) format. Must contain columns person_id, booklet_id, item_id and response, and optionally item_position (useful if your data contains new booklets, see details)
missing_value: Value to use for responses in missing rows in your data, see details
A list with information about the recent import.
It is common practice to keep response data in tables where each row contains the responses from a single person. add_booklet is provided to input data in that form, one booklet at a time.
If the data frame x contains a variable named person_id, this variable will be used to identify unique persons. It is assumed that a person takes a given booklet only once; otherwise an error will be generated. If person_id is not supplied, dexter will generate a unique person_id for each row of data.
Any column whose name exactly matches an item_id in the scoring rules supplied to start_new_project will be treated as an item; any column whose name exactly matches one of the person_properties will be treated as a person property. If a name matches both a person property and an item, the item takes precedence. Columns other than items, person properties and person_id will be ignored.
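The column-matching rule above can be sketched in base R. This is only an illustration of the precedence described in the text, not dexter's actual implementation; the item names, property names and column names are hypothetical.

```r
# Hypothetical item_ids from the scoring rules and hypothetical person properties.
# "age" appears in both on purpose, to show that the item takes precedence.
rule_items   = c("i1", "i2", "age")
person_props = c("gender", "age")

# Hypothetical column names of the wide data frame x
x_names = c("person_id", "gender", "age", "i1", "i2", "comment")

# Columns with an exact match in the rules are treated as items
items = intersect(x_names, rule_items)

# Remaining exact matches with person_properties are person properties
props = setdiff(intersect(x_names, person_props), items)

# Everything else, apart from person_id, is ignored
ignored = setdiff(x_names, c("person_id", items, props))

print(list(items = items, person_properties = props, ignored = ignored))
```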
add_response_data can be used to add data that is already 'normalized'. This function takes a data.frame in long format with columns person_id, booklet_id, item_id and response, such as is typically found in databases.
The first time a new booklet is encountered, the design (i.e. which items are contained in each booklet at each position) is derived from data. In this case it is useful to specify an extra column named item_position; otherwise dexter will generate the item positions automatically, in a way that may not reflect your actual design (if the item positions in your tests are randomized, this is not a problem).
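As an illustration of the long format, the following base R sketch stacks a small wide table into one row per person-item combination, including an explicit item_position column. The data, the booklet name "bk1" and the assumption that column order reflects item position are all hypothetical.

```r
# Hypothetical wide data: one row per respondent, one column per item
wide = data.frame(person_id = c("p1", "p2"),
                  i1 = c("a", "b"),
                  i2 = c("c", NA),
                  stringsAsFactors = FALSE)
items = c("i1", "i2")

# Stack the item columns; unlist() goes column by column, so item_id and
# item_position are repeated once per respondent for each item in turn
long = data.frame(
  person_id     = rep(wide$person_id, times = length(items)),
  booklet_id    = "bk1",
  item_id       = rep(items, each = nrow(wide)),
  item_position = rep(seq_along(items), each = nrow(wide)),
  response      = unlist(wide[items], use.names = FALSE),
  stringsAsFactors = FALSE
)
print(long)

# With an open project db whose rules cover these items and responses,
# the table could then be passed on as:
# add_response_data(db, data = long)
```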
If there are missing rows (e.g. there are only 9 rows for a person-booklet combination where the booklet should contain 10 items), missing_value will be used for the omitted responses. This can lead to an error if missing_value is not defined in your rules and auto_add_unknown_rules is set to FALSE (the default). Please also note that the booklet design for any specific booklet is derived from the distinct combinations of booklet_id and item_id in data the first time that booklet is encountered. If subsequent calls to add_response_data contain data with more or different items for the same booklet, this will cause an error.
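The handling of missing rows can be pictured with a base R sketch: the design is taken from the distinct booklet_id and item_id combinations, and any person-item row that is absent is filled with missing_value. This mimics the behaviour described above under hypothetical data; it is not dexter's internal code.

```r
missing_value = "NA"

# Hypothetical long data: person p2 has no row for item i2
resp = data.frame(person_id  = c("p1", "p1", "p2"),
                  booklet_id = "bk1",
                  item_id    = c("i1", "i2", "i1"),
                  response   = c("1", "0", "1"),
                  stringsAsFactors = FALSE)

# Design: distinct booklet-item combinations seen in the data
design = unique(resp[, c("booklet_id", "item_id")])

# Every person who took a booklet is expected to have a row for each of its items
persons  = unique(resp[, c("person_id", "booklet_id")])
expected = merge(persons, design, by = "booklet_id")

# Left join onto the observed responses; absent rows get missing_value
completed = merge(expected, resp,
                  by = c("person_id", "booklet_id", "item_id"),
                  all.x = TRUE)
completed$response[is.na(completed$response)] = missing_value
print(completed)
```

In this sketch the filled-in response for p2 on i2 is the string "NA", so the scoring rules would need an entry for that response unless auto_add_unknown_rules is TRUE.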
Note that responses are always treated as strings (in both functions), and NA values are transformed to the string "NA".
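A minimal base R sketch of that conversion, using a hypothetical numeric response vector, shows what the stored values would look like. It only mirrors the description above and is not taken from dexter's source.

```r
# Hypothetical numeric responses with one missing value
raw = c(1, NA, 0)

# Responses become strings; a true NA becomes the literal string "NA"
as_stored = as.character(raw)
as_stored[is.na(as_stored)] = "NA"
print(as_stored)
```

A consequence is that a rule for the response "NA" has to exist for each item that can have missing responses, unless auto_add_unknown_rules is TRUE.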
# NOT RUN {
db = start_new_project(verbAggrRules, ":memory:",
                       person_properties = list(gender = "unknown"))
head(verbAggrData)
add_booklet(db, verbAggrData, "agg")
close_project(db)
# }