This approach flags a contradiction when an impossible combination of data values is observed in one participant. For example, if the age of a participant is recorded repeatedly, the value of age cannot (unfortunately) decline. Most contradiction checks rest on the comparison of two variables.
Importantly, each value used in a comparison may be possible on its own; only the combination of the two values is considered impossible. The approach does not consider implausible or inadmissible values.
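To make the idea concrete, here is a minimal base R sketch that does not use dataquieR. The data frame and its column names (id, age_baseline, age_followup) are hypothetical. Each age value is possible on its own; only the combination, a follow-up age below the baseline age, is flagged.

```r
# Hypothetical example data: age recorded at two visits per participant.
visits <- data.frame(
  id           = 1:4,
  age_baseline = c(34, 51, 47, 62),
  age_followup = c(36, 49, 47, 65)
)

# A contradiction is an impossible combination: age cannot decline over time.
visits$age_declined <- visits$age_followup < visits$age_baseline

# Participants with a contradictory pair of values.
visits[visits$age_declined, ]
```

Neither 51 nor 49 would be rejected as an age by itself, which is why this kind of check is separate from limit or admissibility checks.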
Indicator
con_contradictions_redcap(
  study_data,
  meta_data,
  label_col,
  threshold_value,
  meta_data_cross_item = "cross-item_level",
  use_value_labels,
  summarize_categories = FALSE
)
If summarize_categories is FALSE:
A list with:
FlaggedStudyData: The first output of the contradiction function is a data frame of similar dimension regarding the number of observations in the study data. In addition, for each applied check on the variables an additional column is added which flags observations with a contradiction given the applied check.
SummaryData: The second output summarizes this information into one data frame. This output can be used to provide an executive overview on the amount of contradictions.
VariableGroupTable: A subset of SummaryData used within the pipeline.
SummaryPlot: The third output visualizes summarized information of SummaryData.
If summarize_categories is TRUE, other objects are returned:
One per category named by that category (e.g. "Empirical") containing a result for contradiction checks within that category only. Additionally, in the slot all_checks, a result as it would have been returned with summarize_categories set to FALSE. Finally, a slot SummaryData is returned containing sums per Category and an according ggplot in SummaryPlot.
study_data: data.frame the data frame that contains the measurements
meta_data: data.frame the data frame that contains metadata attributes of study data
label_col: variable attribute the name of the column in the metadata with labels of variables
threshold_value: numeric from=0 to=100. a numerical value ranging from 0-100
meta_data_cross_item: data.frame contradiction rules table. Table defining contradictions. See details for its required structure.
use_value_labels: logical Deprecated in favor of DATA_PREPARATION. If set to TRUE, labels can be used in the REDCap syntax to specify contradiction checks for categorical variables. If set to FALSE, contradictions have to be specified using the coded values. If this argument is not set in the function call, it will be set to TRUE if the metadata contains a column VALUE_LABELS which is not empty.
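The difference between rules on coded values and rules on value labels can be illustrated with a small base R sketch. It does not call dataquieR or use REDCap syntax; the variables, the coding (1 = "male", 2 = "female") and the rule itself are hypothetical and only show that both ways of writing a check select the same observations.

```r
# Hypothetical coded variable: 1 = "male", 2 = "female".
sex_coded <- c(1, 2, 2, 1)
pregnant  <- c(1, 1, 0, 0)  # assumed coding: 1 = yes, 0 = no

# Rule written against the coded values.
flag_codes <- sex_coded == 1 & pregnant == 1

# Rule written against value labels, after assigning labels to the codes.
sex_labelled <- factor(sex_coded, levels = c(1, 2),
                       labels = c("male", "female"))
flag_labels <- sex_labelled == "male" & pregnant == 1

# Both formulations flag the same observations.
identical(flag_codes, flag_labels)
```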
summarize_categories: logical Needs a column 'CONTRADICTION_TYPE' in the meta_data_cross_item. If set, a summary output is generated for the defined categories plus one plot per category. TODO: Not yet controllable by metadata.
Remove missing codes from the study data (if defined in the metadata)
Remove measurements deviating from limits defined in the metadata
Assign label to levels of categorical variables (if applicable)
Apply contradiction checks (given as REDCap-like rules in a separate metadata table)
Identify measurements fulfilling contradiction rules. For this, two output data frames are generated:
on the level of observation to flag each contradictory value combination, and
a summary table for each contradiction check.
A summary plot illustrating the number of contradictions is generated.
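The steps above can be sketched in base R. This is a simplified illustration under stated assumptions, not the implementation used by dataquieR: the data, the missing code 99999 and the check names are hypothetical, the rules are written as plain R expressions rather than REDCap syntax, and the limit check, label assignment and summary plot are omitted.

```r
# Hypothetical study data; 99999 is assumed to be a missing code.
study <- data.frame(
  id       = 1:5,
  age_0    = c(40, 55, 99999, 61, 48),
  age_1    = c(42, 53, 60, 63, 47),
  pregnant = c(1, 0, 0, 0, 1),
  sex      = c("f", "m", "f", "m", "m")
)

# Step 1: replace missing codes by NA.
missing_codes <- c(99999)
study$age_0[study$age_0 %in% missing_codes] <- NA
study$age_1[study$age_1 %in% missing_codes] <- NA

# Rules kept in a separate table (here: R expressions, not REDCap syntax).
rules <- data.frame(
  check_id = c("age_declined", "pregnant_male"),
  rule     = c("age_1 < age_0", "pregnant == 1 & sex == 'm'"),
  stringsAsFactors = FALSE
)

# Apply each rule to the study data; one logical column per check.
flags <- sapply(rules$rule, function(r) eval(parse(text = r), envir = study))
colnames(flags) <- rules$check_id

# Observation-level output: study data plus one flag column per check.
flagged <- cbind(study, flags)

# Summary output: number of contradictions per check.
summary_tab <- data.frame(
  check_id         = rules$check_id,
  n_contradictions = colSums(flags, na.rm = TRUE)
)
summary_tab
```

Observations with a missing value in a compared variable get NA instead of a flag, so they are not counted as contradictions in the summary.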
List function.
if (FALSE) { # slow
  # Load the example metadata and study data shipped with dataquieR.
  load(system.file("extdata", "meta_data.RData", package = "dataquieR"))
  load(system.file("extdata", "study_data.RData", package = "dataquieR"))
  # Fetch the cross-item level metadata holding the contradiction rules.
  meta_data_cross_item <- prep_get_data_frame("meta_data_v2|cross-item_level")
  label_col <- "LABEL"
  threshold_value <- 1
  # Default output: one result over all contradiction checks.
  con_contradictions_redcap(
    study_data = study_data, meta_data = meta_data, label_col = label_col,
    threshold_value = threshold_value, meta_data_cross_item = meta_data_cross_item
  )
  # Output summarized per contradiction category.
  con_contradictions_redcap(
    study_data = study_data, meta_data = meta_data, label_col = label_col,
    threshold_value = threshold_value, meta_data_cross_item = meta_data_cross_item,
    summarize_categories = TRUE
  )
}