Adds LOCF records as new observations for each 'by group' when the dataset does not contain observations for missed visits/time points or when the analysis value is missing.
derive_locf_records(
dataset,
dataset_ref,
by_vars,
id_vars_ref = NULL,
analysis_var = AVAL,
imputation = "add",
order,
keep_vars = NULL
)
The input dataset with the new "LOCF" observations added for each
by_vars group, based on the value passed to the imputation argument.
dataset
Input dataset
The variables specified by the by_vars, analysis_var, order, and keep_vars arguments are expected to be in the dataset.
Default value: none

dataset_ref
Expected observations dataset
Data frame with all the combinations of PARAMCD, PARAM, AVISIT,
AVISITN, ... which are expected in the dataset.
Default value: none

by_vars
Grouping variables
For each group defined by by_vars, those observations from dataset_ref
are added to the output dataset which do not have a corresponding observation
in the input dataset or for which analysis_var is NA for the corresponding observation
in the input dataset.
Default value: none

id_vars_ref
Grouping variables in expected observations dataset
The variables to group by in dataset_ref when determining which observations should be
added to the input dataset.
Default value: NULL, in which case all the variables in dataset_ref are used

analysis_var
Analysis variable.
Permitted values: a variable
Default value: AVAL

imputation
Select the mode of imputation:
add: Keep all original records and add imputed records for missing
timepoints and missing analysis_var values from dataset_ref.
update: Update records with missing analysis_var and add imputed records
for missing timepoints from dataset_ref.
update_add: Keep all original records, add updated copies of the records with
missing analysis_var, and add imputed records for missing timepoints from dataset_ref.
Permitted values: one of "add", "update", "update_add"
Default value: "add"

order
Sort order
The dataset is sorted by order before carrying the last observation
forward (e.g. AVAL) within each by_vars group.
For handling of NAs in sorting variables see the "Sort Order" section in vignette("generic").
Default value: none

keep_vars
Variables that need carrying the last observation forward
Variables other than analysis_var that need the last observation carried forward
(e.g., PARAMN, VISITNUM). If NULL (the default), only the variables specified in
by_vars and analysis_var are populated in the newly created records.
Default value: NULL
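
To make the interplay of dataset_ref and by_vars more concrete, the following is a minimal sketch of how the expected observations that need imputing could be identified with plain dplyr/tidyr. It is not admiral's implementation; the toy data and the names `expected`, `expected_per_group`, and `to_impute` are assumptions made for illustration only.

```r
library(dplyr)
library(tibble)
library(tidyr)

# Toy input (hypothetical): PULSE has no week 2 record, DIABP has one with AVAL = NA
advs <- tribble(
  ~USUBJID, ~PARAMCD, ~AVISITN, ~AVAL,
  "01",     "PULSE",  0,        65,
  "01",     "DIABP",  0,        79,
  "01",     "DIABP",  2,        NA
)

# Expected visits; plays the role of dataset_ref in this sketch
expected <- tibble(AVISITN = c(0, 2))

# Every expected visit for every by group (USUBJID, PARAMCD)
expected_per_group <- advs %>%
  distinct(USUBJID, PARAMCD) %>%
  crossing(expected)

# Expected observations without a usable value in the input:
# either no record exists, or the record has a missing AVAL
to_impute <- expected_per_group %>%
  anti_join(
    filter(advs, !is.na(AVAL)),
    by = c("USUBJID", "PARAMCD", "AVISITN")
  )

to_impute
```

In this toy case both week 2 observations (DIABP and PULSE) are candidates for an imputed record.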
G Gayatri
Imputed records should be added for missing timepoints and for missing
analysis_var (from dataset_ref), while retaining all original records.
The reference dataset for the imputed records is specified by the dataset_ref
argument. It should contain all expected combinations of variables. In this case,
advs_expected_obsv is created by crossing() datasets paramcd and avisit, which
includes all combinations of PARAMCD, AVISITN, and AVISIT.
The groups for which new records are added are specified by the by_vars
argument. Here, records should be added for each subject and parameter.
Therefore, by_vars = exprs(STUDYID, USUBJID, PARAMCD) is specified.
The imputation method is specified using the imputation argument. In this case,
imputation = "add": for missing timepoints and for records with missing analysis
values, new records from dataset_ref are added after the data are sorted by the
variables in by_vars and by visit (AVISITN and AVISIT), as specified in the
order argument.
Variables other than analysis_var and by_vars that require LOCF
(Last-Observation-Carried-Forward) handling (in this case, PARAMN) are specified
in the keep_vars argument.
library(admiral) # provides derive_locf_records() and exprs()
library(dplyr)
library(tibble)
library(tidyr)

advs <- tribble(
~STUDYID, ~USUBJID, ~VSSEQ, ~PARAMCD, ~PARAMN, ~AVAL, ~AVISITN, ~AVISIT,
"CDISC01", "01-701-1015", 1, "PULSE", 1, 65, 0, "BASELINE",
"CDISC01", "01-701-1015", 2, "DIABP", 2, 79, 0, "BASELINE",
"CDISC01", "01-701-1015", 3, "DIABP", 2, 80, 2, "WEEK 2",
"CDISC01", "01-701-1015", 4, "DIABP", 2, NA, 4, "WEEK 4",
"CDISC01", "01-701-1015", 5, "DIABP", 2, NA, 6, "WEEK 6",
"CDISC01", "01-701-1015", 6, "SYSBP", 3, 130, 0, "BASELINE",
"CDISC01", "01-701-1015", 7, "SYSBP", 3, 132, 2, "WEEK 2"
)
paramcd <- tribble(
~PARAMCD,
"PULSE",
"DIABP",
"SYSBP"
)
avisit <- tribble(
~AVISITN, ~AVISIT,
0, "BASELINE",
2, "WEEK 2",
4, "WEEK 4",
6, "WEEK 6"
)
advs_expected_obsv <- paramcd %>%
crossing(avisit)
derive_locf_records(
dataset = advs,
dataset_ref = advs_expected_obsv,
by_vars = exprs(STUDYID, USUBJID, PARAMCD),
imputation = "add",
order = exprs(AVISITN, AVISIT),
keep_vars = exprs(PARAMN)
) |>
arrange(USUBJID, PARAMCD, AVISIT)
#> # A tibble: 14 × 9
#> STUDYID USUBJID VSSEQ PARAMCD PARAMN AVAL AVISITN AVISIT DTYPE
#> <chr> <chr> <dbl> <chr> <dbl> <dbl> <dbl> <chr> <chr>
#> 1 CDISC01 01-701-1015 2 DIABP 2 79 0 BASELINE <NA>
#> 2 CDISC01 01-701-1015 3 DIABP 2 80 2 WEEK 2 <NA>
#> 3 CDISC01 01-701-1015 NA DIABP 2 80 4 WEEK 4 LOCF
#> 4 CDISC01 01-701-1015 4 DIABP 2 NA 4 WEEK 4 <NA>
#> 5 CDISC01 01-701-1015 NA DIABP 2 80 6 WEEK 6 LOCF
#> 6 CDISC01 01-701-1015 5 DIABP 2 NA 6 WEEK 6 <NA>
#> 7 CDISC01 01-701-1015 1 PULSE 1 65 0 BASELINE <NA>
#> 8 CDISC01 01-701-1015 NA PULSE 1 65 2 WEEK 2 LOCF
#> 9 CDISC01 01-701-1015 NA PULSE 1 65 4 WEEK 4 LOCF
#> 10 CDISC01 01-701-1015 NA PULSE 1 65 6 WEEK 6 LOCF
#> 11 CDISC01 01-701-1015 6 SYSBP 3 130 0 BASELINE <NA>
#> 12 CDISC01 01-701-1015 7 SYSBP 3 132 2 WEEK 2 <NA>
#> 13 CDISC01 01-701-1015 NA SYSBP 3 132 4 WEEK 4 LOCF
#> 14 CDISC01 01-701-1015 NA SYSBP 3 132 6 WEEK 6 LOCF
When the imputation mode is set to update, missing analysis_var values
are updated using values from the last record after the dataset is sorted by
by_vars and order. Imputed records are added for missing timepoints (from
dataset_ref).
derive_locf_records(
dataset = advs,
dataset_ref = advs_expected_obsv,
by_vars = exprs(STUDYID, USUBJID, PARAMCD),
imputation = "update",
order = exprs(AVISITN, AVISIT),
) |>
arrange(USUBJID, PARAMCD, AVISIT)
#> # A tibble: 12 × 9
#> STUDYID USUBJID VSSEQ PARAMCD PARAMN AVAL AVISITN AVISIT DTYPE
#> <chr> <chr> <dbl> <chr> <dbl> <dbl> <dbl> <chr> <chr>
#> 1 CDISC01 01-701-1015 2 DIABP 2 79 0 BASELINE <NA>
#> 2 CDISC01 01-701-1015 3 DIABP 2 80 2 WEEK 2 <NA>
#> 3 CDISC01 01-701-1015 4 DIABP 2 80 4 WEEK 4 LOCF
#> 4 CDISC01 01-701-1015 5 DIABP 2 80 6 WEEK 6 LOCF
#> 5 CDISC01 01-701-1015 1 PULSE 1 65 0 BASELINE <NA>
#> 6 CDISC01 01-701-1015 NA PULSE NA 65 2 WEEK 2 LOCF
#> 7 CDISC01 01-701-1015 NA PULSE NA 65 4 WEEK 4 LOCF
#> 8 CDISC01 01-701-1015 NA PULSE NA 65 6 WEEK 6 LOCF
#> 9 CDISC01 01-701-1015 6 SYSBP 3 130 0 BASELINE <NA>
#> 10 CDISC01 01-701-1015 7 SYSBP 3 132 2 WEEK 2 <NA>
#> 11 CDISC01 01-701-1015 NA SYSBP NA 132 4 WEEK 4 LOCF
#> 12 CDISC01 01-701-1015 NA SYSBP NA 132 6 WEEK 6 LOCF
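
The in-place behaviour of the "update" mode for existing records can be approximated with dplyr and tidyr alone. This is only a sketch under simplifying assumptions, not admiral's implementation: it handles a single parameter, ignores records that are absent from the input, and the name `updated` is hypothetical.

```r
library(dplyr)
library(tibble)
library(tidyr)

# DIABP records taken from the example above (reduced set of columns)
advs_diabp <- tribble(
  ~USUBJID,      ~PARAMCD, ~VSSEQ, ~AVISITN, ~AVAL,
  "01-701-1015", "DIABP",  2,      0,        79,
  "01-701-1015", "DIABP",  3,      2,        80,
  "01-701-1015", "DIABP",  4,      4,        NA,
  "01-701-1015", "DIABP",  5,      6,        NA
)

updated <- advs_diabp %>%
  group_by(USUBJID, PARAMCD) %>%
  arrange(AVISITN, .by_group = TRUE) %>%
  # Flag the records whose value is about to be filled.
  # Simplification: a leading NA with nothing to carry forward would
  # also be flagged here although its AVAL would stay NA.
  mutate(DTYPE = if_else(is.na(AVAL), "LOCF", NA_character_)) %>%
  # Carry the last non-missing AVAL forward within each group
  fill(AVAL, .direction = "down") %>%
  ungroup()

updated
```

The existing records keep their other variables (here VSSEQ 4 and 5); only AVAL and DTYPE change, which mirrors rows 3 and 4 of the output above.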
When the imputation mode is set to update_add, the missing analysis_var
values are updated using values from the last record after the dataset is sorted
by by_vars and order. The updated values are added as new records, while the
original records with missing analysis_var are retained. Imputed records are added
for missing timepoints (from dataset_ref).
derive_locf_records(
dataset = advs,
dataset_ref = advs_expected_obsv,
by_vars = exprs(STUDYID, USUBJID, PARAMCD),
imputation = "update_add",
order = exprs(AVISITN, AVISIT),
) |>
arrange(USUBJID, PARAMCD, AVISIT)
#> # A tibble: 14 × 9
#> STUDYID USUBJID VSSEQ PARAMCD PARAMN AVAL AVISITN AVISIT DTYPE
#> <chr> <chr> <dbl> <chr> <dbl> <dbl> <dbl> <chr> <chr>
#> 1 CDISC01 01-701-1015 2 DIABP 2 79 0 BASELINE <NA>
#> 2 CDISC01 01-701-1015 3 DIABP 2 80 2 WEEK 2 <NA>
#> 3 CDISC01 01-701-1015 4 DIABP 2 80 4 WEEK 4 LOCF
#> 4 CDISC01 01-701-1015 4 DIABP 2 NA 4 WEEK 4 <NA>
#> 5 CDISC01 01-701-1015 5 DIABP 2 80 6 WEEK 6 LOCF
#> 6 CDISC01 01-701-1015 5 DIABP 2 NA 6 WEEK 6 <NA>
#> 7 CDISC01 01-701-1015 1 PULSE 1 65 0 BASELINE <NA>
#> 8 CDISC01 01-701-1015 NA PULSE NA 65 2 WEEK 2 LOCF
#> 9 CDISC01 01-701-1015 NA PULSE NA 65 4 WEEK 4 LOCF
#> 10 CDISC01 01-701-1015 NA PULSE NA 65 6 WEEK 6 LOCF
#> 11 CDISC01 01-701-1015 6 SYSBP 3 130 0 BASELINE <NA>
#> 12 CDISC01 01-701-1015 7 SYSBP 3 132 2 WEEK 2 <NA>
#> 13 CDISC01 01-701-1015 NA SYSBP NA 132 4 WEEK 4 LOCF
#> 14 CDISC01 01-701-1015 NA SYSBP NA 132 6 WEEK 6 LOCF
For each group (with respect to the variables specified for the
by_vars argument), those observations from dataset_ref are added to
the output dataset which do not have a corresponding observation in the input
dataset or for which analysis_var is NA for the corresponding observation in
the input dataset.
For the new observations, analysis_var is set to the non-missing analysis_var of the
previous observation in the input dataset (when sorted by order) and
DTYPE is set to "LOCF".
The imputation argument decides whether to update the existing observation when
analysis_var is NA ("update" and "update_add"), or to add a new observation from
dataset_ref instead ("add").
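
The logic described above for the "add" mode can be sketched with dplyr and tidyr. This is an illustrative approximation under stated assumptions, not the code admiral actually runs: the toy data, the reduced set of columns, and the names `expected`, `candidates`, `locf`, and `result` are all hypothetical.

```r
library(dplyr)
library(tibble)
library(tidyr)

# Toy input (hypothetical): DIABP is NA at visit 4, PULSE is observed only at visit 0
advs <- tribble(
  ~USUBJID, ~PARAMCD, ~AVISITN, ~AVAL,
  "01",     "DIABP",  0,        79,
  "01",     "DIABP",  2,        80,
  "01",     "DIABP",  4,        NA,
  "01",     "PULSE",  0,        65
)
expected <- tibble(AVISITN = c(0, 2, 4)) # plays the role of dataset_ref

# Step 1: expected observations that are absent or have a missing AVAL
candidates <- advs %>%
  distinct(USUBJID, PARAMCD) %>%
  crossing(expected) %>%
  anti_join(
    filter(advs, !is.na(AVAL)),
    by = c("USUBJID", "PARAMCD", "AVISITN")
  ) %>%
  mutate(AVAL = NA_real_, DTYPE = "LOCF")

# Step 2: sort within each by group and carry the last non-missing AVAL forward
locf <- advs %>%
  filter(!is.na(AVAL)) %>%
  mutate(DTYPE = NA_character_) %>%
  bind_rows(candidates) %>%
  group_by(USUBJID, PARAMCD) %>%
  arrange(AVISITN, .by_group = TRUE) %>%
  fill(AVAL, .direction = "down") %>%
  ungroup() %>%
  filter(!is.na(DTYPE)) # keep only the newly created records

# Step 3: "add" mode keeps every original record and appends the new ones
result <- advs %>%
  mutate(DTYPE = NA_character_) %>%
  bind_rows(locf) %>%
  arrange(USUBJID, PARAMCD, AVISITN)

result
```

Here three records are created (DIABP at visit 4, PULSE at visits 2 and 4), and the original DIABP record with a missing AVAL is retained alongside its imputed counterpart.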
BDS-Findings Functions for adding Parameters/Records:
default_qtc_paramcd(),
derive_expected_records(),
derive_extreme_event(),
derive_extreme_records(),
derive_param_bmi(),
derive_param_bsa(),
derive_param_computed(),
derive_param_doseint(),
derive_param_exist_flag(),
derive_param_exposure(),
derive_param_framingham(),
derive_param_map(),
derive_param_qtc(),
derive_param_rr(),
derive_param_wbc_abs(),
derive_summary_records()