MRPWorkflow-method-preprocess: Preprocess sample data

Description

The $preprocess() method runs the preprocessing pipeline, which standardizes, filters, imputes, and aggregates the sample data. See the "More on data preparation" vignette for details on data processing. For usage examples, refer to the "More examples of R6 classes" vignette.

Usage

preprocess(
  data,
  is_timevar = FALSE,
  is_aggregated = FALSE,
  special_case = NULL,
  family = NULL,
  time_freq = NULL,
  freq_threshold = 0
)

Value

No return value, called for side effects.

Arguments

data: An object of class data.frame (or one that can be coerced to that class) that satisfies the requirements specified in the More on data preparation vignette.
is_timevar: Logical indicating whether the data contains time-varying components.
is_aggregated: Logical indicating whether the data is already aggregated.
special_case: Character string specifying special case handling. Options are NULL (the default), "covid", and "poll".
family: Character string specifying the distribution family for the outcome variable. Options are "binomial" for binary outcome measures and "normal" for continuous outcome measures.
time_freq: Character string specifying the time interval used to group dates (YYYY-MM-DD) in the data into time indices. Options are NULL (the default), "week", "month", and "year". This parameter must be NULL for cross-sectional data and for time-varying data that already has time indices.
freq_threshold: Numeric value specifying the minimum frequency a value must have for its observations to be included. A row is removed entirely if it contains a value whose frequency is below this threshold. The default value is 0 (no filtering).
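
Examples

The arguments above can be combined as in the following minimal sketch. It assumes that the shinymrp package is installed and that it exports mrp_workflow() (to create the workflow object) and example_sample_data() (to load bundled example data); neither helper is documented on this page, so treat both names and their arguments as assumptions and check the package vignettes before relying on them. The call is guarded so the snippet does nothing if the package is unavailable.

```r
# Sketch only: mrp_workflow() and example_sample_data() are assumed helpers
# from shinymrp; they are not described on this help page.
if (requireNamespace("shinymrp", quietly = TRUE)) {
  # Create the workflow object whose $preprocess() method is documented here.
  workflow <- shinymrp::mrp_workflow()

  # Assumed example data: time-varying, already aggregated, binary outcome.
  sample_data <- shinymrp::example_sample_data(
    is_timevar = TRUE,
    is_aggregated = TRUE,
    special_case = NULL,
    family = "binomial"
  )

  # Called for its side effects; no return value is used.
  # time_freq is left at its default (NULL) on the assumption that the
  # example data already carries time indices.
  workflow$preprocess(
    sample_data,
    is_timevar = TRUE,
    is_aggregated = TRUE,
    special_case = NULL,
    family = "binomial",
    freq_threshold = 0
  )
}
```

For individual-level records that contain raw dates rather than time indices, one would presumably set is_aggregated = FALSE and pass time_freq = "week", "month", or "year" instead.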