wsMed (version 1.0.2)

PrepareData: Prepare Data for Two-Condition Within-Subject Mediation (WsMed)

Description

PrepareData() transforms raw pre/post data into the set of variables required by the WsMed workflow. It handles mediators, outcome, within-subject controls, between-subject controls, moderators, and all necessary interaction terms, while automatically centering / dummy-coding variables as needed.

Usage

PrepareData(
  data,
  M_C1,
  M_C2,
  Y_C1,
  Y_C2,
  C_C1 = NULL,
  C_C2 = NULL,
  C = NULL,
  C_type = NULL,
  W = NULL,
  W_type = NULL,
  center_W = TRUE,
  keep_W_raw = TRUE,
  keep_C_raw = TRUE
)

Value

A data frame containing at minimum:

  • Ydiff

  • Mi_diff, Mi_avg for each mediator

  • centered or dummy-coded Cb*, Cw*diff, Cw*avg

  • centered or dummy-coded W* and all int_* interaction terms

plus the attributes "W_info" and "C_info" described in Details.

Arguments

data

A data frame with the raw pre/post measures.

M_C1, M_C2

Character vectors: mediator names at occasion 1 and 2 (equal length).

Y_C1, Y_C2

Character scalars: outcome names at occasion 1 and 2.

C_C1, C_C2

Optional character vectors: within-subject control names.

C

Optional character vector: between-subject control names.

C_type

Optional vector of the same length as C. Each element is one of "continuous", "categorical", or "auto" (default). Ignored when C = NULL.

W

Optional character vector: moderator names (one or more).

W_type

Optional vector of the same length as W. Same coding as C_type. Ignored when W = NULL.

center_W

Logical (default TRUE). Whether to grand-mean center continuous moderators in W.

keep_W_raw, keep_C_raw

Logical. If TRUE, keep the original W / C columns in the returned data.

Details

The function performs the following steps:

  1. Outcome difference: Ydiff = Y_C2 - Y_C1.

  2. Mediator variables for each pair (M_C1[i], M_C2[i]):

    • Mi_diff = M_C2[i] - M_C1[i]

    • Mi_avg is the mean-centered average of the two occasions.

  3. Between-subject controls C:

    • Continuous variables are grand-mean centered (Cb1, Cb2, ...).

    • Categorical variables (binary or multi-level) are expanded into k - 1 dummy variables (Cb1_1, Cb2_1, Cb2_2, ...), using the first level as the reference.

  4. Within-subject controls (C_C1, C_C2): difference and centered-average versions of each pair (Cw1diff, Cw1avg, ...).

  5. Moderators W (one or more):

    • Continuous variables are grand-mean centered (W1, W2, ...).

    • Categorical variables are dummy-coded in the same way as C.

  6. Interaction terms between each moderator column and each mediator column:

    • int_<Mi_diff>_<Wj>, int_<Mi_avg>_<Wj>.

  7. Two attributes are added to the returned data:

    • "W_info": raw names, dummy names, level mapping

    • "C_info": same structure for between-subject controls.
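The transformations in steps 1 to 6 can be sketched in base R as follows. This is a minimal illustration of the arithmetic under stated assumptions, not the package's actual implementation: the toy data frame and its column names (m_pre, m_post, y_pre, y_post, w_num, grp) are hypothetical, and the output names only mimic the naming pattern described above.

```r
# Hypothetical toy data; column names are illustrative only.
set.seed(42)
toy <- data.frame(
  m_pre = rnorm(6), m_post = rnorm(6),             # one mediator, two occasions
  y_pre = rnorm(6), y_post = rnorm(6),             # outcome, two occasions
  w_num = rnorm(6),                                # continuous moderator
  grp   = factor(c("a", "b", "c", "a", "b", "c"))  # 3-level between-subject control
)

# Step 1: outcome difference (occasion 2 minus occasion 1)
toy$Ydiff <- toy$y_post - toy$y_pre

# Step 2: mediator difference and mean-centered average of the two occasions
toy$M1_diff <- toy$m_post - toy$m_pre
m_avg       <- (toy$m_pre + toy$m_post) / 2
toy$M1_avg  <- m_avg - mean(m_avg)

# Step 3: k - 1 dummy variables, first level ("a") as the reference
dummies <- model.matrix(~ grp, data = toy)[, -1, drop = FALSE]
colnames(dummies) <- paste0("Cb1_", seq_len(ncol(dummies)))
toy <- cbind(toy, dummies)

# Step 5: grand-mean centered continuous moderator
toy$W1 <- toy$w_num - mean(toy$w_num)

# Step 6: interaction of each mediator column with the moderator column
toy$int_M1_diff_W1 <- toy$M1_diff * toy$W1
toy$int_M1_avg_W1  <- toy$M1_avg  * toy$W1

names(toy)
```

Centering the average and the moderator before forming products keeps the lower-order coefficients interpretable at the sample mean, which is the usual motivation for this kind of preprocessing.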

Row counts are preserved even if input factors contain NA values (model.matrix is called with na.action = na.pass).
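To see why the NA handling matters, the sketch below compares dummy coding with na.pass and na.omit in base R. It is an assumption about how such row preservation can be achieved (routing na.action through model.frame before calling model.matrix), not a copy of the package's internals; the data frame df and the column grp are hypothetical.

```r
# Hypothetical factor with one missing value
df <- data.frame(grp = factor(c("Low", "Med", NA, "High", "Low")))

# na.pass keeps every row; the row with the missing level stays in the matrix
mf_pass <- model.frame(~ grp, data = df, na.action = na.pass)
mm_pass <- model.matrix(~ grp, data = mf_pass)

# na.omit drops the incomplete row, so the matrix no longer aligns with df
mf_omit <- model.frame(~ grp, data = df, na.action = na.omit)
mm_omit <- model.matrix(~ grp, data = mf_omit)

nrow(df)       # rows in the input
nrow(mm_pass)  # same as the input
nrow(mm_omit)  # one row fewer
```

Keeping the row count equal to that of the input is what allows the dummy columns to be bound back onto the original data without misaligning subjects.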

See Also

PrepareMissingData, GenerateModelP, wsMed

Examples

set.seed(1)
raw <- data.frame(
  A1 = rnorm(50), A2 = rnorm(50),   # mediator 1
  B1 = rnorm(50), B2 = rnorm(50),   # mediator 2
  C1 = rnorm(50), C2 = rnorm(50),   # outcome
  D1 = rnorm(50), D2 = rnorm(50),   # within-subject control
  W_bin  = sample(0:1, 50, TRUE),   # between-subject binary C
  W_fac3 = factor(sample(c("Low","Med","High"), 50, TRUE)) # moderator W
)

prep <- PrepareData(
  data  = raw,
  M_C1  = c("A1","B1"), M_C2 = c("A2","B2"),
  Y_C1  = "C1",         Y_C2 = "C2",
  C_C1  = "D1",         C_C2 = "D2",
  C     = "W_bin",      C_type = "categorical",
  W     = "W_fac3",     W_type = "categorical"
)
head(prep)