
MSCA (version 1.1.1)

make_state_matrices: Construct state matrices from longitudinal EHR data

Description

Builds a matrix with entries 0, 1, or NA that encodes whether each individual had each long-term condition (LTC) at each time point from 0 to l, based on the age of onset. The matrix includes rows for every LTC label, including the labels given as fail_code and cens_code. Entries are nevertheless set to NA after the onset of fail_code or cens_code.
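
The encoding rule for one individual and one LTC can be sketched in base R. This is only an illustration of the rule described above, not the package's implementation. The helper name encode_ltc is hypothetical, and the sketch assumes that the state is 1 from the onset time itself and NA strictly after the censoring or failure time; the package may treat these boundaries differently.

```r
# Hypothetical helper, not part of MSCA: encode one LTC for one individual.
# Assumptions: state is 1 at and after onset; NA strictly after the stop time.
encode_ltc <- function(onset, stop_time = NA, l = 10) {
  times <- 0:l
  # 0 before onset, 1 from onset onward; all 0 if the LTC never occurs
  state <- if (is.na(onset)) rep(0L, length(times)) else as.integer(times >= onset)
  # Mask time points after censoring or failure
  if (!is.na(stop_time)) state[times > stop_time] <- NA_integer_
  names(state) <- times
  state
}

# Onset at time 3, censoring or failure at time 6, time points 0 to 8
x <- encode_ltc(onset = 3, stop_time = 6, l = 8)
x
```

Under these assumptions, the vector is 0 for times 0 to 2, 1 for times 3 to 6, and NA for times 7 and 8.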

Usage

make_state_matrices(
  data,
  id = "link_id",
  ltc = "reg",
  aos = "aos",
  l = 111,
  fail_code = "death",
  cens_code = "cens"
)

Value

A matrix with (l + 1) * (number of LTCs) rows and one column per unique individual. Values are 0 before onset, 1 after onset, and NA after censoring or failure. Rows are named <ltc>_<time>, and columns are named by individual ID.
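
The row count and the <ltc>_<time> naming scheme can be illustrated in base R. The labels below are invented, and the row ordering (all time points of one LTC, then the next LTC) is an assumption made for this sketch; the ordering actually returned by the function is not stated here.

```r
ltcs <- c("asthma", "diabetes", "death", "cens")  # illustrative labels only
l <- 3

# Assumed ordering: every time point for one LTC, then the next LTC
row_names <- paste(rep(ltcs, each = l + 1),
                   rep(0:l, times = length(ltcs)),
                   sep = "_")

length(row_names)   # equals (l + 1) * length(ltcs)
head(row_names, 5)
```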

Arguments

data

A data frame containing one row per condition occurrence.

id

Name of the column identifying individuals.

ltc

Name of the column containing LTC labels.

aos

Name of the column giving age of onset (or time of onset) for each LTC.

l

The maximum time index (inclusive). The matrix has l + 1 time rows per LTC, one for each time point from 0 to l.

fail_code

Label in ltc indicating a failure event (e.g., death).

cens_code

Label in ltc indicating censoring.
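
A minimal sketch of the expected input, using the default column names from the Usage section. All values are invented for illustration, and the call is guarded so that it only runs when the MSCA package is installed. The arguments are written out in full to show how they map onto the columns; no output is shown because it depends on the installed package version.

```r
# Toy long-format input: one row per condition occurrence.
# Failure and censoring are recorded as rows with the labels "death" and "cens".
ehr <- data.frame(
  link_id = c("p1", "p1", "p1", "p2", "p2"),
  reg     = c("diabetes", "asthma", "death", "asthma", "cens"),
  aos     = c(40, 55, 70, 30, 62),
  stringsAsFactors = FALSE
)

# Call the function only if MSCA is available
if (requireNamespace("MSCA", quietly = TRUE)) {
  m <- MSCA::make_state_matrices(
    ehr,
    id = "link_id",
    ltc = "reg",
    aos = "aos",
    l = 111,
    fail_code = "death",
    cens_code = "cens"
  )
  dim(m)
}
```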

Author

Marc Delord

References

Delord M, Douiri A (2025) doi:10.1186/s12874-025-02476-7