dsBase (version 6.3.5)

mdPatternDS: Missing data pattern with disclosure control

Description

This server-side aggregate function computes the missing data pattern using mice::md.pattern and applies disclosure control so that small cell counts are not revealed.

Usage

mdPatternDS(x)

Value

A list containing:

pattern

The missing data pattern matrix with disclosure control applied

valid

Logical indicating if all patterns meet disclosure requirements

message

A message describing the validity status

Arguments

x

a character string specifying the name of a data frame or matrix containing the data to analyze for missing patterns.

Author

Xavier Escribà Montagut for DataSHIELD Development Team

Details

This function calls the mice::md.pattern function to generate a matrix showing the missing data patterns in the input data. To ensure disclosure control, any pattern counts that are below the threshold (nfilter.tab, default=3) are suppressed.

Suppression Method:

When a pattern count is below the threshold:

- Row name is changed to "suppressed(<N>)" where N is the threshold
- All pattern values in that row are set to NA
- Summary row is also set to NA (prevents back-calculation)
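The suppression steps above can be sketched in base R as follows. This is a minimal illustration, not the dsBase source: `suppress_patterns` is a hypothetical helper, the toy matrix is hand-built in the md.pattern layout, and the exact label text is an assumption obtained by substituting the threshold into the documented template.

```r
# Hypothetical helper (not part of dsBase): suppress rows of an
# md.pattern-style matrix whose count, stored in the row name, is below
# the threshold.
suppress_patterns <- function(pattern, threshold = 3) {
  n_rows <- nrow(pattern)
  # Every row except the last is a pattern; its row name holds the count.
  counts <- as.numeric(rownames(pattern)[-n_rows])
  too_small <- which(counts < threshold)
  if (length(too_small) > 0) {
    pattern[too_small, ] <- NA
    # Assumed label format: the threshold replaces N in "suppressed(<N>)".
    label <- sub("N", as.character(threshold), "suppressed(<N>)", fixed = TRUE)
    rownames(pattern)[too_small] <- label
    # Blank the summary row so suppressed counts cannot be back-calculated.
    pattern[n_rows, ] <- NA
  }
  list(pattern = pattern, valid = length(too_small) == 0)
}

# Toy matrix: 1 = observed, 0 = missing; last column and last row are totals.
toy <- matrix(
  c(1, 1, 0,
    1, 0, 1,
    0, 2, 2),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("10", "2", ""), c("age", "bmi", ""))
)
res <- suppress_patterns(toy, threshold = 3)
print(res$pattern)
```

Here the pattern with count 2 falls below the threshold of 3, so its row and the summary row become NA while the pattern with count 10 is left unchanged.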

Output Matrix Structure:

- Rows represent different missing data patterns (plus a summary row at the bottom)
- Row names contain pattern counts (or "suppressed(<N>)" for invalid patterns)
- Columns show 1 if variable is observed, 0 if missing
- Last column shows total number of missing values per pattern
- Last row shows total number of missing values per variable
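As an illustration of this layout, the sketch below reads the pieces back out of a small hand-built matrix. The matrix and the variable names `age` and `bmi` are assumptions made for the example; no rows are suppressed here, so every row name parses as a count.

```r
# Hand-built matrix in the layout described above (illustrative data only).
pattern <- matrix(
  c(1, 1, 0,
    1, 0, 1,
    0, 2, 2),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("10", "2", ""), c("age", "bmi", ""))
)
n_rows <- nrow(pattern)
n_cols <- ncol(pattern)

# Pattern counts live in the row names of all rows except the summary row.
pattern_counts <- as.numeric(rownames(pattern)[-n_rows])

# The 0/1 body: one row per pattern, one column per variable.
body <- pattern[-n_rows, -n_cols, drop = FALSE]

# Last column: number of missing values in each pattern.
missing_per_pattern <- pattern[-n_rows, n_cols]

# Last row: number of missing values for each variable.
missing_per_variable <- pattern[n_rows, -n_cols]

# Consistency check: weighting the zeros in the body by the pattern counts
# reproduces the per-variable totals in the summary row.
recomputed <- colSums((1 - body) * pattern_counts)
```

The final check also shows why the summary row has to be blanked when any pattern is suppressed: the per-variable totals, combined with the visible rows, would otherwise constrain the hidden count.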

Note for Pooling:

When this function is called from ds.mdPattern with type='combine', suppressed patterns are excluded from pooling to prevent disclosure through subtraction. This means pooled counts may underestimate the true total when patterns are suppressed in some studies.
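The effect on pooled counts can be sketched with two toy studies. This is not how ds.mdPattern is implemented; `pool_counts` is a hypothetical helper that only demonstrates the idea of skipping suppressed rows, and the label text in the second study is an assumed format.

```r
# Hypothetical helper (not the ds.mdPattern implementation): sum pattern
# counts across studies, skipping rows whose label marks them as suppressed.
pool_counts <- function(patterns) {
  pooled <- numeric(0)
  for (p in patterns) {
    # Drop the summary row and the per-pattern totals column.
    body <- p[-nrow(p), -ncol(p), drop = FALSE]
    labels <- rownames(body)
    for (i in seq_len(nrow(body))) {
      if (startsWith(labels[i], "suppressed")) next
      # Identify a pattern by its 0/1 signature, e.g. "10".
      key <- paste(body[i, ], collapse = "")
      previous <- if (key %in% names(pooled)) pooled[[key]] else 0
      pooled[key] <- previous + as.numeric(labels[i])
    }
  }
  pooled
}

vars <- c("age", "bmi", "")
study_a <- matrix(
  c(1, 1, 0,
    1, 0, 1,
    0, 5, 5),
  nrow = 3, byrow = TRUE, dimnames = list(c("10", "5", ""), vars)
)
# In study B the second pattern had fewer than 3 rows, so it arrives suppressed.
study_b <- matrix(
  c(1, 1, 0,
    NA, NA, NA,
    NA, NA, NA),
  nrow = 3, byrow = TRUE, dimnames = list(c("8", "suppressed(<3>)", ""), vars)
)

pooled <- pool_counts(list(study_a, study_b))
print(pooled)
```

The complete-case pattern pools to 18, but the pattern with `bmi` missing pools to 5 even though study B held one or two additional rows with that pattern, which is the underestimation described above.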