MonteCarloSEM (version 2.0.0)

MNAR.data: Introduces Missing Not at Random (MNAR) Values into Data Sets

Description

This function introduces missing values under the Missing Not at Random (MNAR) mechanism into previously generated data sets (e.g., those produced by sim.skewed() or sim.normal()). Under the MNAR mechanism, the probability of missingness depends on the values of the variable itself.

Specifically, the target variable is first sorted in decreasing order. Given the specified percentage of missingness, 90 percent of the missing values are assigned randomly among the highest values, while the remaining 10 percent are assigned randomly among the rest of the sample. For example, with a sample size of 300 and a target of 20 percent missingness (60 cases), missing values are introduced in 54 cases (90 percent of 60) from the top of the sorted distribution, while the remaining 6 cases (10 percent of 60) are randomly chosen from the lower 240 observations.

Missing values are represented by NA in the output files. New data sets containing missing values are saved as separate files, preserving the originals. Additionally, a file named "MNAR_List.dat" is created, which lists the names of all data sets with MNAR missingness.
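
The 90/10 split described above can be illustrated for a single variable with a minimal base R sketch. The helper mnar_sketch() and its top_share argument are hypothetical names used only for illustration; this is not the package's internal code. The sketch assumes that "the highest values" means the top cases equal in number to the total missing cases, which matches the 60/240 split in the example.

```r
# Hypothetical helper illustrating the MNAR scheme described above.
# It is NOT the package's internal implementation.
mnar_sketch <- function(x, perct = 10, top_share = 0.90) {
  n      <- length(x)
  n_miss <- round(n * perct / 100)     # total number of cases to set to NA
  n_top  <- round(n_miss * top_share)  # cases drawn from the highest values
  n_rest <- n_miss - n_top             # cases drawn from the rest of the sample

  ord  <- order(x, decreasing = TRUE)        # positions, highest value first
  top  <- ord[seq_len(n_miss)]               # assumption: top n_miss cases
  rest <- ord[seq_len(n - n_miss) + n_miss]  # the remaining lower cases

  # Sample positions within each group, without replacement
  drop_top  <- top[sample.int(length(top), n_top)]
  drop_rest <- rest[sample.int(length(rest), n_rest)]

  x[c(drop_top, drop_rest)] <- NA
  x
}

set.seed(123)
x      <- rnorm(300)
x_mnar <- mnar_sketch(x, perct = 20)

sum(is.na(x_mnar))                        # 60 missing cases in total
top60 <- order(x, decreasing = TRUE)[1:60]
sum(is.na(x_mnar[top60]))                 # 54 of them among the 60 highest values
```

Sampling positions rather than values keeps the two groups disjoint, so the total number of NA values always equals the requested count.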

Usage

MNAR.data(misg = NULL, perct = 10, dataList = "Data_List.dat", f.loc)

Arguments

misg

A numeric vector of 0s and 1s specifying which items will contain missing values. A value of 0 indicates the item will not include missingness, while 1 indicates missing values will be introduced. If omitted, all items are treated as eligible for missingness.

perct

The percentage of missingness to be applied (default = 10 percent).

dataList

The name of the file that lists the previously generated data sets (e.g., "Data_List.dat"), created either by this package or by external software.

f.loc

The directory path where both the original data sets and the "dataList" file are located.
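
To make the roles of misg and perct concrete, the following base R sketch works out which items would be targeted and how many cases per item would become missing. It assumes that perct applies separately to each targeted item and that counts are rounded to the nearest integer; the package may handle rounding differently. The variable names ss, target_items, n_miss, n_top and n_rest are illustrative only.

```r
misg  <- c(1, 1, 1, 0, 0, 0, 0, 0)  # 8 items; only the first three are targeted
perct <- 20                          # percentage of missingness
ss    <- 100                         # sample size of each data set (assumed)

target_items <- which(misg == 1)     # column positions that receive NA values
n_miss <- round(ss * perct / 100)    # missing cases per targeted item
n_top  <- round(0.90 * n_miss)       # cases taken from the highest values
n_rest <- n_miss - n_top             # cases taken from the rest of the sample

target_items                                   # 1 2 3
c(total = n_miss, top = n_top, rest = n_rest)  # 20 18 2
```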

Author

Fatih Orcan

Examples

# Step 1: Generate data sets

fc<-fcors.value(nf=3, cors=c(1,.5,.6,.5,1,.4,.6,.4,1))
fl<-loading.value(nf=3, fl.loads=c(.5,.5,.5,0,0,0,0,0,0,0,0,.6,.6,.6,0,0,0,0,0,0,0,0,.4,.4))
floc<-tempdir()
sim.normal(nd=10, ss=100, fcors=fc, loading=fl, f.loc=floc)

# Step 2: Introduce MNAR missing values

mis.items<-c(1,1,1,0,0,0,0,0)
dl<-"Data_List.dat"  # must be located in the directory given by f.loc
MNAR.data(misg = mis.items, perct = 20, dataList = dl, f.loc=floc)
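
After a run like the one above, it can be useful to confirm that each targeted item has roughly the requested rate of missingness. The sketch below shows one way to do this in base R on a toy data frame that stands in for a single output data set. The helper na_rate() is a hypothetical name, and reading the actual output files is left out because their format is not described on this page.

```r
# Hypothetical helper: percentage of NA values in each column
na_rate <- function(dat) round(100 * colMeans(is.na(dat)), 1)

set.seed(1)
toy <- as.data.frame(matrix(rnorm(100 * 8), nrow = 100))  # 100 cases, 8 items
toy[1:20, 1:3] <- NA  # mimic 20 percent missingness on the first three items

rates <- na_rate(toy)
rates  # 20 for V1 to V3, 0 for V4 to V8
```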