ImputationTests: Battery of tests for the imputed fuzzy values.

Description

`ImputationTests` calculates various measures and applies goodness-of-fit statistical tests to check the quality of the imputed fuzzy values.

Usage

ImputationTests(
  trueData,
  imputedData,
  imputedMask,
  trapezoidal = TRUE,
  cutsNumber = 100,
  K = 50,
  ...
)

Value

The output is an S3 object of the class impTest, given as a list with the following elements:

trueValues: the true input values (the same as trueData).
mask: the mask of the imputed (NA) cells (the same as imputedMask).
nonFNNumbers: the vector with the number of non-FN samples for each variable (together with the overall mean).
errorMatrix: the output from the function ErrorMatrix.
statisticalMeasures: the output from the function StatisticalMeasures.
statisticalTests: the output from the function ApplyStatisticalTests.
fuzzyMeasures: the output from the function CalculateFuzzyMeasures.

Arguments

trueData: Name of the input matrix (or data frame, or list) with the true values of the variables.
imputedData: Name of the input matrix (or data frame) with the imputed values.
imputedMask: Matrix (or data frame) with logical values where TRUE indicates the cells with the imputed values.
trapezoidal: Logical value indicating the type of fuzzy values in the dataset (TRUE for trapezoidal ones, FALSE for triangular ones).
cutsNumber: Number of cuts for the epistemic bootstrap tests.
K: Value of K for the res epistemic test.
...: Additional parameters passed to other functions.

Details

The procedure uses other functions from this package to check the quality of the imputed fuzzy values by comparing them with the original ones. It calculates the number of non-FNs for each variable, the error matrix (using ErrorMatrix), and various statistical measures (with StatisticalMeasures), applies epistemic goodness-of-fit tests (using ApplyStatisticalTests), and evaluates the fuzzy measures (with CalculateFuzzyMeasures). This function can therefore be applied directly as a one-click benchmark tool.
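As a rough illustration of what a non-FN count might look like, the base R sketch below assumes (this layout is not stated on this page) that each trapezoidal variable occupies four consecutive columns and that a row is a proper fuzzy number when those four values are non-decreasing. The helper countNonFN is hypothetical and is not a function of this package.

```r
# Hypothetical helper (not part of the package): counts rows that do not form
# a valid trapezoidal fuzzy number. Assumes 4 consecutive columns per variable,
# ordered as left support, left core, right core, right support.
countNonFN <- function(data, width = 4) {
  data <- as.matrix(data)
  stopifnot(ncol(data) %% width == 0)
  nVars <- ncol(data) / width
  counts <- vapply(seq_len(nVars), function(v) {
    block <- data[, ((v - 1) * width + 1):(v * width), drop = FALSE]
    # a row is valid when all successive differences are non-negative
    valid <- apply(block, 1, function(r) all(diff(r) >= 0))
    sum(!valid, na.rm = TRUE)
  }, integer(1))
  names(counts) <- paste0("V", seq_len(nVars))
  counts
}

# Toy data with two variables (8 columns) and three samples
toy <- rbind(
  c(0.0, 0.5, 1.0, 1.5,  2.0, 2.2, 2.4, 3.0),
  c(0.2, 0.1, 0.9, 1.1,  1.0, 1.5, 1.6, 1.7),  # first variable: 0.2 > 0.1
  c(0.3, 0.4, 0.4, 0.8,  2.0, 1.9, 2.5, 2.6)   # second variable: 2.0 > 1.9
)

nonFN <- countNonFN(toy)
print(nonFN)
print(mean(nonFN))  # overall mean across variables
```

The width argument is exposed so that the same sketch could be tried on triangular data (three columns per variable), again under the assumed column layout.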

To properly distinguish the real values from their imputed counterparts, the additional matrix imputedMask should be provided. In this matrix, the logical value TRUE marks the cells with imputed values; FALSE should be used for all other cells.
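A minimal base R sketch of how such a mask can be built and used is given here. The column-mean imputation and the mean absolute error are stand-ins chosen only for illustration; they are not the methods used by this package or the computation performed by ErrorMatrix.

```r
# Toy data: true values and a copy with two cells removed
trueToy <- matrix(c(1, 2, 3, 4,
                    2, 3, 4, 5,
                    0, 1, 2, 3), nrow = 3, byrow = TRUE)
withNA <- trueToy
withNA[1, 2] <- NA
withNA[3, 4] <- NA

# TRUE marks the cells that will be imputed, FALSE the observed ones
maskToy <- is.na(withNA)

# Stand-in imputation (column means); any imputation method could be used here
imputedToy <- withNA
for (j in seq_len(ncol(imputedToy))) {
  imputedToy[maskToy[, j], j] <- mean(withNA[, j], na.rm = TRUE)
}

# Mean absolute error restricted to the imputed cells only
maeImputed <- mean(abs(trueToy[maskToy] - imputedToy[maskToy]))

print(maskToy)
print(maeImputed)
```

Because the mask is computed from the dataset that still contains the NAs, it has to be created before the imputation step overwrites them.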

All of the input datasets can be given as matrices or data frames.

To get an overall comparison of the methods, summary(object, ...) can be used on the output object of this function. The diff values reported there are equal to the differences between the p-values of the respective tests for the true and imputed parts.

Examples

# seed the PRNG for reproducibility
set.seed(1234)

# load the necessary libraries
# (FuzzyImputationTest is assumed to be the package providing ImputationTests
# and the helper functions used below)
library(FuzzySimRes)
library(FuzzyImputationTest)

# generate a sample of trapezoidal fuzzy numbers with the FuzzySimRes library
list1 <- SimulateSample(20, originalPD = "rnorm", parOriginalPD = list(mean = 0, sd = 1),
  incrCorePD = "rexp", parIncrCorePD = list(rate = 2),
  suppLeftPD = "runif", parSuppLeftPD = list(min = 0, max = 0.6),
  suppRightPD = "runif", parSuppRightPD = list(min = 0, max = 0.6),
  type = "trapezoidal")

# convert the fuzzy data into a matrix
matrix1 <- FuzzyNumbersToMatrix(list1$value)

# check the starting values
head(matrix1)

# add some NAs to the matrix
matrix1NA <- IntroducingNA(matrix1, percentage = 0.1)
head(matrix1NA)

# impute the missing values
matrix1DImp <- ImputationDimp(matrix1NA)

# find the cells with NAs (TRUE marks a cell that was imputed)
matrix1Mask <- is.na(matrix1NA)

# check the quality of the imputed values
ImputationTests(matrix1, matrix1DImp, matrix1Mask, trapezoidal = TRUE)
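
The overall comparison mentioned in Details can be obtained by storing the result and passing it to summary. The sketch below repeats the steps of the example in compact form so that it runs on its own; it assumes that the package providing ImputationTests is FuzzyImputationTest, and it does nothing when the required packages are not installed.

```r
# Run only when both packages are available (the package name
# FuzzyImputationTest is an assumption, it is not stated on this page)
havePkgs <- requireNamespace("FuzzySimRes", quietly = TRUE) &&
  requireNamespace("FuzzyImputationTest", quietly = TRUE)

if (havePkgs) {
  library(FuzzySimRes)
  library(FuzzyImputationTest)

  set.seed(1234)

  # same simulation settings as in the example
  sample1 <- SimulateSample(20, originalPD = "rnorm",
    parOriginalPD = list(mean = 0, sd = 1),
    incrCorePD = "rexp", parIncrCorePD = list(rate = 2),
    suppLeftPD = "runif", parSuppLeftPD = list(min = 0, max = 0.6),
    suppRightPD = "runif", parSuppRightPD = list(min = 0, max = 0.6),
    type = "trapezoidal")

  trueMatrix <- FuzzyNumbersToMatrix(sample1$value)
  naMatrix <- IntroducingNA(trueMatrix, percentage = 0.1)
  impMatrix <- ImputationDimp(naMatrix)
  maskMatrix <- is.na(naMatrix)

  # keep the result instead of only printing it
  testResult <- ImputationTests(trueMatrix, impMatrix, maskMatrix, trapezoidal = TRUE)

  # class of the returned object, followed by the overall summary
  print(class(testResult))
  print(summary(testResult))
}
```

Keeping the result in a variable also allows the individual list elements described in Value, such as errorMatrix, to be inspected separately.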