ApplyStatisticalTests
ApplyStatisticalTests applies epistemic goodness-of-fit tests for fuzzy data to check the quality of the imputed values.
ApplyStatisticalTests(
  trueData,
  imputedData,
  imputedMask,
  trapezoidal = TRUE,
  cutsNumber = 100,
  K = 50,
  ...
)
The output is a matrix: the rows correspond to the combinations of test type and subsample, and the columns correspond to the variables plus the overall mean.
trueData - Name of the input matrix (or data frame) with the true values of the variables.
imputedData - Name of the input matrix (or data frame) with the imputed values.
imputedMask - Matrix (or data frame) with logical values where TRUE indicates the cells with the imputed values.
trapezoidal - Logical value indicating whether the fuzzy values in the dataset are trapezoidal (TRUE, the default) or triangular.
cutsNumber - Number of cuts for the epistemic bootstrap tests.
K - Value of K for the res epistemic test.
... - Additional parameters passed to other functions.
The procedure applies three types of epistemic goodness-of-fit Kolmogorov-Smirnov tests (avs - averaging statistic, ms - multi-statistic, res - resampling algorithm) from the FuzzySimRes package to check the quality of the imputed values.
To do this, three pairs of subsamples are compared:
true - the values of trueData in the cells that are not imputed vs the values of trueData in the cells that are then imputed,
imputed - the values of trueData in the cells that are not imputed vs only the imputed values from imputedData,
parts - the values of trueData in the cells that are then imputed vs their imputed counterparts from imputedData.
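As a rough illustration of how the three pairs of subsamples are formed, the sketch below splits a single column of crisp (non-fuzzy) numbers by a logical mask and compares each pair with the classical two-sample Kolmogorov-Smirnov test from base R. This is only a stand-in under stated assumptions: ApplyStatisticalTests itself works on fuzzy numbers and uses the epistemic tests from FuzzySimRes, and the names trueCol, imputedCol and maskCol are hypothetical.

```r
# Illustrative sketch only: crisp values and stats::ks.test stand in for
# fuzzy numbers and the epistemic tests used by ApplyStatisticalTests.
set.seed(1234)

# hypothetical single column of true values
trueCol <- rnorm(40)

# hypothetical mask: TRUE marks the cells that were missing and then imputed
maskCol <- rep(FALSE, 40)
maskCol[sample(40, 10)] <- TRUE

# hypothetical imputed column: identical outside the mask, noisy inside it
imputedCol <- trueCol
imputedCol[maskCol] <- trueCol[maskCol] + rnorm(10, sd = 0.1)

# the three pairs of subsamples described above
subsamples <- list(
  true    = list(x = trueCol[!maskCol], y = trueCol[maskCol]),
  imputed = list(x = trueCol[!maskCol], y = imputedCol[maskCol]),
  parts   = list(x = trueCol[maskCol],  y = imputedCol[maskCol])
)

# classical two-sample KS p-value for each pair
pValues <- sapply(subsamples, function(s) ks.test(s$x, s$y)$p.value)
pValues
```

The resulting vector has one p-value per subsample name, which mirrors (for one variable and one test type) the kind of comparison the function reports.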
To assess the respective imputation quality, p-values for true and imputed should be close to each other, and in the case of parts, they should exceed the selected significance level.
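The rule of thumb above could be written down as two logical checks, as in the following sketch. The helper checkImputationPValues is hypothetical and not part of any package, and both the significance level alpha and the tolerance used for "close to each other" are assumptions chosen purely for illustration; the documentation does not fix either value.

```r
# Hypothetical helper, not part of any package: turns the rule of thumb
# above into two logical checks for one variable and one test type.
checkImputationPValues <- function(pTrue, pImputed, pParts,
                                   alpha = 0.05, tolerance = 0.1) {
  c(
    # p-values for 'true' and 'imputed' should be close to each other
    trueVsImputedClose = abs(pTrue - pImputed) <= tolerance,
    # p-value for 'parts' should exceed the significance level
    partsAboveAlpha = pParts > alpha
  )
}

# made-up p-values, for illustration only
checkImputationPValues(pTrue = 0.62, pImputed = 0.58, pParts = 0.41)
checkImputationPValues(pTrue = 0.62, pImputed = 0.12, pParts = 0.03)
```

With these made-up inputs, both checks pass for the first call and both fail for the second.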
All of the input datasets can be given as matrices or data frames. The statistical tests are performed only for the input values that are proper fuzzy numbers (triangular or trapezoidal ones).
# seed PRNG
set.seed(1234)
# load the necessary libraries
library(FuzzySimRes)
library(FuzzyImputation)
# generate sample of trapezoidal fuzzy numbers with FuzzySimRes library
list1 <- SimulateSample(20,
                        originalPD = "rnorm", parOriginalPD = list(mean = 0, sd = 1),
                        incrCorePD = "rexp", parIncrCorePD = list(rate = 2),
                        suppLeftPD = "runif", parSuppLeftPD = list(min = 0, max = 0.6),
                        suppRightPD = "runif", parSuppRightPD = list(min = 0, max = 0.6),
                        type = "trapezoidal")
# convert fuzzy data into a matrix
matrix1 <- FuzzyNumbersToMatrix(list1$value)
# check starting values
head(matrix1)
# add some NAs to the matrix
matrix1NA <- IntroducingNA(matrix1, percentage = 0.1)
head(matrix1NA)
# impute missing values (with possible repetitions!)
matrix1DImp <- FuzzyImputation(matrix1NA, method = "dimp", checkFuzzy = TRUE)
# find cells with NAs
matrix1Mask <- is.na(matrix1NA)
# apply statistical epistemic bootstrap tests
ApplyStatisticalTests(matrix1, matrix1DImp, matrix1Mask, cutsNumber = 100, K = 10)