Performs cross-validation-based pattern testing for the Simplivariate Components in a SIMPLICA object. Each candidate pattern function is evaluated by cross-validation, and the best-performing pattern is selected for each component. Every pattern must have a fitter; there is no fallback.
simplicaCV(
  foundObject,
  df,
  patternFunctions = defaultPatternFunctions(),
  patternFitters = defaultPatternFitters(),
  preferenceOrder = names(patternFunctions),
  nRepeats = 40,
  testFraction = 0.2,
  minCellsForModels = 25,
  parsimonyMargin = 0.05,
  requireFitters = TRUE,
  updateObject = TRUE,
  verbose = FALSE,
  ignoreNaComponents = TRUE
)

If updateObject = TRUE, returns the input simplica object with two new fields:
componentPatternsUpdated: Character vector with the selected pattern per component after cross-validation. If a component is skipped or empty, the entry is NA.
componentAudit: Data frame containing detailed cross-validation results for each component, with the following columns:
  componentId: Numeric ID of the component.
  originalPattern: Pattern label originally assigned.
  selectedPattern: Pattern chosen after CV-based evaluation.
  reason: Explanation of why a pattern was selected or skipped.
  nRows, nCols, nCells: Dimensions of the component.
  nRepeats, testFraction, parsimonyMargin: CV settings used.
  cvMean_<pattern>: Mean RMSE over CV folds for each tested pattern.
  cvSd_<pattern>: Standard deviation of RMSE across CV folds.
  winFrac_<pattern>: Fraction of CV repeats where the pattern was the best performer.
If updateObject = FALSE, returns a list with the same two elements
(componentPatternsUpdated, componentAudit).
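To make the audit columns concrete, the sketch below derives cvMean_, cvSd_, and winFrac_ style summaries from a matrix of per-repeat RMSE values. The RMSE numbers and the pattern names constant and additive are invented for illustration, and the tie-breaking rule (ties go to the first pattern) is an assumption, not documented package behavior.

```r
# Hypothetical per-repeat RMSE values: one row per CV repeat, one column per pattern.
rmse <- cbind(
  constant = c(1.0, 1.2, 0.9, 1.1),
  additive = c(0.8, 1.3, 0.7, 0.9)
)

cvMean <- colMeans(rmse)       # mean RMSE per pattern
cvSd   <- apply(rmse, 2, sd)   # standard deviation of RMSE per pattern

# Index of the lowest-RMSE pattern in each repeat (assumed: ties go to the first column).
winner  <- apply(rmse, 1, which.min)
winFrac <- tabulate(winner, nbins = ncol(rmse)) / nrow(rmse)
names(winFrac) <- colnames(rmse)

# Assemble one audit-style row using the column naming described above.
audit <- data.frame(
  as.list(setNames(cvMean, paste0("cvMean_", names(cvMean)))),
  as.list(setNames(cvSd, paste0("cvSd_", names(cvSd)))),
  as.list(setNames(winFrac, paste0("winFrac_", names(winFrac))))
)
print(audit)
```

A pattern can have the lowest mean RMSE without winning most repeats, which is why the mean and the win fraction are reported side by side.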
foundObject: A simplica object containing Simplivariate Components
df: Data frame or matrix with the original data
patternFunctions: List of pattern functions to evaluate (default: defaultPatternFunctions())
patternFitters: List of pattern fitting functions (default: defaultPatternFitters())
preferenceOrder: Character vector specifying preference order for pattern selection (default: names(patternFunctions))
nRepeats: Integer, number of cross-validation repeats (default: 40)
testFraction: Numeric, fraction of data to use for testing (default: 0.2)
minCellsForModels: Integer, minimum number of cells required for model fitting (default: 25)
parsimonyMargin: Numeric, margin for parsimony-based model selection (default: 0.05)
requireFitters: Logical, whether fitters are required for all patterns (default: TRUE)
updateObject: Logical, whether to update and return the input object (default: TRUE)
verbose: Logical, whether to print progress messages (default: FALSE)
ignoreNaComponents: Logical, whether to skip components with NA patterns (default: TRUE)
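The relationship between patternFunctions, patternFitters, and requireFitters can be pictured as a name-matching check. The following is a minimal sketch, not the package's code: checkFitters is a hypothetical helper, the two lists are placeholder stand-ins for the real defaults, and the behavior with requireFitters = FALSE (returning the unmatched names silently) is an assumption.

```r
# Placeholder stand-ins; the real defaults come from defaultPatternFunctions()
# and defaultPatternFitters().
patternFunctions <- list(constant = function(x) x, additive = function(x) x)
patternFitters   <- list(constant = function(x) x)

# Hypothetical helper: every pattern name must also appear among the fitters.
checkFitters <- function(patternFunctions, patternFitters, requireFitters = TRUE) {
  missingFitters <- setdiff(names(patternFunctions), names(patternFitters))
  if (requireFitters && length(missingFitters) > 0) {
    stop("No fitter for pattern(s): ", paste(missingFitters, collapse = ", "))
  }
  invisible(missingFitters)
}

# The additive pattern has no fitter, so the check stops with an error.
result <- tryCatch(
  checkFitters(patternFunctions, patternFitters),
  error = function(e) conditionMessage(e)
)
print(result)
```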
The function performs the following steps:
1. Validates the input simplica object and data dimensions
2. Checks that all pattern functions have corresponding fitters
3. For each simplivariate component, performs cross-validation pattern evaluation
4. Selects the best performing pattern based on RMSE and parsimony
5. Updates component patterns and provides detailed test information
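The evaluation and selection steps above can be sketched in base R for a single component. This is an illustration under stated assumptions rather than the package's implementation: the toy data, the two fitters (constant, additive), the cell-wise hold-out scheme, and the selection rule (first pattern in preference order whose mean RMSE is within a relative parsimonyMargin of the best) are all assumptions made for the example.

```r
set.seed(1)

# Toy component: an additive block (row effect + column effect) plus noise.
nR <- 8; nC <- 6
X <- outer(seq_len(nR), seq_len(nC), "+") + matrix(rnorm(nR * nC, sd = 0.3), nR, nC)

# Hypothetical fitters: each takes a matrix with NA at held-out cells and
# returns fitted values for every cell.
fitters <- list(
  constant = function(M) matrix(mean(M, na.rm = TRUE), nrow(M), ncol(M)),
  additive = function(M) {
    mu <- mean(M, na.rm = TRUE)
    rowEff <- rowMeans(M, na.rm = TRUE) - mu
    colEff <- colMeans(M, na.rm = TRUE) - mu
    rowEff[is.nan(rowEff)] <- 0   # row fully held out: fall back to no effect
    colEff[is.nan(colEff)] <- 0   # column fully held out: fall back to no effect
    mu + outer(rowEff, colEff, "+")
  }
)

nRepeats <- 40; testFraction <- 0.2; parsimonyMargin <- 0.05
preferenceOrder <- names(fitters)   # simplest pattern first

rmse <- matrix(NA_real_, nRepeats, length(fitters),
               dimnames = list(NULL, names(fitters)))
for (r in seq_len(nRepeats)) {
  testIdx <- sample(length(X), size = round(testFraction * length(X)))
  train <- X
  train[testIdx] <- NA              # hide the test cells from the fitter
  for (p in names(fitters)) {
    fitted <- fitters[[p]](train)
    rmse[r, p] <- sqrt(mean((X[testIdx] - fitted[testIdx])^2))
  }
}

cvMean <- colMeans(rmse)
# Assumed rule: first pattern in preference order whose mean RMSE is within
# parsimonyMargin (relative) of the best mean RMSE.
eligible <- cvMean[preferenceOrder] <= min(cvMean) * (1 + parsimonyMargin)
selected <- preferenceOrder[which(eligible)[1]]
print(round(cvMean, 3))
print(selected)
```

Because the toy block is additive by construction, the constant pattern falls well outside the margin and the additive pattern is selected. With a larger margin, or data with weaker row and column effects, the simpler pattern earlier in preferenceOrder would be chosen instead.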