cvq2( data, formula = NULL, nGroup = N, nRun = 1,
round = 4, extOut = FALSE, extOutFile = NULL )
If extOutFile is not specified, write to stdout().
extOut = TRUE ), DEFAULT: NULL
result$cv with following elements:
result$fit with following elements:
nGroup
training and test sets.
The given data set of $N$ elements is split into nGroup groups; in turn, one group serves as the test set and the remaining groups are merged into the training set.
Each group consists of at most $k$ elements:
$$k = \left\lceil\frac{N}{nGroup}\right\rceil$$
In general, each test set has size $k$ and the corresponding training set has size $N-k$.
If $\frac{N}{nGroup}$ is not an integer, some groups consist of $k-1$ elements.
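The group sizes described above can be illustrated with a small base R sketch. It assumes a balanced split in which the larger groups receive $k$ elements and the rest $k-1$; the variable names are hypothetical and the code is not taken from the package itself.

```r
# Hypothetical example: N = 10 elements split into nGroup = 3 groups
N <- 10
nGroup <- 3

# Maximum group size, k = ceiling(N / nGroup)
k <- ceiling(N / nGroup)

# Assumed balanced split: the first (N %% nGroup) groups get one extra element
base_size <- N %/% nGroup
extra <- N %% nGroup
sizes <- rep(base_size, nGroup) + (seq_len(nGroup) <= extra)

print(k)      # maximum group size
print(sizes)  # size of each group; the sizes sum to N
```

Here one group holds $k = 4$ elements and the other two hold $k-1 = 3$, so the training sets have 6 or 7 elements respectively.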
For each test set, a model is built from the training set (the remaining values) and used to predict the observed values of the test set.
This model differs slightly from the model used for the $r^2$ calculation because the $k$ test-set values are missing from its training data.
The differences between the predictions and the observations are used to calculate the PREdictive residual Sum of Squares (PRESS).
Furthermore, for each training set, the mean of the observed values, $y_{mean}^{N-k,i}$, is calculated.
With PRESS and $y_{mean}^{N-k,i}$, the modified $q^2_{cv}$ equation is used to calculate the predictive squared correlation coefficient.
Additionally, the conventional squared correlation coefficient, $r^2$, is calculated with a linear regression for the entire data set.
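The procedure above can be sketched in base R without the package. This is a minimal illustration, not the implementation used by cvq2: the data are synthetic, the random balanced group assignment is an assumption, and $q^2_{cv}$ is taken to be one minus PRESS divided by the sum of squared deviations of each test observation from the mean of its training set.

```r
set.seed(1)

# Synthetic, hypothetical data set with N = 12 observations
N <- 12
d <- data.frame(x = seq_len(N))
d$y <- 2 * d$x + rnorm(N)

# Assumed random, balanced assignment of the rows to nGroup groups
nGroup <- 4
grp <- sample(rep(seq_len(nGroup), length.out = N))

press <- 0   # PREdictive residual Sum of Squares
ss_ref <- 0  # squared deviations from the training-set means

for (g in seq_len(nGroup)) {
  test <- d[grp == g, ]
  train <- d[grp != g, ]

  # Model built from the training set only
  fit <- lm(y ~ x, data = train)
  pred <- predict(fit, newdata = test)

  press <- press + sum((pred - test$y)^2)
  # Reference term uses the mean of the training set, not of the full data
  ss_ref <- ss_ref + sum((test$y - mean(train$y))^2)
}

# Assumed form of the modified q^2_cv equation
q2 <- 1 - press / ss_ref

# Conventional r^2 from a linear regression on the entire data set
r2 <- summary(lm(y ~ x, data = d))$r.squared

print(c(q2 = q2, r2 = r2))
```

Repeating the loop with a fresh random assignment each time corresponds to the idea behind nRun values greater than 1.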
library(cvq2)
data(cvq2.setA)
result <- cvq2( cvq2.setA, y ~ x1 + x2 )
result
data(cvq2.setB)
result <- cvq2( cvq2.setB, y ~ x, nGroup = 3 )
result
data(cvq2.setB)
result <- cvq2( cvq2.setB, y ~ x, nGroup = 3, nRun = 5 )
result