looq2( modelData, formula = NULL, round = 4, extOut = FALSE,
extOutFile = NULL )
cvq2( modelData, formula = NULL, nFold = N, nRun = 1,
round = 4, extOut = FALSE, extOutFile = NULL )
q2( modelData, predictData, formula = NULL, round = 4,
extOut = FALSE, extOutFile = NULL )
nFold
The data set modelData is randomly partitioned into nFold equal-sized subsets (test sets) during each run of cross-validation. DEFAULT: N
extOut
If extOutFile is not specified, write to stdout()
extOutFile
(... extOut = TRUE), DEFAULT: NULL
q2()-method
q2 returns an object of class "q2". It contains information about the model calibration and its prediction performance on the external data set.
cvq2()-method, looq2()-method
cvq2 and looq2 return an object of class "cvq2". It contains information about the model calibration and its prediction performance on the model data set. Furthermore, this object contains data about the cross-validation applied to the model data set.
The model calibration with modelData, including the conventional squared correlation coefficient, $r^2$, is calculated with a linear regression.
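As a rough illustration of this calibration step, the sketch below fits a linear regression with base R's lm() and reads off $r^2$. The data are synthetic and made up for this example, and the use of lm() is an assumption about how such a calibration could be done, not a statement about the internals of cvq2.

```r
# Synthetic data, invented for illustration (not a cvq2 data set)
set.seed(1)
modelData <- data.frame(x = 1:10)
modelData$y <- 2 * modelData$x + rnorm(10)

# Calibrate the model M on all N observations
fit <- lm(y ~ x, data = modelData)

# Conventional squared correlation coefficient as reported by lm()
r2_lm <- summary(fit)$r.squared

# The same value from its definition: 1 - RSS / TSS
rss <- sum(residuals(fit)^2)
tss <- sum((modelData$y - mean(modelData$y))^2)
r2_manual <- 1 - rss / tss
```

Both r2_lm and r2_manual hold the same number, which shows that $r^2$ is referenced to the mean of all observations in modelData.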
q2()-method
qsq(), qsquare()
The model described by modelData is used to predict the observations of predictData. These predictions are used in the $q^2_{tr}$ equation to calculate the predictive squared correlation coefficient.
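The external prediction described above can be sketched in base R as follows. The data are synthetic, and the formula used for q2_tr (prediction errors on predictData, referenced to the mean of the observations in modelData) is an assumption made for this sketch, since this page does not spell out the $q^2_{tr}$ equation.

```r
# Synthetic model and prediction sets, invented for illustration
set.seed(2)
modelData <- data.frame(x = 1:10)
modelData$y <- 3 * modelData$x + rnorm(10)
predictData <- data.frame(x = 11:15)
predictData$y <- 3 * predictData$x + rnorm(5)

# Model described by modelData
fit <- lm(y ~ x, data = modelData)

# Predict the observations of predictData
y_pred <- predict(fit, newdata = predictData)

# Sum of squared prediction errors on the external set
press <- sum((y_pred - predictData$y)^2)

# ASSUMPTION: the reference sum of squares uses the training set mean
ss_ref <- sum((predictData$y - mean(modelData$y))^2)

q2_tr <- 1 - press / ss_ref
```

With the package itself, the corresponding call would be q2( modelData, predictData, y ~ x ), as in the usage section.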
cvq2()-method
cvqsq(), cvqsquare()
A cross-validation is performed for modelData: modelData ($N$ elements) is split into nFold disjoint, equal-sized test sets (subsets).
Each test set consists of $k$ elements:
$$k = \left\lceil\frac{N}{nFold}\right\rceil$$
If $\frac{N}{nFold}$ is not an integer, some test sets consist of $k-1$ elements.
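A small worked case may help with the test set sizes. The sketch below assumes that observations are spread over the test sets as evenly as possible. This is one way to satisfy the description above, not necessarily the assignment scheme used by cvq2.

```r
N <- 10       # number of observations in modelData
nFold <- 3    # number of test sets

# Largest test set size
k <- ceiling(N / nFold)

# ASSUMPTION: assign fold labels 1..nFold as evenly as possible,
# then shuffle them to obtain a random partition
set.seed(3)
fold_id <- sample(rep(seq_len(nFold), length.out = N))

# Number of observations per test set
fold_sizes <- as.integer(table(fold_id))
```

Here k is 4 and fold_sizes is 4, 3, 3: one test set has $k$ elements and two have $k-1$, and together they cover all $N = 10$ observations.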
The remaining $N-k$ elements are merged into the training set for this test set and define the model M'.
This model is used to predict the observations in the test set.
Note that M' differs slightly from the model M used for the $r^2$ calculation, because the $k$ values of the test set are missing.
Each observation from modelData is predicted once.
The difference between the prediction and the observation within the test sets is used to calculate the PREdictive residual Sum of Squares (PRESS).
Furthermore, for each training set, the mean of the observed values, $y_{mean}^{N-k,i}$, is calculated.
With PRESS and $y_{mean}^{N-k,i}$, the modified $q^2_{cv}$ equation is used to calculate the predictive squared correlation coefficient.
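The procedure above can be sketched as a single cross-validation run in base R. The data are synthetic, and the exact form of q2_cv (PRESS divided by squared deviations from the respective training set mean) is an assumption based on the description above. It is not copied from the cvq2 source.

```r
# Synthetic data, invented for illustration
set.seed(4)
N <- 12
modelData <- data.frame(x = seq_len(N))
modelData$y <- 1.5 * modelData$x + rnorm(N)

# Random partition into nFold disjoint test sets
nFold <- 4
fold_id <- sample(rep(seq_len(nFold), length.out = N))

press <- 0        # PREdictive residual Sum of Squares
ss_ref <- 0       # squared deviations from the training set means
n_predicted <- 0  # counts how many observations were predicted

for (f in seq_len(nFold)) {
  test <- modelData[fold_id == f, ]
  train <- modelData[fold_id != f, ]

  # Model M', calibrated without the test set
  fit_f <- lm(y ~ x, data = train)
  y_pred <- predict(fit_f, newdata = test)

  press <- press + sum((y_pred - test$y)^2)

  # ASSUMPTION: reference each observation to its training set mean
  ss_ref <- ss_ref + sum((test$y - mean(train$y))^2)

  n_predicted <- n_predicted + nrow(test)
}

q2_cv <- 1 - press / ss_ref
```

After the loop, n_predicted equals N, which reflects that each observation is predicted exactly once within one run.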
If $k > 1$, the cross-validation can be repeated to reduce the bias introduced by a single random partition.
To this end, the test sets are compiled anew at random in each iteration (nRun $= 1 \ldots x$).
Within one iteration, each observation is predicted once.
If nFold $= N$, only one iteration is needed, because every test set then holds exactly one observation and the partition is unique.
looq2()-method
Leave-one-out cross-validation requires one iteration (nRun = 1) only.
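library(cvq2)
# Cross-validation with default settings (nFold = N, nRun = 1)
data(cvq2.setA)
result <- cvq2( cvq2.setA, y ~ x1 + x2 )
result
# 3-fold cross-validation, single run
data(cvq2.setB)
result <- cvq2( cvq2.setB, y ~ x, nFold = 3 )
result
# 3-fold cross-validation, repeated in 5 runs
data(cvq2.setB)
result <- cvq2( cvq2.setB, y ~ x, nFold = 3, nRun = 5 )
result
# Leave-one-out cross-validation
data(cvq2.setA)
result <- looq2( cvq2.setA, y~x1+x2 )
result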
# External validation: model from cvq2.setA, predictions for cvq2.setA_pred
data(cvq2.setA)
data(cvq2.setA_pred)
result <- q2( cvq2.setA, cvq2.setA_pred, y~x1+x2 )
result