
rminer (version 1.1)

holdout: Computes indexes for holdout data split into training and test sets.

Description

Computes indexes for holdout data split into training and test sets.

Usage

holdout(y, ratio = 2/3, internalsplit = FALSE, mode = "random", iter = 1)

Arguments

y
desired target: numeric vector; or factor -- then a stratified holdout is applied (i.e. the proportions of the classes are the same for each set).
ratio
split ratio, given either as a fraction (e.g. 2/3), which sets the training set size, or as a total number of examples (e.g. 2), which sets the test set size.
internalsplit
if TRUE then the training data is further split into training and validation sets. The same ratio parameter is used for the internal split.
mode
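sampling mode. Options are:
  • random -- standard randomized holdout;
  • order -- static mode, where the first examples are used for training and the later ones for testing (useful for time series data);
  • incremental -- incremental retraining mode, where iter selects the current iteration (see the last example below).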
iter
iteration of the incremental retraining mode (only used when mode=="incremental", typically iter is set within a cycle, see the example below).
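
The two ways of giving ratio can be illustrated for the ordered case with a small base R sketch. The helper order_split is hypothetical and is not part of rminer. It assumes that a ratio of 1 or more is a test set size and that set sizes are rounded to the nearest integer, so the package may differ in detail.

```r
# Hypothetical helper (not part of rminer): ordered holdout split of n examples.
order_split <- function(n, ratio = 2/3) {
  # Assumption: ratio >= 1 is a number of test examples,
  # which is converted here into a training fraction.
  if (ratio >= 1) ratio <- (n - ratio) / n
  n_train <- round(ratio * n)            # assumed rounding rule
  tr <- seq_len(n_train)                 # first examples: training
  ts <- setdiff(seq_len(n), tr)          # remaining examples: test
  list(tr = tr, ts = ts)
}

s1 <- order_split(10, ratio = 2/3)  # fraction: sets the training set size
s2 <- order_split(10, ratio = 2)    # count: sets the test set size
print(s1)
print(s2)
```

In this sketch a ratio of 2/3 on 10 examples gives 7 training and 3 test indexes, while a ratio of 2 keeps the last 2 examples for testing.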

Value

  • A list with the components:
    • $tr -- numeric vector with the indexes of the training examples;
    • $ts -- numeric vector with the indexes of the test examples;
    • $itr -- numeric vector with the indexes of the internal training examples (when internalsplit=TRUE);
    • $val -- numeric vector with the indexes of the internal validation examples (when internalsplit=TRUE).
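
How the four components relate to each other can be sketched in base R for the ordered case. The helper nested_order_split is hypothetical (not rminer code). It assumes the same ratio is reused for the internal split, as described for internalsplit, and that sizes are rounded to the nearest integer.

```r
# Hypothetical helper (not part of rminer): ordered split plus internal split.
nested_order_split <- function(n, ratio = 2/3) {
  n_tr <- round(ratio * n)           # size of the full training set (assumed rounding)
  tr <- seq_len(n_tr)
  ts <- setdiff(seq_len(n), tr)      # everything after the training block
  n_itr <- round(ratio * n_tr)       # same ratio applied inside the training set
  itr <- tr[seq_len(n_itr)]          # internal training examples
  val <- setdiff(tr, itr)            # internal validation examples
  list(tr = tr, itr = itr, val = val, ts = ts)
}

h <- nested_order_split(10)
print(h)
# itr and val together cover tr, and tr and ts never overlap
print(identical(sort(c(h$itr, h$val)), h$tr))
print(length(intersect(h$tr, h$ts)) == 0)
```

The point of the sketch is the nesting: itr and val partition tr, so a model can be tuned on itr and val and then refit on all of tr before being scored on ts.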

Details

Computes indexes for holdout data split into training and test sets (if y is a factor then a stratified holdout is applied).
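
A stratified holdout can be sketched in base R by sampling within each class separately. The helper stratified_split below is hypothetical and is not rminer's implementation. It only illustrates the idea that each class contributes the same proportion of its examples to the training set.

```r
# Hypothetical helper (not rminer's implementation): stratified random holdout.
stratified_split <- function(y, ratio = 2/3) {
  idx_by_class <- split(seq_along(y), y)        # indexes grouped by class
  tr <- unlist(lapply(idx_by_class, function(idx) {
    n_train <- round(ratio * length(idx))       # per-class training size
    idx[sample.int(length(idx), n_train)]       # sample positions, then map to indexes
  }), use.names = FALSE)
  tr <- sort(tr)
  list(tr = tr, ts = setdiff(seq_along(y), tr))
}

set.seed(1)  # only to make the illustration reproducible
s <- stratified_split(iris$Species, ratio = 2/3)
print(table(iris$Species[s$tr]))  # class counts in the training set
print(table(iris$Species[s$ts]))  # class counts in the test set
```

With the 50 examples per class in iris, this sketch places 33 examples of each species in the training set and 17 in the test set, so both sets keep the original class proportions.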

References

See fit.

See Also

fit, predict.fit, mining, mgraph, mmetric, savemining, Importance.

Examples

### simple examples:
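# ratio=2 is a total number of examples: sets the test set size
H=holdout(1:10,ratio=2,internalsplit=TRUE,mode="order")
print(H)
# ratio=2/3 is a fraction: sets the training set size
H=holdout(1:10,ratio=2/3,internalsplit=TRUE,mode="order")
print(H)
# random mode: the indexes are sampled
H=holdout(1:10,ratio=2/3,internalsplit=TRUE,mode="random")
print(H)
# same call again: a random split can differ between runs
H=holdout(1:10,ratio=2/3,internalsplit=TRUE,mode="random")
print(H)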

### classification example
data(iris)
# random stratified holdout (y is a factor)
H=holdout(iris$Species,ratio=2/3,internalsplit=TRUE)
print(summary(iris[H$itr,]))
print(summary(iris[H$val,]))
print(summary(iris[H$tr,]))
print(summary(iris[H$ts,]))
M=fit(Species~.,iris[H$tr,],model="dt") # training data only
P=predict(M,iris[H$ts,]) # test data
print(mmetric(iris$Species[H$ts],P,"CONF"))

### regression example with incremental training
ts=c(1,4,7,2,5,8,3,6,9,4,7,10,5,8,11,6,9)
d=CasesSeries(ts,c(1,2,3))
for(b in 1:3) # iterations
{
 H=holdout(d$y,ratio=4,mode="incremental",iter=b)
 print(H)
 M=fit(y~.,d[H$tr,],model="mlpe",search=2)
 P=predict(M,d[H$ts,])
 cat("batch :",b,"TR size:",length(H$tr),"TS size:",
     length(H$ts),"mae:",mmetric(d$y[H$ts],P,"MAE"),"\n")
}
