hitRate(matrixPIT, interval = c(0.25, 0.75))
matrixPIT: An ntest-by-nForecaster matrix of PIT values, where ntest is the number of rows in the testing set and nForecaster is the number of forecasters. Each column holds a different forecaster's PITs for the testing set. A PIT value is the forecaster's cdf evaluated at the realization of the response in the testing set.
interval: The prediction interval, given as a lower and an upper probability. The default, interval = c(0.25, 0.75), is the central 50% prediction interval.
Value: A vector of length nForecaster containing the empirical hit rates, one for each forecaster. A forecaster's empirical hit rate is the percentage of PIT values that fall within [interval[1], interval[2]], e.g., [0.25, 0.75] under the default.
See also: trimTrees, cinbag
# Load the packages (mlbench provides mlbench.friedman1)
library(trimTrees)
library(mlbench)
# Load the data
set.seed(201) # Can be removed; useful for replication
data <- as.data.frame(mlbench.friedman1(500, sd=1))
summary(data)
# Prepare data for trimming
train <- data[1:400, ]
test <- data[401:500, ]
xtrain <- train[,-11]
ytrain <- train[,11]
xtest <- test[,-11]
ytest <- test[,11]
# Run trimTrees
set.seed(201) # Can be removed; useful for replication
tt <- trimTrees(xtrain, ytrain, xtest, ytest, trim=0.15)
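# Outputs from trimTrees
mean(hitRate(tt$treePITs))          # average hit rate across the individual trees
hitRate(tt$trimmedEnsemblePITs)     # hit rate of the trimmed ensemble
hitRate(tt$untrimmedEnsemblePITs)   # hit rate of the untrimmed ensemble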
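To make the definition of the empirical hit rate concrete, the following is a minimal base-R sketch that does not use the package. The helper hit_rate_sketch is hypothetical and is not the package's implementation: it assumes the interval endpoints are inclusive and it returns a proportion between 0 and 1 rather than a percentage. The toy PIT values and the standard normal forecaster are likewise made up for illustration.

```r
# Hypothetical helper (not part of trimTrees): share of each forecaster's
# PITs that fall inside the interval. Assumes inclusive endpoints and
# returns a proportion in [0, 1].
hit_rate_sketch <- function(matrixPIT, interval = c(0.25, 0.75)) {
  matrixPIT <- as.matrix(matrixPIT)  # a plain vector becomes one column
  inside <- matrixPIT >= interval[1] & matrixPIT <= interval[2]
  colMeans(inside)                   # column-wise share of TRUE values
}

# Toy PIT matrix: 4 test cases (rows), 2 forecasters (columns)
pits <- cbind(
  forecasterA = c(0.10, 0.30, 0.50, 0.90),
  forecasterB = c(0.26, 0.40, 0.60, 0.74)
)
rates <- hit_rate_sketch(pits)
print(rates)  # forecasterA: 0.5, forecasterB: 1.0

# A PIT is the forecaster's cdf evaluated at the realized response.
# Here the (assumed) forecaster predicts a standard normal distribution.
y <- c(-2, 0, 0.5)
pit_normal <- pnorm(y, mean = 0, sd = 1)
hit_rate_sketch(pit_normal)  # 2 of the 3 PITs fall in [0.25, 0.75]
```

A forecaster whose predictive distributions are well calibrated should have a hit rate close to the nominal coverage of the interval, which is 0.5 for the default central 50% interval.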