best.first.search: Best-first search

Description

An algorithm for searching the attribute subset space.

Usage

best.first.search(attributes, eval.fun, max.backtracks = 5)

Arguments

attributes

a character vector of all attributes to search in

eval.fun

a function taking as its first parameter a character vector of attributes (the candidate subset) and returning a numeric value indicating how important that subset is

max.backtracks

an integer indicating the maximum allowed number of backtracks; the default is 5

Value

A character vector of selected attributes.

Details

The algorithm is similar to forward.search, except that it chooses the best node from all already evaluated ones and evaluates it. The selection of the best node is repeated approximately max.backtracks times when no better node is found.
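
The behaviour described above can be illustrated with a minimal sketch. This is not the package's implementation: the names best_first_sketch and toy.score are hypothetical, and the sketch assumes that subsets grow one attribute at a time and that a "backtrack" is any step where the chosen node does not beat the best score seen so far.

```r
# Hypothetical helper, not part of FSelector: a simplified best-first search.
best_first_sketch <- function(attributes, eval.fun, max.backtracks = 5) {
  # Order-independent key, so c("a", "b") and c("b", "a") are the same node.
  key <- function(s) paste(sort(s), collapse = "\r")

  # Open list: evaluated subsets that have not been expanded yet.
  open <- as.list(attributes)
  scores <- vapply(open, eval.fun, numeric(1))
  seen <- vapply(open, key, character(1))

  best <- character(0)
  best.score <- -Inf
  backtracks <- 0

  while (length(open) > 0) {
    # Choose the best node among all evaluated, unexpanded ones.
    i <- which.max(scores)
    node <- open[[i]]
    node.score <- scores[i]
    open <- open[-i]
    scores <- scores[-i]

    if (node.score > best.score) {
      best <- node
      best.score <- node.score
      backtracks <- 0
    } else {
      # No improvement: count a backtrack and stop once the limit is exceeded.
      backtracks <- backtracks + 1
      if (backtracks > max.backtracks) break
    }

    # Expand the node: add one unused attribute at a time, skipping known subsets.
    for (a in setdiff(attributes, node)) {
      child <- c(node, a)
      k <- key(child)
      if (!(k %in% seen)) {
        seen <- c(seen, k)
        open <- c(open, list(child))
        scores <- c(scores, eval.fun(child))
      }
    }
  }
  sort(best)
}

# Toy evaluator (an assumption for illustration): reward "a" and "b",
# and penalise subset size slightly.
toy.score <- function(subset) sum(subset %in% c("a", "b")) - 0.1 * length(subset)
print(best_first_sketch(c("a", "b", "c", "d"), toy.score))
```

With this toy evaluator the sketch settles on the subset containing "a" and "b". Because the next node is always taken from the whole open list rather than only from the children of the current node, the search can return to an earlier branch when the current one stops improving.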

Examples


# NOT RUN {
  library(FSelector)  # provides best.first.search and as.simple.formula
  library(rpart)
  data(iris)
  
  # Score an attribute subset by the cross-validated accuracy of an rpart tree.
  evaluator <- function(subset) {
    # k-fold cross-validation: random draws in [0, 1) assign rows to k folds
    k <- 5
    splits <- runif(nrow(iris))
    results <- sapply(1:k, function(i) {
      test.idx <- (splits >= (i - 1) / k) & (splits < i / k)
      train.idx <- !test.idx
      test <- iris[test.idx, , drop=FALSE]
      train <- iris[train.idx, , drop=FALSE]
      tree <- rpart(as.simple.formula(subset, "Species"), train)
      error.rate <- sum(test$Species != predict(tree, test, type="class")) / nrow(test)
      return(1 - error.rate)
    })
    print(subset)
    print(mean(results))
    return(mean(results))
  }
  
  subset <- best.first.search(names(iris)[-5], evaluator)
  f <- as.simple.formula(subset, "Species")
  print(f)

  
# }
