rgp (version 0.4-1)

dataDrivenGeneticProgramming: Data-driven untyped standard genetic programming


Perform an untyped genetic programming run using a fitness function that depends on an R data frame. Typical applications are data mining tasks such as symbolic regression or classification. The task is specified as a formula and a fitness function factory. Only simple formulas without interactions are supported. The result of the data-driven GP run is a model structure containing the formula and an untyped GP population. This function is primarily a building block for extensions; end users will probably prefer more specialized GP tools such as symbolicRegression.


dataDrivenGeneticProgramming(formula, data, fitnessFunctionFactory, fitnessFunctionFactoryParameters = list(), stopCondition = makeTimeStopCondition(5), population = NULL, populationSize = 100, eliteSize = ceiling(0.1 * populationSize), elite = list(), extinctionPrevention = FALSE, archive = FALSE, functionSet = mathFunctionSet, constantSet = numericConstantSet, crossoverFunction = NULL, mutationFunction = NULL, restartCondition = makeEmptyRestartCondition(), restartStrategy = makeLocalRestartStrategy(), searchHeuristic = makeAgeFitnessComplexityParetoGpSearchHeuristic(), breedingFitness = function(individual) TRUE, breedingTries = 50, progressMonitor = NULL, verbose = TRUE)


formula: A formula describing the task. Only simple formulas of the form response ~ variable1 + ... + variableN are currently supported.
data: A data.frame containing training data for the GP run. The variables in formula must match column names in this data frame.
fitnessFunctionFactory: A function that accepts two parameters, a formula and data (given as a model frame), plus the additional parameters given in fitnessFunctionFactoryParameters, and returns a fitness function.
fitnessFunctionFactoryParameters: Additional parameters to pass to the fitnessFunctionFactory.
stopCondition: The stop condition for the evolution main loop. See makeStepsStopCondition for details.
population: The GP population to start the run with. If this parameter is missing, a new GP population of size populationSize is created through random growth.
populationSize: The number of individuals if a population is to be created.
eliteSize: The number of elite individuals to keep. Defaults to ceiling(0.1 * populationSize).
elite: The elite list. Must be a list of individuals sorted in ascending order by their first fitness component.
extinctionPrevention: When set to TRUE, the initialization and selection steps will try to prevent duplicate individuals from occurring in the population. Defaults to FALSE, as this operation might be expensive with larger population sizes.
archive: If set to TRUE, all GP individuals evaluated are stored in an archive list archiveList that is returned as part of the result of this function.
functionSet: The function set.
constantSet: The set of constant factory functions.
crossoverFunction: The crossover function.
mutationFunction: The mutation function.
restartCondition: The restart condition for the evolution main loop. See makeEmptyRestartCondition for details.
restartStrategy: The strategy for doing restarts. See makeLocalRestartStrategy for details.
searchHeuristic: The search heuristic (i.e., optimization algorithm) used to search for solutions. See the documentation for searchHeuristics for available algorithms.
breedingFitness: A "breeding" function. This function is applied after every stochastic operation Op that creates or modifies an individual (typically, Op is an initialization, mutation, or crossover operation). If the breeding function returns TRUE on the given individual, Op is considered a success. If the breeding function returns FALSE, Op is retried a maximum of breedingTries times. If this maximum number of retries is exceeded, the result of the last try is taken as the result of Op. If the breeding function returns a numeric value, the breeding is repeated breedingTries times and the individual with the lowest breeding fitness is taken as the result of Op.
breedingTries: In the case of a boolean breedingFitness function, the maximum number of retries. In the case of a numerical breedingFitness function, the number of breeding steps. See also the documentation for the breedingFitness parameter. Defaults to 50.
progressMonitor: A function of signature function(population, fitnessfunction, stepNumber, evaluationNumber, bestFitness, timeElapsed) to be called with each evolution step.
verbose: Whether to print progress messages.
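
The fitness function factory contract described above can be sketched in plain R, without calling rgp. This is an illustration only. The name makeRmseFitnessFunctionSketch, its scale parameter, and the assumption that an individual is an ordinary R function taking the predictor columns as positional arguments are hypothetical and not taken from the package documentation.

```r
# Hypothetical factory: takes a formula, a model frame, and one extra
# parameter (scale), and returns a fitness function (lower is better).
# The formula is unused here because the model frame already encodes it.
makeRmseFitnessFunctionSketch <- function(formula, data, scale = 1) {
  response <- model.response(data)
  predictors <- unname(as.list(data[, -1, drop = FALSE]))
  function(individual) {
    # Assumption: an individual is a plain R function of the predictors.
    predictions <- do.call(individual, predictors)
    scale * sqrt(mean((response - predictions)^2))
  }
}

trainingData <- data.frame(x = 1:4, y = 2 * (1:4))
modelFrame <- model.frame(y ~ x, trainingData)

# Extra parameters are supplied as a list, in the spirit of
# fitnessFunctionFactoryParameters.
extraParameters <- list(scale = 1)
fitnessFunction <- do.call(makeRmseFitnessFunctionSketch,
                           c(list(y ~ x, modelFrame), extraParameters))

perfectFitness <- fitnessFunction(function(x) 2 * x)  # exact model
identityFitness <- fitnessFunction(function(x) x)     # poor model
```

Returning a closure lets the factory do the data preparation once, so each fitness evaluation only has to call the individual and compare predictions.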

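The retry behavior governed by breedingFitness and breedingTries can be sketched as follows. This is not rgp's internal code. The helper names breedSketch and makeCounter are hypothetical, and the sketch assumes that "retried a maximum of breedingTries times" means one initial attempt plus up to breedingTries retries, while the numeric case evaluates breedingTries candidates in total.

```r
# Hypothetical sketch of the breeding loop around a stochastic operation op.
breedSketch <- function(op, breedingFitness, breedingTries = 50) {
  candidate <- op()
  value <- breedingFitness(candidate)
  if (is.logical(value)) {
    # Logical case: retry until accepted or until the retries run out;
    # the last candidate is returned either way.
    retries <- 0
    while (!isTRUE(value) && retries < breedingTries) {
      candidate <- op()
      value <- breedingFitness(candidate)
      retries <- retries + 1
    }
    return(candidate)
  }
  # Numeric case: keep the candidate with the lowest breeding fitness.
  best <- candidate
  bestValue <- value
  for (i in seq_len(max(breedingTries - 1, 0))) {
    candidate <- op()
    value <- breedingFitness(candidate)
    if (value < bestValue) {
      best <- candidate
      bestValue <- value
    }
  }
  best
}

# Deterministic stand-in for a stochastic operation: yields 1, 2, 3, ...
makeCounter <- function() {
  n <- 0
  function() {
    n <<- n + 1
    n
  }
}

accepted <- breedSketch(makeCounter(), function(x) x >= 3, breedingTries = 50)
exhausted <- breedSketch(makeCounter(), function(x) x >= 3, breedingTries = 1)
lowest <- breedSketch(makeCounter(), function(x) abs(x - 4), breedingTries = 10)
```

With the default breedingFitness of function(individual) TRUE, the first candidate is always accepted, so no retries occur.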

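A progress monitor only has to match the signature listed for progressMonitor. The sketch below builds one that records a summary line every few steps. The factory name makeLoggingProgressMonitorSketch, its every parameter, and the argument values used in the demonstration loop are hypothetical; the types rgp actually passes for each argument are not stated in this page.

```r
# Hypothetical factory for a progress monitor that records a summary
# line on every `every`-th step instead of reacting to each step.
makeLoggingProgressMonitorSketch <- function(every = 10) {
  entries <- character(0)
  monitor <- function(population, fitnessfunction, stepNumber,
                      evaluationNumber, bestFitness, timeElapsed) {
    if (stepNumber %% every == 0) {
      entries <<- c(entries, sprintf(
        "step %d: %d evaluations, best fitness %g, elapsed %g",
        as.integer(stepNumber), as.integer(evaluationNumber),
        bestFitness, timeElapsed))
    }
    invisible(NULL)
  }
  list(monitor = monitor, getLog = function() entries)
}

# Simulate four evolution steps with made-up values.
logger <- makeLoggingProgressMonitorSketch(every = 2)
for (step in 1:4) {
  logger$monitor(population = list(), fitnessfunction = NULL,
                 stepNumber = step, evaluationNumber = step * 100,
                 bestFitness = 1 / step, timeElapsed = step * 0.5)
}
progressLog <- logger$getLog()
```

Under these assumptions, logger$monitor would be the value supplied as progressMonitor.
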
A model structure that contains the formula and an untyped GP population.

