create_pyDarwinOptions: Create pyDarwin Options

Description

Generates a list of parameters to be used in a pyDarwin run.

Usage

create_pyDarwinOptions(
  author = "",
  project_name = NULL,
  algorithm = c("GA", "EX", "MOGA", "MOGA3", "GP", "RF", "GBRT", "PSO"),
  GA = pyDarwinOptionsGA(),
  MOGA = pyDarwinOptionsMOGA(),
  PSO = pyDarwinOptionsPSO(),
  random_seed = 11,
  num_parallel = 4,
  num_generations = 6,
  population_size = 4,
  num_opt_chains = 4,
  exhaustive_batch_size = 100,
  crash_value = 99999999,
  penalty = pyDarwinOptionsPenalty(),
  effect_limit = -1,
  downhill_period = 2,
  num_niches = 2,
  niche_radius = 2,
  local_2_bit_search = TRUE,
  final_downhill_search = TRUE,
  local_grid_search = FALSE,
  max_local_grid_search_bits = 5,
  search_omega_blocks = FALSE,
  search_omega_bands = FALSE,
  individual_omega_search = TRUE,
  search_omega_sub_matrix = FALSE,
  max_omega_sub_matrix = 4,
  model_run_timeout = 1200,
  model_run_priority_class = c("below_normal", "normal"),
  postprocess = pyDarwinOptionsPostprocess(),
  keep_key_models = TRUE,
  keep_best_models = TRUE,
  rerun_key_models = FALSE,
  rerun_front_models = TRUE,
  use_saved_models = FALSE,
  saved_models_file = "{working_dir}/models0.json",
  saved_models_readonly = FALSE,
  remove_run_dir = FALSE,
  remove_temp_dir = FALSE,
  keep_files = c("dmp.txt", "posthoc.csv"),
  keep_extensions = NULL,
  use_system_options = TRUE,
  model_cache = "darwin.MemoryModelCache",
  model_run_man = c("darwin.LocalRunManager", "darwin.GridRunManager"),
  engine_adapter = c("nlme", "nonmem"),
  skip_running = FALSE,
  working_dir = NULL,
  data_dir = NULL,
  output_dir = "{working_dir}/output",
  temp_dir = NULL,
  key_models_dir = "{working_dir}/key_models",
  non_dominated_models_dir = "{working_dir}/non_dominated_models",
  nlme_dir = "C:/Program Files/Certara/NLME_Engine",
  gcc_dir = "C:/Program Files/Certara/mingw64",
  nmfe_path = NULL,
  rscript_path = file.path(normalizePath(R.home("bin")), "Rscript"),
  generic_grid_adapter = pyDarwinOptionsGridAdapter(),
  remote_run = FALSE,
  ...
)

Value

A list of pyDarwin options.

Arguments

author: Character string: The name of the author.
project_name: Character string (optional): The name of the project. If not specified, pyDarwin will set its value to the name of the parent folder of the options file.
algorithm: Character string: One of GA, EX, MOGA, MOGA3, GP, RF, GBRT, PSO. Default: GA (the first value in the usage above). See section Details below for more information.
GA: List: Options specific to the Genetic Algorithm (GA). See pyDarwinOptionsGA(). Ignored if algorithm is not "GA".
MOGA: List: Options specific to the Multi-Objective Genetic Algorithm (MOGA or MOGA3). See pyDarwinOptionsMOGA(). Ignored if algorithm is not "MOGA" or "MOGA3".
PSO: List: Options specific to the Particle Swarm Optimization (PSO). See pyDarwinOptionsPSO(). Ignored if algorithm is not "PSO".
random_seed: Positive integer: Seed for random number generation. Default: 11.
num_parallel: Positive integer: Number of models to execute in parallel, i.e., how many threads to create to handle model runs. Default: 4.
num_generations: Positive integer: Number of iterations or generations of the search algorithm to run. Not used/required for EX. Default: 6.
population_size: Positive integer: Number of models to create in every generation. Not used/required for EX. Default: 4.
num_opt_chains: Positive integer: Number of parallel processes to perform the "ask" step (to increase performance). Required only for GP, RF, and GBRT. Default: 4.
exhaustive_batch_size: Positive integer: Batch size for the EX (Exhaustive Search) algorithm. Default: 100.
crash_value: Positive real: Value of fitness or reward assigned when model output is not generated. Should be set larger than any anticipated completed model fitness. Default: 99999999.
penalty: List: Options specific to the penalty calculation. See pyDarwinOptionsPenalty().
effect_limit: Integer: Limits number of effects. Applicable only for NONMEM and GA/MOGA/MOGA3. If < 1, effect limit is turned off. Default: -1.
downhill_period: Integer: How often to run the downhill step. If < 1, no periodic downhill search will be performed. Default: 2.
num_niches: Integer: Used for GA and downhill. A penalty is assigned for each model based on the number of similar models within a niche radius. This penalty is applied only to the selection process (not to the fitness of the model). The purpose is to ensure maintaining a degree of diversity in the population. num_niches is also used to select the number of models that are entered into the downhill step for all algorithms, except EX. Default: 2.
niche_radius: Positive real: The radius of the niches. Used to define how similar pairs of models are, for Local search and GA sharing penalty. Default: 2.
local_2_bit_search: Logical: Whether to perform the two-bit local search. Substantially increases search robustness. Done starting from num_niches models. Ignored for MOGA and MOGA3. Default: TRUE.
final_downhill_search: Logical: Whether to perform a local search (1-bit and 2-bit) at the end of the global search. Default: TRUE.
local_grid_search: Logical: Whether to perform a local grid search during downhill. Default: FALSE.
max_local_grid_search_bits: Positive integer: Maximum number of bits to explore in the local grid search. Default: 5.
search_omega_blocks: Logical: Whether to perform search for block omegas. Used only when engine_adapter == 'nlme'. Default: FALSE.
search_omega_bands: Logical: Whether to perform search for band omegas. Used only when engine_adapter == 'nonmem'. Default: FALSE.
individual_omega_search: Logical: If TRUE, every omega search block is handled individually. If FALSE, all search blocks have the same pattern. Default: TRUE.
search_omega_sub_matrix: Logical: Whether to search omega sub-matrices. Default: FALSE.
max_omega_sub_matrix: Integer: Maximum size of the omega sub-matrix to use in the search. Default: 4.
model_run_timeout: Positive real: Time (seconds) after which the execution will be terminated, and the crash value assigned. Default: 1200.
model_run_priority_class: Character string (Windows only): Priority class for child processes. Options are below_normal (default) and normal.
postprocess: List: Options specific to postprocessing. See pyDarwinOptionsPostprocess(). For algorithm = "MOGA3", postprocessing is required to define objectives and constraints. For algorithm = "MOGA" (NSGA-II), pyDarwin does not use postprocessing for objective calculation.
keep_key_models: Logical: Whether to save the best model from every generation to key_models_dir. Default: TRUE.
keep_best_models: Logical: If TRUE, saves only "key" models that represent an improvement in fitness value compared to the previous overall best model. Models are saved to key_models_dir. Not applicable to Exhaustive Search (EX). Default: TRUE.
rerun_key_models: Logical: Whether to re-run key models that lack output after the search. Default: FALSE.
rerun_front_models: Logical: Similar to rerun_key_models, but for non-dominated models (typically from MOGA/MOGA3). Models are copied to non_dominated_models_dir. Default: TRUE.
use_saved_models: Logical: Whether to restore saved Model Cache from file. Default: FALSE.
saved_models_file: Character string: The file from which to restore Model Cache. Default: "{working_dir}/models0.json".
saved_models_readonly: Logical: Do not overwrite the saved_models_file content. Default: FALSE.
remove_run_dir: Logical: If TRUE, delete the entire model run directory, otherwise only unnecessary files. Default: FALSE.
remove_temp_dir: Logical: Whether to delete the entire temp_dir after the search. Default: FALSE.
keep_files: Character vector (optional): List of exact file names to keep when cleaning up run directories. Default is c("dmp.txt", "posthoc.csv") when engine_adapter is "nlme".
keep_extensions: Character vector (optional): List of file extensions (without dot) to keep. Default: NULL.
use_system_options: Logical: Whether to override options with environment-specific values. Default: TRUE.
model_cache: Character string: ModelCache subclass to be used. Default: "darwin.MemoryModelCache".
model_run_man: Character string: ModelRunManager subclass to be used. Options: "darwin.LocalRunManager" (default), "darwin.GridRunManager".
engine_adapter: Character string: ModelEngineAdapter subclass. Options: "nlme" (default), "nonmem".
skip_running: Logical: If TRUE, no actual NONMEM/NLME runs will be performed. Default: FALSE.
working_dir: Character string (optional): Project's working directory.
data_dir: Character string (optional): Directory for datasets.
output_dir: Character string: Directory for pyDarwin output. Default: "{working_dir}/output".
temp_dir: Character string (optional): Parent directory for model run subdirectories.
key_models_dir: Character string: Directory where key/best models will be saved. Default: "{working_dir}/key_models".
non_dominated_models_dir: Character string: Directory where non-dominated models will be saved (typically for MOGA/MOGA3). Default: "{working_dir}/non_dominated_models".
nlme_dir: Character string (optional): Directory for NLME Engine installation.
gcc_dir: Character string (optional): Directory for Mingw-w64 compiler.
nmfe_path: Character string (optional): Path to NONMEM execution command.
rscript_path: Character string (optional): Path to Rscript executable.
generic_grid_adapter: List: Options for grid execution. See pyDarwinOptionsGridAdapter(). Used if model_run_man == "darwin.GridRunManager".
remote_run: Logical: Indicates if pyDarwin execution is for a remote host. Default: FALSE.
...: Additional parameters.

Details

The algorithm parameter specifies the search algorithm. "MOGA" and "MOGA3" perform multi-objective optimization: "MOGA" uses NSGA-II (see the documentation at https://pymoo.org/algorithms/moo/nsga2.html?highlight=nsga%20ii), and "MOGA3" uses NSGA-III (see the documentation at https://pymoo.org/algorithms/moo/nsga3.html?highlight=nsga%20ii). For MOGA3, the objectives and constraints must be defined and returned by postprocessing scripts (post_run_r_code or post_run_python_code) in a specific format:

R scripts should return a list of two vectors: the first vector holds the objectives and the second holds the constraints. If there are no constraints, the second vector should be empty.
Python scripts should return a tuple of two lists: the first list holds the objectives and the second holds the constraints. If there are no constraints, the second list should be empty.
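
The R return format described above can be sketched in base R as follows. This shows only the shape of the returned value. The helper name make_moga3_result and all numeric values are hypothetical, and how pyDarwin invokes the script and passes model output to it is not covered here.

```r
# Hypothetical helper (not part of pyDarwin or this package): assembles the
# two-element list described above, objectives first, constraints second.
make_moga3_result <- function(objectives, constraints = numeric(0)) {
  stopifnot(is.numeric(objectives), length(objectives) > 0)
  stopifnot(is.numeric(constraints))
  list(objectives, constraints)
}

# Three objectives and one constraint; the values are made up for illustration.
with_constraint <- make_moga3_result(
  objectives = c(1234.5, 7, 42),
  constraints = c(0.1)
)

# No constraints: the second vector is left empty.
without_constraint <- make_moga3_result(objectives = c(1234.5, 7, 42))
```

The lengths of the two vectors should presumably agree with the objectives and constraints values passed to pyDarwinOptionsMOGA(), as in the MOGA3 example in the Examples section.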

Other algorithms include "EX" (Exhaustive Search), "GA" (Genetic Algorithm), "GP" (Gaussian Process), "RF" (Random Forest), "GBRT" (Gradient Boosted Random Trees), and "PSO" (Particle Swarm Optimization).

Please see pyDarwin documentation for complete details on all options.

Examples


# Basic options with GA
ga_opts <- create_pyDarwinOptions(author = "Jane Doe", algorithm = "GA")

# Options for MOGA (NSGA-II)
# pyDarwin internally uses 2 objectives; postprocessing for objectives is not used by pyDarwin.
moga_opts_nsga2 <- create_pyDarwinOptions(
  author = "J. Doe",
  project_name = "MOGA_Test_NSGA2",
  algorithm = "MOGA", # NSGA-II
  MOGA = pyDarwinOptionsMOGA(), # Default MOGA options are suitable
  population_size = 50,
  num_generations = 100,
  engine_adapter = "nonmem",
  nmfe_path = "/opt/NONMEM/nm75/run/nmfe75"
)

# Options for MOGA3 (NSGA-III with 3 objectives, 1 constraint via R postprocessing)
moga_opts_nsga3_custom <- pyDarwinOptionsMOGA(
  objectives = 3,
  names = c("AIC", "NumEffects", "RunTime"), # Example custom names
  constraints = 1,
  partitions = 10 # Custom partitions
)
main_opts_nsga3 <- create_pyDarwinOptions(
  author = "J. Doe",
  project_name = "MOGA_Test_NSGA3",
  algorithm = "MOGA3", # NSGA-III
  MOGA = moga_opts_nsga3_custom,
  population_size = 60, # NSGA-III population size might need adjustment
  num_generations = 100,
  postprocess = pyDarwinOptionsPostprocess( # Required for MOGA3
    use_r = TRUE,
    post_run_r_code = "{project_dir}/moga3_postprocess.R"
  ),
  engine_adapter = "nonmem",
  nmfe_path = "/opt/NONMEM/nm75/run/nmfe75"
)
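
An exhaustive search can be configured the same way. The sketch below uses only arguments documented above. It assumes that the package providing create_pyDarwinOptions() is Certara.RDarwin, a name this page does not state, and the project name and numeric values are illustrative.

```r
# Assumption: create_pyDarwinOptions() is provided by Certara.RDarwin.
library(Certara.RDarwin)

# Options for EX (Exhaustive Search).
# num_generations and population_size are not used for EX, so they are left
# at their defaults.
ex_opts <- create_pyDarwinOptions(
  author = "J. Doe",
  project_name = "EX_Test",     # illustrative name
  algorithm = "EX",
  exhaustive_batch_size = 50,   # models per batch (default: 100)
  num_parallel = 8              # models executed in parallel (default: 4)
)
```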
