mlr3resampling (version 2025.11.19)

proj_submit: Compute several resampling jobs

Description

proj_todo determines which jobs remain to be computed. proj_compute_all computes all remaining jobs sequentially, whereas proj_compute_mpi computes them in parallel using MPI; it should be run in an R session launched by mpirun or srun. proj_submit makes a non-blocking call to SLURM sbatch, requesting a single job with several tasks that run proj_compute_mpi.
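As a minimal sketch of the MPI case described above, the script below calls proj_compute_mpi on a project directory given on the command line. The file name compute_mpi.R, the launch command in the comment, and the choice of three processes are assumptions for illustration, not part of the package documentation.

```r
# Hypothetical script, saved for example as compute_mpi.R (name is an assumption).
# Assumed launch command, from a shell on a machine with MPI available:
#   mpirun -np 3 Rscript compute_mpi.R /path/to/proj_dir
# Read the project directory from the command line arguments.
args <- commandArgs(trailingOnly = TRUE)
if (length(args) >= 1 && dir.exists(args[[1]])) {
  # proj_dir must have been created beforehand via proj_grid.
  results <- mlr3resampling::proj_compute_mpi(args[[1]], verbose = TRUE)
} else {
  message("usage: Rscript compute_mpi.R <proj_dir>")
}
```

When run without a valid directory argument, the script only prints the usage message, so it is safe to source outside an MPI session.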

Usage

proj_todo(proj_dir)
proj_compute_mpi(proj_dir, verbose=FALSE)
proj_compute_all(proj_dir, verbose=FALSE)
proj_submit(
  proj_dir, tasks = 2, hours = 1, gigabytes = 1,
  verbose = FALSE)

Value

proj_submit returns the ID of the submitted SLURM job.

proj_compute_all and proj_compute_mpi return a data table of the computed results.

proj_todo returns a vector of job IDs not yet computed.

Arguments

proj_dir

Project directory created via proj_grid.

tasks

Positive integer: ntasks parameter for the SLURM scheduler. One task is the manager; the others are workers.

hours

Hours of walltime to request from the SLURM scheduler.

gigabytes

Gigabytes of memory to request from the SLURM scheduler.

verbose

Logical: print messages?

Author

Toby Dylan Hocking

Details

This is step 2 of the typical three-step pipeline (initialize grid, submit, read results).
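One way to carry out this step from a single script is sketched below: check proj_todo first, then either submit to SLURM or compute sequentially. The helper name compute_remaining and the use_slurm argument are hypothetical, and detecting SLURM by looking for sbatch on the PATH is an assumption, not something the package does.

```r
# Hypothetical helper (not part of mlr3resampling): run step 2 of the pipeline.
compute_remaining <- function(proj_dir, use_slurm = nzchar(Sys.which("sbatch"))) {
  # Job IDs not yet computed, as documented for proj_todo.
  todo <- mlr3resampling::proj_todo(proj_dir)
  if (length(todo) == 0) {
    message("No jobs remain to be computed.")
    return(invisible(NULL))
  }
  if (use_slurm) {
    # Non-blocking: returns the SLURM job ID, results arrive later.
    # The resource values are the documented defaults.
    mlr3resampling::proj_submit(
      proj_dir, tasks = 2, hours = 1, gigabytes = 1)
  } else {
    # Blocking: computes all remaining jobs in this R session
    # and returns a data table of results.
    mlr3resampling::proj_compute_all(proj_dir)
  }
}
```

Note that the return value differs by branch: a SLURM job ID when submitting, a data table of results when computing locally. A caller such as compute_remaining(pkg.proj.dir) should check which one it received.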

Examples

N <- 80
library(data.table)
set.seed(1)
reg.dt <- data.table(
  x=runif(N, -2, 2),
  person=factor(rep(c("Alice","Bob"), each=0.5*N)))
reg.pattern.list <- list(
  easy=function(x, person)x^2,
  impossible=function(x, person)(x^2)*(-1)^as.integer(person))
SOAK <- mlr3resampling::ResamplingSameOtherSizesCV$new()
reg.task.list <- list()
for(pattern in names(reg.pattern.list)){
  f <- reg.pattern.list[[pattern]]
  task.dt <- data.table(reg.dt)[
  , y := f(x,person)+rnorm(N, sd=0.5)
  ][]
  task.obj <- mlr3::TaskRegr$new(
    pattern, task.dt, target="y")
  task.obj$col_roles$feature <- "x"
  task.obj$col_roles$stratum <- "person"
  task.obj$col_roles$subset <- "person"
  reg.task.list[[pattern]] <- task.obj
}
reg.learner.list <- list(
  featureless=mlr3::LearnerRegrFeatureless$new())
if(requireNamespace("rpart")){
  reg.learner.list$rpart <- mlr3::LearnerRegrRpart$new()
}

pkg.proj.dir <- tempfile()
mlr3resampling::proj_grid(
  pkg.proj.dir,
  reg.task.list,
  reg.learner.list,
  SOAK,
  order_jobs = function(DT)1:2, # keep only the first two jobs, for CRAN.
  score_args=mlr3::msrs(c("regr.rmse", "regr.mae")))
mlr3resampling::proj_compute_all(pkg.proj.dir)