
NeuralEstimators (version 0.1.2)

train: Train a neural estimator

Description

The function caters for different variants of "on-the-fly" simulation. Specifically, a sampler can be provided to continuously sample new parameter vectors from the prior, and a simulator can be provided to continuously simulate new data conditional on the parameters. Alternatively, if specific sets of parameters (theta_train and theta_val) and/or data (Z_train and Z_val) are provided, these are held fixed during training.

Note that using R functions to perform "on-the-fly" simulation requires the user to have installed the Julia package RCall.
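For concreteness, the two variants correspond to the following call patterns (a schematic sketch; the objects estimator, sampler, simulator, theta_train, theta_val, Z_train, and Z_val are placeholders, constructed as in the Examples below):

# Variant 1: fixed training and validation sets
estimator <- train(estimator,
                   theta_train = theta_train, theta_val = theta_val,
                   Z_train = Z_train, Z_val = Z_val)

# Variant 2: continuous ("on-the-fly") simulation from the prior
estimator <- train(estimator, sampler = sampler, simulator = simulator, m = 30)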

Usage

train(
  estimator,
  sampler = NULL,
  simulator = NULL,
  theta_train = NULL,
  theta_val = NULL,
  Z_train = NULL,
  Z_val = NULL,
  m = NULL,
  M = NULL,
  K = 10000,
  xi = NULL,
  loss = "absolute-error",
  learning_rate = 1e-04,
  epochs = 100,
  batchsize = 32,
  savepath = "",
  stopping_epochs = 5,
  epochs_per_Z_refresh = 1,
  epochs_per_theta_refresh = 1,
  simulate_just_in_time = FALSE,
  use_gpu = TRUE,
  verbose = TRUE
)

Value

A trained neural estimator or, if m is a vector, a list of trained neural estimators, one for each sample size

Arguments

estimator

a neural estimator

sampler

a function that takes an integer K, samples K parameter vectors from the prior, and returns them as a p × K matrix

simulator

a function that takes a p × K matrix of parameters and an integer m, and returns K simulated data sets, each containing m independent replicates
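To illustrate the required signatures, here is a minimal sketch for a model with p = 2 parameters and Gaussian data, mirroring the Examples below (the priors and the data model are illustrative):

sampler <- function(K) {
  rbind(mu = rnorm(K), sigma = rgamma(K, 1))  # p x K matrix, one column per draw
}

simulator <- function(theta, m) {
  # returns a list of K data sets, each a 1 x m matrix of m independent replicates
  apply(theta, 2, function(th) t(rnorm(m, th[1], th[2])), simplify = FALSE)
}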

theta_train

a set of parameters used for updating the estimator using stochastic gradient descent

theta_val

a set of parameters used for monitoring the performance of the estimator during training

Z_train

a simulated data set used for updating the estimator using stochastic gradient descent

Z_val

a simulated data set used for monitoring the performance of the estimator during training

m

vector of sample sizes. If NULL (default), a single neural estimator is trained, with the sample size inferred from Z_val. If m is a vector of integers, a sequence of neural estimators is constructed, one for each sample size; see the Julia documentation for trainx() for further details
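For example, to train a sequence of estimators over several sample sizes (a sketch assuming the sampler and simulator from the Examples below):

estimators <- train(estimator, sampler = sampler, simulator = simulator,
                    m = c(1, 10, 30))  # a list of estimators, one per sample size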

M

deprecated; use m

K

the number of parameter vectors sampled in the training set at each epoch; the size of the validation set is set to K/5.

xi

a list of objects used for data simulation (e.g., distance matrices); if it is provided, the parameter sampler is called as sampler(K, xi).
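For instance, if data simulation depends on a fixed distance matrix, it could be passed via xi; this is a hypothetical sketch (the element name D and the body of the sampler are illustrative):

xi <- list(D = as.matrix(dist(expand.grid(1:16, 1:16))))  # e.g., a distance matrix
sampler <- function(K, xi) {
  # xi is available here, e.g., to constrain the prior draws using xi$D
  rbind(mu = rnorm(K), sigma = rgamma(K, 1))
}
estimator <- train(estimator, sampler = sampler, simulator = simulator,
                   m = 30, xi = xi)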

loss

the loss function: a string ('absolute-error' for mean-absolute-error loss or 'squared-error' for mean-squared-error loss), or a string of Julia code defining the loss function. For some classes of estimators (e.g., QuantileEstimator and RatioEstimator), the loss function does not need to be specified.
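For example, to train under the mean-squared-error loss (the call below assumes the on-the-fly setup from the Examples):

estimator <- train(estimator, sampler = sampler, simulator = simulator,
                   m = 30, loss = "squared-error")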

learning_rate

the learning rate for the optimiser Adam (default 1e-4)

epochs

the number of epochs to train the neural network. An epoch is one complete pass through the entire training data set when doing stochastic gradient descent.

batchsize

the batchsize to use when performing stochastic gradient descent, that is, the number of training samples processed between each update of the neural-network parameters.

savepath

path to save the trained estimator and other information; if an empty string (default), nothing is saved. Otherwise, the neural-network parameters (i.e., the weights and biases) will be saved during training as .bson files; the risk function evaluated over the training and validation sets will also be saved, in the first and second columns of loss_per_epoch.csv, respectively; and the best parameters (as measured by validation risk) will be saved as best_network.bson.
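For example (the directory name is illustrative):

estimator <- train(estimator, sampler = sampler, simulator = simulator,
                   m = 30, savepath = "runs/Gaussian")
# writes .bson weight files, loss_per_epoch.csv, and best_network.bson
# under the directory runs/Gaussian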

stopping_epochs

cease training if the validation risk does not improve within this number of epochs (default 5).

epochs_per_Z_refresh

integer indicating how often to refresh the training data

epochs_per_theta_refresh

integer indicating how often to refresh the training parameters; must be a multiple of epochs_per_Z_refresh
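For example, to refresh the training data every 5 epochs and the training parameters every 10 (a sketch assuming the on-the-fly setup from the Examples):

estimator <- train(estimator, sampler = sampler, simulator = simulator, m = 30,
                   epochs_per_Z_refresh = 5, epochs_per_theta_refresh = 10)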

simulate_just_in_time

flag indicating whether we should simulate "just-in-time", in the sense that only a batchsize number of parameter vectors and corresponding data are in memory at a given time

use_gpu

a boolean indicating whether to use the GPU if one is available

verbose

a boolean indicating whether information, including empirical risk values and timings, should be printed to the console during training.

See Also

assess() for assessing an estimator post training, and estimate() for applying an estimator to observed data

Examples

if (FALSE) {
# Construct a neural Bayes estimator for replicated univariate Gaussian 
# data with unknown mean and standard deviation. 

# Load R and Julia packages
library("NeuralEstimators")
library("JuliaConnectoR")
juliaEval("using NeuralEstimators, Flux, Distributions")

# Define the neural-network architecture
estimator <- juliaEval('
 d = 1    # dimension of each replicate
 p = 2    # number of parameters in the model
 w = 32   # width of each layer
 psi = Chain(Dense(d, w, relu), Dense(w, w, relu))
 phi = Chain(Dense(w, w, relu), Dense(w, p))
 deepset = DeepSet(psi, phi)
 estimator = PointEstimator(deepset)
')

# Sampler from the prior
sampler <- function(K) {
  mu    <- rnorm(K)      # Gaussian prior for the mean
  sigma <- rgamma(K, 1)  # Gamma prior for the standard deviation
  theta <- matrix(c(mu, sigma), byrow = TRUE, ncol = K)  # p x K matrix (row 1 = mu, row 2 = sigma)
  return(theta)
}

# Data simulator
simulator <- function(theta_set, m) {
  # returns a list of data sets, one per parameter vector (column of theta_set);
  # each data set is a 1 x m matrix of m independent replicates
  apply(theta_set, 2, function(theta) {
    t(rnorm(m, theta[1], theta[2]))
  }, simplify = FALSE)
}

# Train using fixed parameter and data sets 
theta_train <- sampler(10000)
theta_val   <- sampler(2000)
m <- 30 # number of iid replicates
Z_train <- simulator(theta_train, m)
Z_val   <- simulator(theta_val, m)
estimator <- train(estimator, 
                   theta_train = theta_train, 
                   theta_val = theta_val, 
                   Z_train = Z_train, 
                   Z_val = Z_val)
                   
# Train using simulation on-the-fly (requires Julia package RCall)
estimator <- train(estimator, sampler = sampler, simulator = simulator, m = m)

#### Simulation on-the-fly using Julia functions ####

# Defining the sampler and simulator in Julia can improve computational 
# efficiency by avoiding the overhead of communicating between R and Julia. 
# Julia is also fast (comparable to C) and so it can be useful to define 
# these functions in Julia when they involve for-loops. 

# Parameter sampler
sampler <- juliaEval("
  function sampler(K)
    mu    = rand(Normal(0, 1), K)
    sigma = rand(Gamma(1), K)
    theta = hcat(mu, sigma)'
    return theta
  end")

# Data simulator
simulator <- juliaEval("
  function simulator(theta_matrix, m)
    Z = [rand(Normal(theta[1], theta[2]), 1, m) for theta in eachcol(theta_matrix)]
    return Z
  end")

# Train the estimator
estimator <- train(estimator, sampler = sampler, simulator = simulator, m = m)
}
