estimate_process_parameters: Estimate point process parameters using log-likelihood maximization

Description

Estimate spatio-temporal point process parameters by maximizing the (approximate) full log-likelihood using nloptr. For the self-correcting process, the arrival times must be on \((0,1)\) and can either be supplied directly in data as time, or constructed from size via the gentle-decay (power-law) mapping power_law_mapping using delta (single fit) or delta_values (delta search).

Usage

estimate_process_parameters(
  data,
  process = c("self_correcting"),
  x_grid = NULL,
  y_grid = NULL,
  t_grid = NULL,
  upper_bounds = NULL,
  parameter_inits = NULL,
  delta = NULL,
  delta_values = NULL,
  parallel = FALSE,
  num_cores = max(1L, parallel::detectCores() - 1L),
  set_future_plan = FALSE,
  strategy = c("local", "global_local", "multires_global_local"),
  grid_levels = NULL,
  refine_best_delta = TRUE,
  global_algorithm = "NLOPT_GN_CRS2_LM",
  local_algorithm = "NLOPT_LN_BOBYQA",
  global_options = list(maxeval = 150),
  local_options = list(maxeval = 300, xtol_rel = 1e-05, maxtime = NULL),
  n_starts = 1L,
  jitter_sd = 0.35,
  seed = 1L,
  finite_bounds = NULL,
  verbose = TRUE
)

Value

An object of class "ldmppr_fit" containing the best nloptr fit and (optionally) all fits from a delta search.

Arguments

data: A data.frame or matrix. Must contain either columns (time, x, y) or (x, y, size). If a matrix is provided for delta search, it must have column names c("x","y","size").
process: Character string specifying the process model. Currently supports "self_correcting".
x_grid, y_grid, t_grid: Numeric vectors defining the integration grid for \((x,y,t)\).
upper_bounds: Numeric vector of length 3 giving c(b_t, b_x, b_y).
parameter_inits: Numeric vector of length 8 giving initialization values for the model parameters.
delta: Optional numeric scalar used only when data contains (x,y,size) but not time.
delta_values: Optional numeric vector. If supplied, the function fits the model for each value of delta_values (mapping size -> time via power_law_mapping) and returns the best fit (lowest objective).
parallel: logical. If TRUE, uses furrr/future to parallelize either (a) over `delta_values` (when provided) or (b) over multi-start initializations (when `delta_values` is NULL and `n_starts > 1`).
num_cores: Integer number of workers to use when set_future_plan = TRUE.
set_future_plan: Logical. If TRUE, temporarily sets future::plan(multisession, workers = num_cores) and restores the original plan on exit.
strategy: Character string specifying the estimation strategy: - "local": single-level local optimization from parameter_inits. - "global_local": single-level global optimization (from parameter_inits) followed by local polish. - "multires_global_local": multi-resolution fitting over grid_levels (coarsest level uses global + local; finer levels use local polish only).
grid_levels: Optional list defining the multi-resolution grid schedule when strategy = "multires_global_local". Each entry can be a numeric vector c(nx, ny, nt) or a list with named entries list(nx=..., ny=..., nt=...). If NULL, uses the supplied (x_grid, y_grid, t_grid) as a single level.
refine_best_delta: Logical. If TRUE and delta_values is supplied, performs a final refinement fit at the best delta found using the full multi-resolution strategy.
global_algorithm, local_algorithm: Character strings specifying the NLopt algorithms to use for the global and local optimization stages, respectively.
global_options, local_options: Named lists of options to pass to nloptr::nloptr() for the global and local optimization stages, respectively.
n_starts: Integer number of multi-start initializations to use for the local optimization stage.
jitter_sd: Numeric standard deviation used to jitter the multi-start initializations.
seed: Integer random seed used for multi-start initialization jittering.
finite_bounds: Optional list with components lb and ub giving finite lower and upper bounds for all 8 parameters. Used only when the selected optimization algorithms require finite bounds.
verbose: Logical. If TRUE, prints progress messages during fitting.

Details

For the self-correcting process, the log-likelihood integral is approximated using the supplied grid (x_grid, y_grid, t_grid) over the bounded domain upper_bounds. When delta_values is supplied, this function performs a grid search over delta values, fitting the model separately for each mapped dataset and selecting the best objective value.

References

Møller, J., Ghorbani, M., & Rubak, E. (2016). Mechanistic spatio-temporal point process models for marked point processes, with a view to forest stand data. Biometrics, 72(3), 687–696. tools:::Rd_expr_doi("10.1111/biom.12466").

Examples

Run this code

data(small_example_data)

# Build time using a single delta (so data has time,x,y)
small_txy <- small_example_data %>%
  dplyr::mutate(time = power_law_mapping(size, 0.5)) %>%
  dplyr::select(time, x, y)

x_grid <- seq(0, 25, length.out = 5)
y_grid <- seq(0, 25, length.out = 5)
t_grid <- seq(0, 1,  length.out = 5)

parameter_inits <- c(1.5, 8.5, .015, 1.5, 3.2, .75, 3, .08)
upper_bounds <- c(1, 25, 25)

fit <- estimate_process_parameters(
  data = small_txy,
  process = "self_correcting",
  x_grid = x_grid,
  y_grid = y_grid,
  t_grid = t_grid,
  upper_bounds = upper_bounds,
  parameter_inits = parameter_inits,
  strategy = "global_local",
  global_algorithm = "NLOPT_GN_CRS2_LM",
  local_algorithm = "NLOPT_LN_BOBYQA",
  global_options = list(maxeval = 150),
  local_options = list(maxeval = 25, xtol_rel = 1e-2),
  verbose = TRUE
)

coef(fit)
logLik(fit)

# \donttest{
# Delta-search example (data has x,y,size; time is derived internally for each delta)
fit_delta <- estimate_process_parameters(
  data = small_example_data, # x,y,size
  process = "self_correcting",
  x_grid = x_grid,
  y_grid = y_grid,
  t_grid = t_grid,
  upper_bounds = upper_bounds,
  parameter_inits = parameter_inits,
  delta_values = c(0.35, 0.5, 0.65, 0.9, 1.0),
  parallel = TRUE,
  set_future_plan = TRUE,
  num_cores = 2,
  strategy = "multires_global_local",
  global_options = list(maxeval = 100),
  local_options  = list(maxeval = 100, xtol_rel = 1e-3),
  n_starts = 3,
  refine_best_delta = TRUE,
  verbose = TRUE
)
plot(fit_delta)
# }

Run the code above in your browser using DataLab