
vecmatch (version 1.2.0)

make_opt_args: Define the Optimization Parameter Space for Matching

Description

make_opt_args() creates an object of class "opt_args" that defines the parameter search space for optimize_gps().

The function accepts a vector of candidate values for each customizable argument involved in GPS estimation and matching. It computes the Cartesian product of these vectors, so that every combination of the supplied values is represented. The resulting set of combinations is the search space sampled by the random search algorithm in optimize_gps().
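The idea of a Cartesian-product search space can be sketched in base R with expand.grid(). This is only an illustration under assumed values: the list `space` and the sample size are made up, and the actual internals of make_opt_args() may differ (for example, by dropping invalid combinations).

```r
# Hypothetical, reduced search space: each element lists candidate values
# for one tunable argument (names mirror make_opt_args(), values are made up).
space <- list(
  gps_method      = c("m1", "m2", "m9"),
  matching_method = c("nnm", "fullopt"),
  caliper         = c(0.1, 0.2),
  order           = c("desc", "random")
)

# Cartesian product: one row per combination of the supplied values
grid <- expand.grid(space, stringsAsFactors = FALSE)
nrow(grid)  # 3 * 2 * 2 * 2 = 24

# A random search evaluates a sample of rows rather than the whole grid
set.seed(1)
sampled <- grid[sample(nrow(grid), size = 5), ]
```

Because the grid grows multiplicatively with every added value, sampling rows keeps the cost of the search bounded even when the full space is large.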

To ensure valid optimization, the data and formula arguments must exactly match those passed to optimize_gps().

Usage

make_opt_args(
  data = NULL,
  formula,
  reference = NULL,
  gps_method = paste0("m", 1:10),
  matching_method = c("fullopt", "nnm"),
  caliper = seq(0.01, 10, 0.01),
  order = c("desc", "asc", "original", "random"),
  cluster = 2,
  replace = c(TRUE, FALSE),
  ties = c(TRUE, FALSE),
  ratio = 1,
  min_controls = 1,
  max_controls = 1
)

Value

An object of class "opt_args", containing all valid parameter combinations to be sampled by optimize_gps(). Use print() to explore the defined search space.

Arguments

data

A data.frame containing all variables referenced in formula. Must match the dataset used in optimize_gps().

formula

A valid formula specifying the treatment variable (left-hand side) and covariates (right-hand side). Interaction terms can be included using *. Must match the formula used in optimize_gps().

reference

A single string or vector of treatment group levels to be used as the reference (baseline) group in both GPS estimation and matching.

gps_method

A string or vector of strings specifying GPS estimation methods. Allowed values are "m1" to "m10". See Details below.

matching_method

A string or vector of strings specifying the matching method(s) to evaluate. Currently supported options are "nnm" and "fullopt". See match_gps().

caliper

A numeric value or vector of values specifying caliper widths (i.e., maximum allowed GPS distance for matching). Same as in match_gps(), but allows multiple values.
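As a rough illustration of how the caliper vector affects the size of the search space, the default grid can be compared with a coarser one in base R. The coarse values below are arbitrary and chosen only for this sketch.

```r
# Default grid from the Usage section: widths from 0.01 to 10 in steps of 0.01
default_caliper <- seq(0.01, 10, 0.01)
length(default_caliper)  # number of candidate widths in the default grid

# A coarser, hand-picked grid (arbitrary values) narrows the search space
coarse_caliper <- c(0.1, 0.25, 0.5, 1)
length(coarse_caliper)   # 4
```

Passing a shorter vector such as `coarse_caliper` to the caliper argument reduces the number of combinations proportionally.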

order

A string or vector of strings indicating the sorting order of logit-transformed GPS values before matching. Options are:

  • "desc": sort from highest to lowest (default),

  • "asc": sort from lowest to highest,

  • "original": keep original order,

  • "random": randomize order (use set.seed() for reproducibility).
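The four ordering options can be sketched in base R as follows. The helper `order_units()` and the toy values are hypothetical and not part of vecmatch; they only illustrate what each option means for a vector of logit-transformed GPS values.

```r
# Toy logit-transformed GPS values (made up for illustration)
logit_gps <- c(a = 0.4, b = -1.2, c = 2.1, d = 0.0)

# Hypothetical helper, not part of vecmatch: returns the index order
# implied by each option.
order_units <- function(x, how = c("desc", "asc", "original", "random")) {
  how <- match.arg(how)
  switch(how,
    desc     = order(x, decreasing = TRUE),  # highest to lowest
    asc      = order(x),                     # lowest to highest
    original = seq_along(x),                 # keep input order
    random   = sample(seq_along(x))          # random permutation
  )
}

names(logit_gps)[order_units(logit_gps, "desc")]  # "c" "a" "d" "b"
names(logit_gps)[order_units(logit_gps, "asc")]   # "b" "d" "a" "c"

set.seed(42)  # fixes the permutation drawn by "random"
order_units(logit_gps, "random")
```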

cluster

An integer or vector of integers specifying the number of clusters for k-means clustering (if applicable).

replace

Logical value or vector of logicals indicating whether to allow matching with replacement. Same meaning as in match_gps(), but supports multiple settings.

ties

Logical value or vector of logicals defining how ties should be handled during nearest-neighbor matching.

ratio

A numeric value or vector specifying the ratio of control to treated units for matching (used in "nnm").

min_controls

A scalar or vector specifying the minimum number of controls to be matched to each treated unit (used in "fullopt").

max_controls

A scalar or vector specifying the maximum number of controls to be matched to each treated unit (used in "fullopt").

Details

The returned object is of class "opt_args" and is intended to be passed directly to optimize_gps(). Internally, the function validates each supplied argument and then calculates the full Cartesian product of the supplied values.

The gps_method argument must contain one or more of the following codes:

| gps_method |      Method      |       Link Function         |
|------------|------------------|-----------------------------|
|    "m1"    |    multinom      |   generalized_logit         |
|    "m2"    |     polr         |   logistic                  |
|    "m3"    |     polr         |   probit                    |
|    "m4"    |     polr         |   loglog                    |
|    "m5"    |     polr         |   cloglog                   |
|    "m6"    |     polr         |   cauchit                   |
|    "m7"    |     vglm         |   multinomial_logit         |
|    "m8"    |     vglm         |   reduced_rank_ml           |
|    "m9"    |    brglm2        |   baseline_category_logit   |
|   "m10"    |    mblogit       |   baseline_category_logit   |
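A minimal sketch of how the codes in the table above could be checked before building a search space. The helper `check_gps_method()` is hypothetical; vecmatch performs its own validation, whose exact behavior and error messages are not shown here.

```r
# Codes accepted by gps_method, as listed in the table above
allowed_gps <- paste0("m", 1:10)

# Hypothetical helper, not part of vecmatch
check_gps_method <- function(gps_method) {
  bad <- setdiff(gps_method, allowed_gps)
  if (length(bad) > 0) {
    stop("Unknown gps_method code(s): ", paste(bad, collapse = ", "))
  }
  invisible(gps_method)
}

check_gps_method(c("m1", "m2", "m9"))  # passes silently
try(check_gps_method(c("m1", "m11")))  # signals an error naming "m11"
```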

The "opt_args" class has a custom S3 print() method that displays:

  • A summary table of all allowed values for each optimization parameter,

  • The total number of unique parameter combinations (i.e., the size of the search space).
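The behavior described above can be imitated with a toy S3 class. The class name "toy_opt_args", its constructor, and the output layout are assumptions made for this sketch; the real "opt_args" object may be structured and printed differently.

```r
# Hypothetical toy class; not the real "opt_args" implementation
new_toy_opt_args <- function(...) {
  structure(list(values = list(...)), class = "toy_opt_args")
}

print.toy_opt_args <- function(x, ...) {
  # One row per parameter, listing its allowed values
  summary_tbl <- data.frame(
    parameter = names(x$values),
    values    = vapply(x$values, paste, character(1), collapse = ", "),
    row.names = NULL
  )
  print(summary_tbl)
  # Size of the search space under a full crossing of all values
  n_comb <- prod(lengths(x$values))
  cat("Total combinations:", n_comb, "\n")
  invisible(x)
}

toy <- new_toy_opt_args(
  gps_method = c("m1", "m2"),
  caliper    = c(0.1, 0.2, 0.5)
)
print(toy)  # summary table followed by the combination count (2 * 3 = 6)
```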

See Also

optimize_gps(), match_gps(), estimate_gps()

Examples

library(vecmatch)

# Define the model formula (cancer is assumed to be the example dataset
# available once vecmatch is loaded)
formula_cancer <- formula(status ~ age * sex)

# Create search space with multiple values for GPS and matching
opt_args <- make_opt_args(
  data = cancer,
  formula = formula_cancer,
  gps_method = c("m1", "m2", "m9"),
  matching_method = c("nnm", "fullopt"),
  caliper = c(0.1, 0.2),
  order = c("desc", "random"),
  reference = "control"
)

# Print summary of the search space
print(opt_args)
