couplr

Optimal Pairing and Matching via Linear Assignment

The couplr package provides high-level functions for optimal one-to-one matching between two groups. Whether you need to pair treatment and control units, assign workers to tasks, or align images pixel-by-pixel, couplr offers fast, deterministic solutions with automatic preprocessing and balance diagnostics.

Quick Start

library(couplr)

# Match treatment and control groups on covariates
result <- match_couples(
  treated, control,
  vars = c("age", "income", "education"),
  auto_scale = TRUE
)

# Check covariate balance
balance_diagnostics(result, treated, control, vars = c("age", "income", "education"))

# Get analysis-ready dataset
matched_data <- join_matched(result, treated, control)
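
The Quick Start assumes that `treated` and `control` already exist. Here is a minimal sketch of compatible toy inputs in base R. It assumes the functions accept plain data frames with one row per unit and numeric covariate columns. The `id` column, the group sizes, and the simulated values are illustrative assumptions, not documented requirements.

```r
set.seed(42)  # reproducible toy data

n_treated <- 20
n_control <- 50

# Hypothetical treated units
treated <- data.frame(
  id        = seq_len(n_treated),
  age       = round(rnorm(n_treated, mean = 45, sd = 10)),
  income    = round(rlnorm(n_treated, meanlog = 10.8, sdlog = 0.4)),
  education = sample(10:20, n_treated, replace = TRUE)
)

# Hypothetical control pool: larger, drawn from a slightly different population
control <- data.frame(
  id        = seq_len(n_control),
  age       = round(rnorm(n_control, mean = 40, sd = 12)),
  income    = round(rlnorm(n_control, meanlog = 10.6, sdlog = 0.5)),
  education = sample(10:20, n_control, replace = TRUE)
)

# Both data frames share the covariate columns named in `vars`
str(treated)
str(control)
```

With a control pool larger than the treated group, one-to-one matching can leave some controls unmatched.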

Statement of Need

Optimal matching is central to experimental design, causal inference, and resource allocation. Existing R packages (MatchIt, optmatch) focus on propensity score workflows, requiring users to estimate scores before matching. This adds complexity and can obscure the direct relationship between covariates and match quality.

This package addresses direct covariate matching: selecting optimal pairs based on observed variables without intermediate modeling. It provides:

  • 20 LAP algorithms for different problem sizes and structures,
  • automatic preprocessing with variable health checks,
  • balance diagnostics for assessing match quality,
  • analysis-ready joined output.

These features make the package useful in domains like:

  • causal inference (matching treated/control units),
  • experimental design (pairing samples for within-pair comparisons),
  • resource allocation (assigning workers to tasks),
  • image processing (pixel-level morphing and alignment).

Features

High-Level Matching Functions

  • match_couples(): Optimal one-to-one matching

    • Automatic preprocessing with variable health checks
    • Multiple scaling methods: robust (MAD), standardize (SD), range
    • Distance constraints via max_distance and calipers
    • Blocking support for stratified matching
  • greedy_couples(): Fast approximate matching

    • Three strategies: sorted, row_best, pq (priority queue)
    • 10-100x faster than optimal matching on large datasets
    • Same preprocessing and constraint options
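
The three scaling methods listed under match_couples() can be sketched in base R as follows. These are the textbook definitions. Whether couplr uses the same constants (for example R's default MAD constant of 1.4826) or pools the two groups before scaling is an assumption here, and the helper names are hypothetical.

```r
# Three common ways to put a numeric covariate on a comparable scale.
scale_robust      <- function(x) (x - median(x)) / mad(x)         # median and MAD
scale_standardize <- function(x) (x - mean(x)) / sd(x)            # mean and SD
scale_range       <- function(x) (x - min(x)) / (max(x) - min(x)) # min-max to [0, 1]

# Toy income vector with one extreme value
income <- c(21000, 35000, 42000, 58000, 250000)

scaled <- cbind(
  robust      = scale_robust(income),
  standardize = scale_standardize(income),
  range       = scale_range(income)
)
round(scaled, 2)
```

The extreme value inflates the SD and the range, which compresses the other observations under `standardize` and `range`. The median and MAD are much less affected, which is the usual argument for robust scaling on skewed covariates.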

Balance Diagnostics

  • balance_diagnostics(): Comprehensive balance assessment
    • Standardized differences, variance ratios, KS tests
    • Quality thresholds on the absolute standardized difference: <0.1 excellent, 0.1-0.25 good, 0.25-0.5 acceptable
    • Per-block statistics when blocking is used
    • Publication-ready tables via balance_table()
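
To make the statistics above concrete, here is a minimal base R sketch for a single covariate. It uses one common definition of the standardized difference (pooled SD in the denominator); couplr's exact formula may differ. The helper names and the "poor" label for values of 0.5 and above are assumptions added for illustration.

```r
# Standardized mean difference with a pooled SD in the denominator
std_diff <- function(x_t, x_c) {
  (mean(x_t) - mean(x_c)) / sqrt((var(x_t) + var(x_c)) / 2)
}

# Ratio of variances; values near 1 indicate similar spread
variance_ratio <- function(x_t, x_c) var(x_t) / var(x_c)

# Map |std diff| to the quality labels listed above
# ("poor" for >= 0.5 is an assumed label, not taken from the README)
balance_label <- function(d) {
  cut(abs(d),
      breaks = c(0, 0.1, 0.25, 0.5, Inf),
      labels = c("excellent", "good", "acceptable", "poor"),
      right  = FALSE)
}

# Toy matched sample: ages of treated units and their matched controls
age_t <- c(40, 45, 50, 55, 60)
age_c <- c(41, 44, 51, 54, 61)

d <- std_diff(age_t, age_c)
d
balance_label(d)
variance_ratio(age_t, age_c)
ks.test(age_t, age_c)$statistic  # two-sample Kolmogorov-Smirnov statistic
```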

Low-Level LAP Solving

  • lap_solve(): Tidy interface for LAP algorithms

    • 20 solvers: Hungarian, Jonker-Volgenant, Auction, Network Simplex, etc.
    • Automatic method selection via method = "auto"
    • Supports rectangular matrices and forbidden assignments
  • lap_solve_batch(): Batch solving for multiple matrices

  • lap_solve_kbest(): K-best solutions via Murty's algorithm

Installation

# Install from CRAN
install.packages("couplr")

# Or install development version from GitHub
# install.packages("pak")
pak::pak("gcol33/couplr")

Usage Examples

Optimal Matching (match_couples)

library(couplr)

# Basic matching with automatic scaling
result <- match_couples(
  treated, control,
  vars = c("age", "income"),
  auto_scale = TRUE
)

# With distance constraint
result <- match_couples(
  treated, control,
  vars = c("age", "income"),
  auto_scale = TRUE,
  max_distance = 0.5
)

# With blocking (exact matching on site)
result <- match_couples(
  treated, control,
  vars = c("age", "income"),
  block_by = "site",
  auto_scale = TRUE
)

# Check what was matched
result$pairs

Greedy Matching (greedy_couples)

# Fast matching for large datasets
result <- greedy_couples(
  treated, control,
  vars = c("age", "income"),
  strategy = "row_best",
  auto_scale = TRUE
)

# Priority queue strategy (often best quality)
result <- greedy_couples(
  treated, control,
  vars = c("age", "income"),
  strategy = "pq"
)

Low-Level LAP Solving

# Solve a cost matrix
cost <- matrix(c(4, 2, 8, 4, 3, 7, 3, 1, 6), nrow = 3, byrow = TRUE)
result <- lap_solve(cost)
result$assignment
result$total_cost

# Choose a specific algorithm
result <- lap_solve(cost, method = "hungarian")

# K-best solutions
results <- lap_solve_kbest(cost, k = 3)
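
For intuition about what a LAP solver computes, the same 3x3 cost matrix can be solved by brute force in base R: enumerate every one-to-one assignment and keep the cheapest. This is a sketch for tiny problems only, since the number of assignments grows as n!. It does not use couplr, and the helper `perms()` is a hypothetical name.

```r
cost <- matrix(c(4, 2, 8, 4, 3, 7, 3, 1, 6), nrow = 3, byrow = TRUE)

# All permutations of a vector (recursive; only practical for small n)
perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in perms(v[-i])) out[[length(out) + 1]] <- c(v[i], p)
  }
  out
}

# Each permutation p assigns row i to column p[i]
all_assignments <- perms(seq_len(ncol(cost)))
totals <- vapply(
  all_assignments,
  function(p) sum(cost[cbind(seq_along(p), p)]),
  numeric(1)
)

best <- min(totals)                      # minimum total cost: 12
ties <- all_assignments[totals == best]  # three assignments reach it
ties
```

For this matrix three different assignments share the minimum total cost of 12. Ties like this are where k-best enumeration is useful, because a single solver call reports only one of the tied optima.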

Choosing Between match_couples and greedy_couples

Feature                 match_couples()       greedy_couples()
Optimality              Guaranteed optimal    Approximate
Speed                   O(n^3)                O(n^2) or better
Best for                n < 5000              n > 5000
Supports constraints?   Yes                   Yes
Supports blocking?      Yes                   Yes

Tip: Start with match_couples(). Switch to greedy_couples() if runtime is too long.
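
The gap between "guaranteed optimal" and "approximate" can be seen on a 2x2 example. The sketch below implements one plausible reading of a sorted greedy pass in base R. It is not couplr's implementation, and `greedy_sorted()` is a hypothetical helper.

```r
# Greedy "sorted" pass: visit pairs from cheapest to most expensive and
# keep a pair only if both its row and its column are still unused.
greedy_sorted <- function(cost) {
  used_row <- rep(FALSE, nrow(cost))
  used_col <- rep(FALSE, ncol(cost))
  pairs <- NULL
  for (idx in order(cost)) {
    rc <- arrayInd(idx, dim(cost))
    i <- rc[1]
    j <- rc[2]
    if (!used_row[i] && !used_col[j]) {
      pairs <- rbind(pairs, c(row = i, col = j, cost = cost[i, j]))
      used_row[i] <- TRUE
      used_col[j] <- TRUE
    }
  }
  as.data.frame(pairs)
}

cost <- matrix(c(1,   2,
                 2, 100), nrow = 2, byrow = TRUE)

greedy <- greedy_sorted(cost)
greedy_total  <- sum(greedy$cost)          # 1 + 100 = 101
optimal_total <- cost[1, 2] + cost[2, 1]   # 2 + 2 = 4

c(greedy = greedy_total, optimal = optimal_total)
```

Taking the cheapest pair first forces the remaining units into an expensive pair. An optimal solver accepts two slightly worse pairs to lower the total, which is the trade-off the table summarizes.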

Advanced Features

Distance Caching

Precompute distances for rapid experimentation:

# Compute once
dist_obj <- compute_distances(treated, control, vars = c("age", "income"))

# Reuse with different constraints
result1 <- match_couples(dist_obj, max_distance = 0.3)
result2 <- match_couples(dist_obj, max_distance = 0.5)
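
The idea behind caching can be sketched in base R: the treated-by-control distance matrix is computed once, and each constraint is then applied to a copy of it. The toy values, the Euclidean metric, and marking forbidden pairs with `Inf` are assumptions for illustration. couplr may represent forbidden pairs differently, for example with a large finite cost.

```r
# Toy covariates, assumed to be already scaled (hypothetical values)
treated <- data.frame(age = c(0.1, 0.5, 0.9), income = c(0.2, 0.4, 0.8))
control <- data.frame(age = c(0.0, 0.6, 0.7, 1.0), income = c(0.3, 0.5, 0.1, 0.9))

# Compute the treated-by-control Euclidean distance matrix once
n_t  <- nrow(treated)
n_c  <- nrow(control)
full <- as.matrix(dist(rbind(treated, control)))
d    <- full[seq_len(n_t), n_t + seq_len(n_c), drop = FALSE]

# Applying a threshold is cheap: pairs beyond the limit become forbidden
forbid_beyond <- function(d, max_distance) {
  d[d > max_distance] <- Inf
  d
}

d_03 <- forbid_beyond(d, 0.3)
d_05 <- forbid_beyond(d, 0.5)

# Number of pairs still allowed under each constraint
c(max_0.3 = sum(is.finite(d_03)), max_0.5 = sum(is.finite(d_05)))
```

The pairwise distance computation is the expensive step, so keeping `d` and re-thresholding it avoids repeating that work for every constraint setting.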

Parallel Processing

Speed up blocked matching with multi-core processing:

result <- match_couples(
  treated, control,
  vars = c("age", "income"),
  block_by = "site",
  parallel = TRUE
)
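
Blocking parallelizes well because each block is an independent sub-problem. The base R sketch below builds one cost matrix per site from hypothetical data. The absolute age difference is an assumed distance, and `block_cost()` is a hypothetical helper, not a couplr function.

```r
# Hypothetical inputs with a `site` blocking variable
treated <- data.frame(
  id   = 1:4,
  site = c("A", "A", "B", "B"),
  age  = c(30, 50, 40, 60)
)
control <- data.frame(
  id   = 1:6,
  site = c("A", "A", "A", "B", "B", "B"),
  age  = c(31, 52, 70, 38, 59, 20)
)

# Cost matrix for one block: absolute age difference,
# rows = treated units, columns = control units from the same site
block_cost <- function(site) {
  t_b <- treated[treated$site == site, ]
  c_b <- control[control$site == site, ]
  cost <- abs(outer(t_b$age, c_b$age, "-"))
  dimnames(cost) <- list(treated = t_b$id, control = c_b$id)
  cost
}

# Only sites present in both groups can produce matches
sites <- sort(intersect(treated$site, control$site))
block_costs <- lapply(setNames(sites, sites), block_cost)
block_costs
```

No cross-site pair appears in any matrix, so the blocks can be solved in any order. That independence is what lets the `lapply()` call here be replaced by a parallel map.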

Pixel Morphing

Align images pixel-by-pixel using optimal assignment:

morph <- pixel_morph(image_a, image_b)
pixel_morph_animate(morph, "output.gif")

Documentation

Support

"Software is like sex: it's better when it's free." — Linus Torvalds

I'm a PhD student who builds R packages in my free time because I believe good tools should be free and open. I started these projects for my own work and figured others might find them useful too.

If this package saved you some time, buying me a coffee is a nice way to say thanks. It helps with my coffee addiction.

License

MIT (see the LICENSE file)

Citation

@software{couplr,
  author = {Colling, Gilles},
  title = {couplr: Optimal Matching via Linear Assignment},
  year = {2026},
  url = {https://github.com/gcol33/couplr}
}

Version

1.1.0

License

MIT + file LICENSE

Maintainer

Gilles Colling

Last Published

March 3rd, 2026

Functions in couplr (1.1.0)

can_parallelize

Check if parallel processing is available
build_cost_matrix

Build cost matrix for matching
cardinality_match

Cardinality Matching
augment

Generic Augment Function
.pair_var_diffs

Compute per-pair differences on a single variable
couplr-package

couplr: Optimal Pairing and Matching via Linear Assignment
.couples_single

Shared single matching implementation
example_df

Example assignment problem data frame
compute_distances

Compute and Cache Distance Matrix for Reuse
.rosenbaum_bounds

Compute Rosenbaum bounds via normal approximation
calculate_var_balance

Calculate Variable-Level Balance Statistics
couplr_inform

Info message with emoji
detect_blocking

Detect and validate blocking
check_cost_distribution

Check cost distribution for problems
greedy_couples

Fast approximate matching using greedy algorithm
check_full_matching

Check if full matching was achieved
couplr_warn

Warn with a fun, themed warning message
couplr_emoji

Get a themed emoji
count_valid_pairs

Count valid pairs in cost matrix
check_variable_health

Check variable health for matching
extract_ids

Extract and standardize IDs from data frames
compute_distance_matrix

Compute pairwise distance matrix
couplr_stop

Stop with a fun, themed error message
couplr_messages

Couplr message helpers with emoji and humor
get_total_cost

Extract total cost from assignment result
couplr_success

Success message with emoji
err_invalid_param

Invalid parameter error
hospital_staff

Hospital staff scheduling example dataset
extract_matching_vars

Extract matching variables from data frame
.couples_blocked

Shared blocked matching implementation
is_lap_solve_result

Check if object is an assignment result
greedy_couples_blocked

Greedy matching with blocking
get_block_id_column

Standardize block ID column name
err_no_valid_pairs

All pairs forbidden error
filter_blocks

Filter blocks based on size and balance criteria
example_costs

Example cost matrices for assignment problems
.couples_replace

Replacement matching: each left picks its best right independently
.couples_from_distance

Shared matching from precomputed distance object
lap_solve

Solve linear assignment problems
get_method_used

Extract method used from assignment result
.couples_ratio

k:1 matching via cost matrix expansion
lap_solve_batch

Solve multiple assignment problems efficiently
greedy_couples_from_distance

Greedy Matching from Precomputed Distance Object
is_distance_object

Check if Object is a Distance Object
greedy_couples_single

Greedy matching without blocking
info_low_match_rate

Low match rate info
group_by

Re-export of dplyr::group_by
sensitivity_analysis

Rosenbaum Sensitivity Analysis
pixel_morph

Pixel-level image morphing (final frame only)
join_matched

Join Matched Pairs with Original Data
pixel_morph_animate

Pixel-level image morphing (animation)
match_couples_from_distance

Match from Precomputed Distance Object
match_couples_single

Match without blocking (single problem)
sinkhorn_to_assignment

Round 'Sinkhorn' transport plan to hard assignment
is_lap_solve_batch_result

Check if object is a batch assignment result
greedy_blocks_parallel

Greedy match blocks in parallel
has_blocks

Check if data frame has blocking information
setup_parallel

Setup parallel processing with future
has_valid_pairs

Check if any valid pairs exist
print.variable_health

Print method for variable health
match_blocks_parallel

Match blocks in parallel
.autoplot_variance

Variance ratio plot via ggplot2
.autoplot_love

Love plot via ggplot2
diagnose_distance_matrix

Diagnose distance matrix and suggest fixes
summary.balance_diagnostics

Summary method for balance diagnostics
summary.distance_object

Summary Method for Distance Objects
warn_parallel_unavailable

Parallel package missing warning (reuse from matching_parallel.R)
success_good_balance

Perfect balance success message
standardized_difference

Calculate Standardized Difference
mark_forbidden_pairs

Mark forbidden pairs
validate_matching_inputs

Validate matching inputs
validate_cost_data

Validate and prepare cost data
plot.sensitivity_analysis

Plot method for sensitivity analysis (base graphics)
update_constraints

Update Constraints on Distance Object
summary.sensitivity_analysis

Summary method for sensitivity analysis
print.sensitivity_analysis

Print method for sensitivity analysis
preprocess_matching_vars

Preprocess matching variables with automatic checks and scaling
.autoplot_hist

Histogram of |std diff| via ggplot2
plot.matching_result

Plot method for matching results
is_lap_solve_kbest_result

Check if object is a k-best assignment result
matchmaker

Create blocks for stratified matching
err_missing_vars

Missing variables error
warn_poor_quality

High distance matches warning
parallel_lapply

Parallel lapply using future
plot.balance_diagnostics

Plot method for balance diagnostics
warn_many_zeros

Too many zeros warning
use_emoji

Check if emoji should be used
warn_many_forbidden

Many forbidden pairs warning
validate_calipers

Validate calipers parameter
lap_solve_line_metric

Solve 1-D Line Assignment Problem
print.balance_diagnostics

Print Method for Balance Diagnostics
lap_solve_kbest

Find k-best optimal assignments
print.distance_object

Print Method for Distance Objects
.compute_pair_balance

Compute standardized differences for current pairs
sinkhorn

'Sinkhorn-Knopp' optimal transport solver
ps_match

Propensity Score Matching
.blocks_parallel

Shared parallel block matching implementation
restore_parallel

Restore original parallel plan
print.lap_solve_result

Print method for assignment results
print.matching_result

Print method for matching results
warn_extreme_costs

Extreme cost ratio warning
summary.lap_solve_kbest_result

Get summary of k-best results
print.lap_solve_batch_result

Print method for batch assignment results
summary.matching_result

Summary method for matching results
match_couples

Optimal matching using linear assignment
print.matchmaker_result

Print method for matchmaker results
match_couples_blocked

Match with blocking (multiple problems)
err_missing_data

Missing data error
print.lap_solve_kbest_result

Print method for k-best assignment results
warn_constant_var

Constant variable warning
validate_weights

Validate weights parameter
print.preprocessing_result

Print method for preprocessing result
warn_constant_distance

All distances identical warning
summarize_blocks

Summarize block structure
suggest_scaling

Suggest scaling method based on variable characteristics
as_assignment_matrix

Convert assignment result to a binary matrix
apply_max_distance

Apply maximum distance constraint
assignment

Linear assignment solver
augment.matching_result

Augment Matching Results with Original Data (broom-style)
assignment_duals

Solve assignment problem and return dual variables
autoplot.matching_result

ggplot2 autoplot for matching results
balance_diagnostics

Balance Diagnostics for Matched Pairs
apply_all_constraints

Apply all constraints to cost matrix
auto_encode_categorical

Automatically encode categorical variables
apply_scaling

Apply scaling to matching variables
apply_weights

Apply weights to matching variables
assign_blocks_cluster

Assign blocks using clustering
BIG_COST

Large value for forbidden pairs
assign_blocks_group

Assign blocks based on grouping variable(s)
apply_calipers

Apply caliper constraints
balance_table

Create Balance Table
bottleneck_assignment

Solve the Bottleneck Assignment Problem
autoplot.sensitivity_analysis

ggplot2 autoplot for sensitivity analysis
autoplot.balance_diagnostics

ggplot2 autoplot for balance diagnostics