couplr

Optimal Pairing and Matching via Linear Assignment

The couplr package provides high-level functions for optimal one-to-one matching between two groups. Whether you need to pair treatment and control units, assign workers to tasks, or align images pixel-by-pixel, couplr offers fast, deterministic solutions with automatic preprocessing and balance diagnostics.

Quick Start

library(couplr)

# Match treatment and control groups on covariates
result <- match_couples(
  treated, control,
  vars = c("age", "income", "education"),
  auto_scale = TRUE
)

# Check covariate balance
balance_diagnostics(result, treated, control, vars = c("age", "income", "education"))

# Get analysis-ready dataset
matched_data <- join_matched(result, treated, control)

Statement of Need

Optimal matching is central to experimental design, causal inference, and resource allocation. Existing R packages such as MatchIt and optmatch are oriented mainly toward propensity score workflows, in which users typically estimate scores before matching. This adds complexity and can obscure the direct relationship between covariates and match quality.

This package addresses direct covariate matching: selecting optimal pairs based on observed variables without intermediate modeling. It provides:

  • 18 LAP algorithms for different problem sizes and structures,
  • automatic preprocessing with variable health checks,
  • balance diagnostics for assessing match quality,
  • analysis-ready joined output.

These features make the package useful in domains like:

  • causal inference (matching treated/control units),
  • experimental design (pairing samples for within-pair comparisons),
  • resource allocation (assigning workers to tasks),
  • image processing (pixel-level morphing and alignment).

Features

High-Level Matching Functions

  • match_couples(): Optimal one-to-one matching
    • Automatic preprocessing with variable health checks
    • Multiple scaling methods: robust (MAD), standardize (SD), range
    • Distance constraints via max_distance and calipers
    • Blocking support for stratified matching
  • greedy_couples(): Fast approximate matching
    • Three strategies: sorted, row_best, pq (priority queue)
    • 10-100x faster than optimal matching for large datasets
    • Same preprocessing and constraint options
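
The three scaling methods named above can be illustrated in base R. This is a minimal sketch under stated assumptions, not couplr's internal code: scale_var() is a hypothetical helper, and the formulas are the textbook definitions (for example, mad() with R's default consistency constant), which may differ from what the package does.

```r
# Hypothetical helper illustrating three common scaling schemes.
# Assumption: textbook definitions, not couplr's exact internals.
scale_var <- function(x, method = c("robust", "standardize", "range")) {
  method <- match.arg(method)
  switch(method,
    robust      = (x - median(x)) / mad(x),          # center by median, scale by MAD
    standardize = (x - mean(x)) / sd(x),             # z-score: center by mean, scale by SD
    range       = (x - min(x)) / (max(x) - min(x))   # rescale to the interval [0, 1]
  )
}

income <- c(20, 25, 30, 35, 400)  # one extreme value

round(scale_var(income, "standardize"), 2)
round(scale_var(income, "robust"), 2)
round(scale_var(income, "range"), 2)
```

With a single extreme value, the z-score and range versions squeeze the four typical observations into a narrow band, while the median/MAD version keeps them spread out. That is the usual argument for robust scaling before computing distances.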

Balance Diagnostics

  • balance_diagnostics(): Comprehensive balance assessment
    • Standardized differences, variance ratios, KS tests
    • Quality thresholds: <0.1 excellent, 0.1-0.25 good, 0.25-0.5 acceptable
    • Per-block statistics when blocking is used
    • Publication-ready tables via balance_table()
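
To make the balance statistics concrete, here is a base R sketch of a standardized difference and a variance ratio for one covariate. The helpers std_diff() and var_ratio() are hypothetical names, and the pooled-SD formula is one common definition; the exact formula used by balance_diagnostics() is an assumption here.

```r
# Hypothetical helpers; assumption: pooled-SD definition of the
# standardized difference, which may differ from couplr's own formula.
std_diff <- function(x_treated, x_control) {
  pooled_sd <- sqrt((var(x_treated) + var(x_control)) / 2)
  (mean(x_treated) - mean(x_control)) / pooled_sd
}

var_ratio <- function(x_treated, x_control) {
  var(x_treated) / var(x_control)
}

# Toy matched sample: ages of treated units and of their matched controls
age_treated <- c(34, 41, 29, 52, 47)
age_control <- c(33, 43, 30, 50, 46)

std_diff(age_treated, age_control)
var_ratio(age_treated, age_control)

# Two-sample Kolmogorov-Smirnov test from the stats package
ks.test(age_treated, age_control)$p.value
```

An absolute standardized difference below 0.1 would fall in the "excellent" band listed above; a variance ratio near 1 indicates similar spread in the two groups.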

Low-Level LAP Solving

  • lap_solve(): Tidy interface for LAP algorithms
    • 18 solvers: Hungarian, Jonker-Volgenant, Auction, Network Simplex, etc.
    • Automatic method selection via method = "auto"
    • Supports rectangular matrices and forbidden assignments
  • lap_solve_batch(): Batch solving for multiple matrices
  • lap_solve_kbest(): K-best solutions via Murty's algorithm

Installation

# Install from CRAN
install.packages("couplr")

# Or install development version from GitHub
# install.packages("pak")
pak::pak("gcol33/couplr")

Usage Examples

Optimal Matching (match_couples)

library(couplr)

# Basic matching with automatic scaling
result <- match_couples(
  treated, control,
  vars = c("age", "income"),
  auto_scale = TRUE
)

# With distance constraint
result <- match_couples(
  treated, control,
  vars = c("age", "income"),
  auto_scale = TRUE,
  max_distance = 0.5
)

# With blocking (exact matching on site)
result <- match_couples(
  treated, control,
  vars = c("age", "income"),
  block_by = "site",
  auto_scale = TRUE
)

# Check what was matched
result$pairs

Greedy Matching (greedy_couples)

# Fast matching for large datasets
result <- greedy_couples(
  treated, control,
  vars = c("age", "income"),
  strategy = "row_best",
  auto_scale = TRUE
)

# Priority queue strategy (often best quality)
result <- greedy_couples(
  treated, control,
  vars = c("age", "income"),
  strategy = "pq"
)
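
The idea behind a sorted greedy strategy can be sketched in a few lines of base R: visit all candidate pairs from cheapest to most expensive and accept a pair whenever both units are still free. greedy_sorted() is a hypothetical function written for illustration; it conveys the general technique only and is not couplr's implementation.

```r
# Hypothetical sketch of a "sorted" greedy strategy.
# Assumption: mirrors the general idea only, not couplr's code.
greedy_sorted <- function(cost) {
  pairs <- expand.grid(row = seq_len(nrow(cost)), col = seq_len(ncol(cost)))
  pairs$cost <- cost[cbind(pairs$row, pairs$col)]
  pairs <- pairs[order(pairs$cost), ]   # cheapest candidate pairs first

  row_used <- rep(FALSE, nrow(cost))
  col_used <- rep(FALSE, ncol(cost))
  keep <- logical(nrow(pairs))

  for (i in seq_len(nrow(pairs))) {
    r <- pairs$row[i]
    k <- pairs$col[i]
    if (!row_used[r] && !col_used[k]) {  # accept only if both units are free
      keep[i] <- TRUE
      row_used[r] <- TRUE
      col_used[k] <- TRUE
    }
  }
  pairs[keep, ]
}

cost <- matrix(c(1, 2,
                 2, 10), nrow = 2, byrow = TRUE)

g <- greedy_sorted(cost)
sum(g$cost)               # greedy total: 1 + 10
cost[1, 2] + cost[2, 1]   # the alternative assignment: 2 + 2
```

In this toy matrix the greedy pass grabs the cheapest pair first and is then forced into the most expensive one, which is why greedy matching is described as approximate.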

Low-Level LAP Solving

# Solve a cost matrix
cost <- matrix(c(4, 2, 8, 4, 3, 7, 3, 1, 6), nrow = 3, byrow = TRUE)
result <- lap_solve(cost)
result$assignment
result$total_cost

# Choose a specific algorithm
result <- lap_solve(cost, method = "hungarian")

# K-best solutions
results <- lap_solve_kbest(cost, k = 3)
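
For intuition about what an optimal and a k-best solution mean, the same 3 x 3 cost matrix can be solved by brute force in base R. This sketch is independent of couplr: all_perms() and brute_force_lap() are hypothetical helpers that enumerate every assignment, which is only feasible for very small problems.

```r
# Hypothetical brute-force solver for tiny square problems.
# Enumerates every permutation of columns; impractical beyond small n.
all_perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in all_perms(v[-i])) {
      out[[length(out) + 1]] <- c(v[i], p)
    }
  }
  out
}

brute_force_lap <- function(cost) {
  n <- nrow(cost)
  perms <- all_perms(seq_len(n))
  # Total cost when row i is assigned to column p[i]
  totals <- vapply(perms, function(p) sum(cost[cbind(seq_len(n), p)]), numeric(1))
  ord <- order(totals)
  data.frame(
    assignment = vapply(perms[ord], paste, character(1), collapse = "-"),
    total_cost = totals[ord]
  )
}

cost <- matrix(c(4, 2, 8, 4, 3, 7, 3, 1, 6), nrow = 3, byrow = TRUE)
ranked <- brute_force_lap(cost)
head(ranked, 3)   # the three cheapest assignments
```

For this matrix the minimum total cost is 12, and three different assignments reach it. Ties like this are one situation where asking for several best solutions is more informative than a single answer.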

Choosing Between match_couples and greedy_couples

Feature                 match_couples()       greedy_couples()
Optimality              Guaranteed optimal    Approximate
Speed                   O(n^3)                O(n^2) or better
Best for                n < 5000              n > 5000
Supports constraints?   Yes                   Yes
Supports blocking?      Yes                   Yes

Tip: Start with match_couples(). Switch to greedy_couples() if runtime is too long.

Advanced Features

Distance Caching

Precompute distances for rapid experimentation:

# Compute once
dist_obj <- compute_distances(treated, control, vars = c("age", "income"))

# Reuse with different constraints
result1 <- match_couples(dist_obj, max_distance = 0.3)
result2 <- match_couples(dist_obj, max_distance = 0.5)
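
The benefit of caching is that the expensive step, the pairwise distance computation, happens once, while constraints are cheap filters applied afterwards. The base R sketch below illustrates that separation; pairwise_dist() is a hypothetical helper using Euclidean distance, and the toy data frames are made up for the example.

```r
# Hypothetical helper: Euclidean distances between rows of two data frames.
pairwise_dist <- function(a, b, vars) {
  A <- as.matrix(a[, vars, drop = FALSE])
  B <- as.matrix(b[, vars, drop = FALSE])
  # Full distance matrix over the stacked rows, then keep the A-by-B block
  d <- as.matrix(dist(rbind(A, B)))
  d[seq_len(nrow(A)), nrow(A) + seq_len(nrow(B)), drop = FALSE]
}

treated <- data.frame(age = c(0.1, 0.5, 0.9), income = c(0.2, 0.4, 0.8))
control <- data.frame(age = c(0.2, 0.6, 0.7, 1.0), income = c(0.1, 0.5, 0.9, 0.3))

# Compute once ...
d <- pairwise_dist(treated, control, c("age", "income"))

# ... then reuse under different thresholds without recomputing distances
allowed_03 <- d <= 0.3
allowed_05 <- d <= 0.5

sum(allowed_03)  # number of admissible pairs at max distance 0.3
sum(allowed_05)  # number of admissible pairs at max distance 0.5
```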

Parallel Processing

Speed up blocked matching with multi-core processing:

result <- match_couples(
  treated, control,
  vars = c("age", "income"),
  block_by = "site",
  parallel = TRUE
)
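
Blocked matching parallelizes well because each block is an independent problem. The base R sketch below shows that structure with a plain lapply(); match_block() is a hypothetical stand-in matcher (pairing units in order of age), and the toy data are invented for the example. How couplr distributes blocks internally is not shown here.

```r
# Toy data: two sites, equal numbers of treated and control units per site
treated <- data.frame(id = 1:4, site = c("A", "A", "B", "B"),
                      age = c(30, 50, 25, 60))
control <- data.frame(id = 5:8, site = c("A", "A", "B", "B"),
                      age = c(48, 33, 58, 27))

# Hypothetical stand-in matcher: pair units in order of age within one block.
# Assumes the block has the same number of treated and control units.
match_block <- function(tr, co) {
  tr <- tr[order(tr$age), ]
  co <- co[order(co$age), ]
  data.frame(site = tr$site, treated_id = tr$id, control_id = co$id,
             distance = abs(tr$age - co$age))
}

# Each block is solved independently, so this lapply() is the step that a
# parallel backend could distribute across cores.
sites <- intersect(treated$site, control$site)
per_block <- lapply(sites, function(s) {
  match_block(treated[treated$site == s, ], control[control$site == s, ])
})
matched <- do.call(rbind, per_block)
matched
```

No unit is ever paired across sites, which is what exact matching on the blocking variable means.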

Pixel Morphing

Align images pixel-by-pixel using optimal assignment:

morph <- pixel_morph(image_a, image_b)
pixel_morph_animate(morph, "output.gif")

Documentation

Support

"Software is like sex: it's better when it's free." — Linus Torvalds

I'm a PhD student who builds R packages in my free time because I believe good tools should be free and open. I started these projects for my own work and figured others might find them useful too.

If this package saved you some time, buying me a coffee is a nice way to say thanks. It helps with my coffee addiction.

License

MIT (see the LICENSE file)

Citation

@software{couplr,
  author = {Colling, Gilles},
  title = {couplr: Optimal Matching via Linear Assignment},
  year = {2026},
  url = {https://github.com/gcol33/couplr}
}

Version

1.0.10

License

MIT + file LICENSE

Maintainer

Gilles Colling

Last Published

January 21st, 2026

Functions in couplr (1.0.10)

  • compute_distances: Compute and Cache Distance Matrix for Reuse
  • check_variable_health: Check variable health for matching
  • check_cost_distribution: Check cost distribution for problems
  • count_valid_pairs: Count valid pairs in cost matrix
  • couplr_emoji: Get a themed emoji
  • compute_distance_matrix: Compute pairwise distance matrix
  • diagnose_distance_matrix: Diagnose distance matrix and suggest fixes
  • couplr_warn: Warn with a fun, themed warning message
  • couplr_success: Success message with emoji
  • err_missing_data: Missing data error
  • couplr_stop: Stop with a fun, themed error message
  • err_invalid_param: Invalid parameter error
  • err_missing_vars: Missing variables error
  • detect_blocking: Detect and validate blocking
  • err_no_valid_pairs: All pairs forbidden error
  • couplr-package: couplr: Optimal Pairing and Matching via Linear Assignment
  • couplr_messages: Couplr message helpers with emoji and humor
  • check_full_matching: Check if full matching was achieved
  • get_method_used: Extract method used from assignment result
  • get_total_cost: Extract total cost from assignment result
  • is_lap_solve_batch_result: Check if object is a batch assignment result
  • group_by: Re-export of dplyr::group_by
  • greedy_couples_from_distance: Greedy Matching from Precomputed Distance Object
  • greedy_blocks_parallel: Greedy match blocks in parallel
  • example_df: Example assignment problem data frame
  • info_low_match_rate: Low match rate info
  • example_costs: Example cost matrices for assignment problems
  • get_block_id_column: Standardize block ID column name
  • extract_ids: Extract and standardize IDs from data frames
  • is_distance_object: Check if Object is a Distance Object
  • is_lap_solve_kbest_result: Check if object is a k-best assignment result
  • greedy_couples_blocked: Greedy matching with blocking
  • couplr_inform: Info message with emoji
  • extract_matching_vars: Extract matching variables from data frame
  • filter_blocks: Filter blocks based on size and balance criteria
  • greedy_couples: Fast approximate matching using greedy algorithm
  • match_couples: Optimal matching using linear assignment
  • lap_solve: Solve linear assignment problems
  • has_valid_pairs: Check if any valid pairs exist
  • has_blocks: Check if data frame has blocking information
  • mark_forbidden_pairs: Mark forbidden pairs
  • greedy_couples_single: Greedy matching without blocking
  • hospital_staff: Hospital staff scheduling example dataset
  • lap_solve_kbest: Find k-best optimal assignments
  • lap_solve_batch: Solve multiple assignment problems efficiently
  • lap_solve_line_metric: Solve 1-D Line Assignment Problem
  • match_couples_blocked: Match with blocking (multiple problems)
  • match_blocks_parallel: Match blocks in parallel
  • is_lap_solve_result: Check if object is an assignment result
  • join_matched: Join Matched Pairs with Original Data
  • match_couples_from_distance: Match from Precomputed Distance Object
  • pixel_morph_animate: Pixel-level image morphing (animation)
  • plot.matching_result: Plot method for matching results
  • plot.balance_diagnostics: Plot method for balance diagnostics
  • pixel_morph: Pixel-level image morphing (final frame only)
  • matchmaker: Create blocks for stratified matching
  • print.variable_health: Print method for variable health
  • preprocess_matching_vars: Preprocess matching variables with automatic checks and scaling
  • print.balance_diagnostics: Print Method for Balance Diagnostics
  • print.matchmaker_result: Print method for matchmaker results
  • parallel_lapply: Parallel lapply using future
  • match_couples_single: Match without blocking (single problem)
  • %>%: Pipe operator
  • restore_parallel: Restore original parallel plan
  • summary.balance_diagnostics: Summary method for balance diagnostics
  • sinkhorn_to_assignment: Round 'Sinkhorn' transport plan to hard assignment
  • summary.distance_object: Summary Method for Distance Objects
  • summarize_blocks: Summarize block structure
  • print.lap_solve_batch_result: Print method for batch assignment results
  • print.lap_solve_kbest_result: Print method for k-best assignment results
  • print.matching_result: Print method for matching results
  • print.lap_solve_result: Print method for assignment results
  • standardized_difference: Calculate Standardized Difference
  • print.preprocessing_result: Print method for preprocessing result
  • sinkhorn: 'Sinkhorn-Knopp' optimal transport solver
  • use_emoji: Check if emoji should be used
  • setup_parallel: Setup parallel processing with future
  • summary.lap_solve_kbest_result: Get summary of k-best results
  • validate_matching_inputs: Validate matching inputs
  • validate_cost_data: Validate and prepare cost data
  • validate_calipers: Validate calipers parameter
  • warn_poor_quality: High distance matches warning
  • warn_parallel_unavailable: Parallel package missing warning (reuse from matching_parallel.R)
  • warn_many_forbidden: Many forbidden pairs warning
  • warn_many_zeros: Too many zeros warning
  • warn_constant_distance: All distances identical warning
  • validate_weights: Validate weights parameter
  • summary.matching_result: Summary method for matching results
  • update_constraints: Update Constraints on Distance Object
  • print.distance_object: Print Method for Distance Objects
  • suggest_scaling: Suggest scaling method based on variable characteristics
  • success_good_balance: Perfect balance success message
  • warn_extreme_costs: Extreme cost ratio warning
  • warn_constant_var: Constant variable warning
  • apply_scaling: Apply scaling to matching variables
  • as_assignment_matrix: Convert assignment result to a binary matrix
  • assignment: Linear assignment solver
  • BIG_COST: Large value for forbidden pairs
  • assign_blocks_group: Assign blocks based on grouping variable(s)
  • apply_weights: Apply weights to matching variables
  • apply_calipers: Apply caliper constraints
  • apply_all_constraints: Apply all constraints to cost matrix
  • apply_max_distance: Apply maximum distance constraint
  • assign_blocks_cluster: Assign blocks using clustering
  • assignment_duals: Solve assignment problem and return dual variables
  • augment: Generic Augment Function
  • auto_encode_categorical: Automatically encode categorical variables
  • bottleneck_assignment: Solve the Bottleneck Assignment Problem
  • balance_table: Create Balance Table
  • balance_diagnostics: Balance Diagnostics for Matched Pairs
  • augment.matching_result: Augment Matching Results with Original Data (broom-style)
  • calculate_var_balance: Calculate Variable-Level Balance Statistics
  • can_parallelize: Check if parallel processing is available
  • build_cost_matrix: Build cost matrix for matching