Title: Optimal Pairing and Matching via Linear Assignment
Version: 1.0.6
Description: Solves optimal pairing and matching problems using linear assignment algorithms. Provides implementations of the Hungarian method (Kuhn 1955) <doi:10.1002/nav.3800020109>, Jonker-Volgenant shortest path algorithm (Jonker and Volgenant 1987) <doi:10.1007/BF02278710>, Auction algorithm (Bertsekas 1988) <doi:10.1007/BF02186476>, cost-scaling (Goldberg and Kennedy 1995) <doi:10.1007/BF01585996>, scaling algorithms (Gabow and Tarjan 1989) <doi:10.1137/0218069>, push-relabel (Goldberg and Tarjan 1988) <doi:10.1145/48014.61051>, and Sinkhorn entropy-regularized transport (Cuturi 2013) <doi:10.48550/arxiv.1306.0895>. Designed for matching plots, sites, samples, or any pairwise optimization problem. Supports rectangular matrices, forbidden assignments, data frame inputs, batch solving, k-best solutions, and pixel-level image morphing for visualization. Includes automatic preprocessing with variable health checks, multiple scaling methods (standardized, range, robust), greedy matching algorithms, and comprehensive balance diagnostics for assessing match quality using standardized differences and distribution comparisons.
License: MIT + file LICENSE
Language: en-US
Encoding: UTF-8
RoxygenNote: 7.3.3
Depends: R (≥ 4.1.0)
Imports: Rcpp (≥ 1.0.0), tibble (≥ 3.0.0), dplyr (≥ 1.0.0), rlang (≥ 0.4.0), purrr (≥ 0.3.0), magrittr (≥ 2.0.0), methods
Suggests: testthat (≥ 3.0.0), xml2, e1071, R.utils, microbenchmark, withr, knitr, rmarkdown, bench, parallel, future (≥ 1.20.0), future.apply (≥ 1.8.0), ggplot2, ggraph, tidygraph, magick, OpenImageR, farver, av, reticulate, png, combinat
LinkingTo: Rcpp, RcppEigen, testthat
SystemRequirements: C++17
LazyData: true
VignetteBuilder: knitr
URL: https://gillescolling.com/couplr/, https://github.com/gcol33/couplr
BugReports: https://github.com/gcol33/couplr/issues
Config/testthat/edition: 3
Config/testthat/parallel: true
NeedsCompilation: yes
Packaged: 2026-01-14 21:35:36 UTC; Gilles Colling
Author: Gilles Colling [aut, cre, cph]
Maintainer: Gilles Colling <gilles.colling051@gmail.com>
Repository: CRAN
Date/Publication: 2026-01-20 10:30:13 UTC

couplr: Optimal Pairing and Matching via Linear Assignment

Description

Solves optimal pairing and matching problems using linear assignment algorithms. Provides implementations of the Hungarian method (Kuhn 1955) doi:10.1002/nav.3800020109, Jonker-Volgenant shortest path algorithm (Jonker and Volgenant 1987) doi:10.1007/BF02278710, Auction algorithm (Bertsekas 1988) doi:10.1007/BF02186476, cost-scaling (Goldberg and Kennedy 1995) doi:10.1007/BF01585996, scaling algorithms (Gabow and Tarjan 1989) doi:10.1137/0218069, push-relabel (Goldberg and Tarjan 1988) doi:10.1145/48014.61051, and Sinkhorn entropy-regularized transport (Cuturi 2013) doi:10.48550/arxiv.1306.0895. Designed for matching plots, sites, samples, or any pairwise optimization problem. Supports rectangular matrices, forbidden assignments, data frame inputs, batch solving, k-best solutions, and pixel-level image morphing for visualization. Includes automatic preprocessing with variable health checks, multiple scaling methods (standardized, range, robust), greedy matching algorithms, and comprehensive balance diagnostics for assessing match quality using standardized differences and distribution comparisons.

Solves optimal pairing and matching problems using linear assignment algorithms. Designed for matching plots, sites, samples, or any pairwise optimization problem. Provides modern, tidy implementations of 'Hungarian', 'Jonker-Volgenant', 'Auction', and other LAP solvers.

Main functions

Author(s)

Maintainer: Gilles Colling gilles.colling051@gmail.com [copyright holder]

See Also

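Useful links:

  • https://gillescolling.com/couplr/

  • https://github.com/gcol33/couplr

  • Report bugs at https://github.com/gcol33/couplr/issues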


Pipe operator

Description

See magrittr::%>% for details.

Usage

lhs %>% rhs

Arguments

lhs

A value or the magrittr placeholder.

rhs

A function call using the magrittr semantics.

Value

The result of calling rhs(lhs).


Large value for forbidden pairs

Description

A numeric constant used to mark forbidden pairs in cost matrices.

Usage

BIG_COST

Format

Numeric value (half of .Machine$double.xmax).
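
A minimal base R sketch of how a constant of this size can flag forbidden pairs. The local name big stands in for BIG_COST and the matrix is made up for illustration; how the package itself uses the constant internally is not shown here.

```r
# Stand-in for BIG_COST: half of the largest finite double.
big <- .Machine$double.xmax / 2

cost <- matrix(c(1, 4,
                 3, 2), nrow = 2, byrow = TRUE)
cost[1, 2] <- big                 # forbid pairing row 1 with column 2

forbidden <- cost >= big          # logical mask of forbidden pairs
n_valid   <- sum(!forbidden)      # number of pairs that remain usable

# Unlike Inf, the marker is finite, and two markers can be added
# without overflowing to Inf.
sum_is_finite <- is.finite(big + big)
```

Using a finite marker rather than Inf keeps ordinary arithmetic on the cost matrix well defined, which may be one reason for the "half of the maximum" choice.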


Apply all constraints to cost matrix

Description

Main entry point for applying constraints.

Usage

apply_all_constraints(
  cost_matrix,
  left,
  right,
  vars,
  max_distance = Inf,
  calipers = NULL,
  forbidden = NULL
)

Value

Modified cost matrix with all constraints applied.


Apply caliper constraints

Description

Calipers impose per-variable maximum absolute differences.

Usage

apply_calipers(cost_matrix, left, right, calipers, vars)

Value

Modified cost matrix with forbidden pairs marked.
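
The caliper rule can be sketched in base R as follows. The helper caliper_mask() is hypothetical and is not part of couplr; marking forbidden pairs with Inf is also an assumption (the package may use BIG_COST instead).

```r
# Hypothetical helper: TRUE where every variable is within its caliper.
caliper_mask <- function(left, right, calipers) {
  ok <- matrix(TRUE, nrow(left), nrow(right))
  for (v in names(calipers)) {
    abs_diff <- abs(outer(left[[v]], right[[v]], "-"))
    ok <- ok & (abs_diff <= calipers[[v]])
  }
  ok
}

left  <- data.frame(age = c(25, 40))
right <- data.frame(age = c(27, 33, 41))

mask <- caliper_mask(left, right, list(age = 3))

# Apply the mask to a simple absolute-difference cost matrix.
cost <- abs(outer(left$age, right$age, "-"))
cost[!mask] <- Inf                # pairs outside the caliper become forbidden
```

Here only the pairs (25, 27) and (40, 41) differ by at most 3 years, so every other entry is forbidden.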


Apply maximum distance constraint

Description

Apply maximum distance constraint

Usage

apply_max_distance(cost_matrix, max_distance = Inf)

Value

Modified cost matrix with forbidden pairs marked.


Apply scaling to matching variables

Description

Apply scaling to matching variables

Usage

apply_scaling(left_mat, right_mat, method = "standardize")

Value

List with scaled left/right matrices and scaling parameters.
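
For orientation, the three scaling methods named in the package description (standardized, range, robust) could look like this in base R. The helper scale_sketch() is hypothetical; computing the parameters on the pooled left and right rows, and using median and MAD for "robust", are assumptions rather than documented behaviour.

```r
# Hypothetical sketch of column-wise scaling with pooled parameters.
scale_sketch <- function(left_mat, right_mat, method = "standardize") {
  pooled <- rbind(left_mat, right_mat)
  params <- switch(method,
    standardize = list(center = colMeans(pooled),
                       spread = apply(pooled, 2, sd)),
    range       = list(center = apply(pooled, 2, min),
                       spread = apply(pooled, 2, max) - apply(pooled, 2, min)),
    robust      = list(center = apply(pooled, 2, median),
                       spread = apply(pooled, 2, mad)),
    stop("unknown method: ", method)
  )
  # Subtract the centre and divide by the spread, column by column.
  rescale <- function(m) {
    sweep(sweep(m, 2, params$center, "-"), 2, params$spread, "/")
  }
  list(left = rescale(left_mat), right = rescale(right_mat), params = params)
}

left_mat  <- cbind(x = c(1, 2, 3))
right_mat <- cbind(x = c(4, 5, 6))

std <- scale_sketch(left_mat, right_mat, "standardize")
rng <- scale_sketch(left_mat, right_mat, "range")
```

Applying the same parameters to both sides keeps left and right values on a common scale, which is what a distance computation needs.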


Apply weights to matching variables

Description

Apply weights to matching variables

Usage

apply_weights(mat, weights)

Value

Numeric matrix with columns weighted.


Convert assignment result to a binary matrix

Description

Turns a tidy assignment result back into a 0/1 assignment matrix.

Usage

as_assignment_matrix(x, n_sources = NULL, n_targets = NULL)

Arguments

x

An assignment result object of class lap_solve_result

n_sources

Number of source nodes, optional

n_targets

Number of target nodes, optional

Value

Integer matrix with 0 and 1 entries
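
The conversion itself is simple to sketch in base R. The helper to_binary_matrix() below is hypothetical and works on plain source and target index vectors rather than on a lap_solve_result object.

```r
# Hypothetical helper: build a 0/1 matrix from matched index pairs.
to_binary_matrix <- function(source, target,
                             n_sources = max(source),
                             n_targets = max(target)) {
  m <- matrix(0L, n_sources, n_targets)
  m[cbind(source, target)] <- 1L   # one entry per matched pair
  m
}

# Square case: sources 1, 2, 3 matched to targets 2, 1, 3.
m_square <- to_binary_matrix(source = 1:3, target = c(2, 1, 3))

# Rectangular case: give the dimensions explicitly so that unmatched
# targets still get a (zero) column.
m_rect <- to_binary_matrix(source = 1:3, target = c(2, 1, 3), n_targets = 5)
```

Passing the dimensions explicitly matters for rectangular problems, where the largest matched index can be smaller than the true number of targets.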


Assign blocks using clustering

Description

Assign blocks using clustering

Usage

assign_blocks_cluster(left, right, block_vars, method, n_blocks, ...)

Value

List with modified left/right data frames (with block_id) and n_blocks_initial.


Assign blocks based on grouping variable(s)

Description

Assign blocks based on grouping variable(s)

Usage

assign_blocks_group(left, right, block_by)

Value

List with modified left/right data frames (with block_id) and n_blocks_initial.
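
As a rough illustration of group-based blocking, a block_id column can be derived from one or more grouping variables in base R. The helper add_block_id() is hypothetical, and pasting values with "_" is an assumed labelling scheme, not necessarily the one couplr uses.

```r
# Hypothetical helper: combine grouping variables into a single block label.
add_block_id <- function(df, block_by) {
  df$block_id <- do.call(paste, c(unname(as.list(df[block_by])), sep = "_"))
  df
}

left <- data.frame(
  id     = 1:4,
  region = c("N", "N", "S", "S"),
  year   = c(2020, 2021, 2020, 2020)
)
right <- data.frame(
  id     = 5:7,
  region = c("N", "S", "S"),
  year   = c(2020, 2020, 2022)
)

left_b  <- add_block_id(left,  c("region", "year"))
right_b <- add_block_id(right, c("region", "year"))

# Blocks present on both sides are the only ones where matching can happen.
shared_blocks <- intersect(left_b$block_id, right_b$block_id)
```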


Linear assignment solver

Description

Solve the linear assignment problem (minimum- or maximum-cost matching) using several algorithms. Forbidden edges can be marked as NA or Inf.

Usage

assignment(
  cost,
  maximize = FALSE,
  method = c("auto", "jv", "hungarian", "auction", "auction_gs", "auction_scaled", "sap",
    "ssp", "csflow", "hk01", "bruteforce", "ssap_bucket", "cycle_cancel", "gabow_tarjan",
    "lapmod", "csa", "ramshaw_tarjan", "push_relabel", "orlin", "network_simplex"),
  auction_eps = NULL,
  eps = NULL
)

Arguments

cost

Numeric matrix; rows = tasks, columns = agents. NA or Inf entries are treated as forbidden assignments.

maximize

Logical; if TRUE, maximizes the total cost instead of minimizing.

method

Character string indicating the algorithm to use. Options:

General-purpose solvers:

  • "auto" — Automatic selection based on problem characteristics (default)

  • "jv" — 'Jonker-Volgenant', fast general-purpose O(n³)

  • "hungarian" — Classic 'Hungarian' algorithm O(n³)

Auction-based solvers:

  • "auction" — 'Bertsekas' auction with adaptive epsilon

  • "auction_gs" — 'Gauss-Seidel' variant, good for spatial structure

  • "auction_scaled" — 'Epsilon-scaling', fastest for large dense problems

Specialized solvers:

  • "sap" / "ssp" — Shortest augmenting path, handles sparsity well

  • "lapmod" — Sparse JV variant, faster when >50% …

  • "hk01" — 'Hopcroft-Karp' for binary (0/1) costs only

  • "ssap_bucket" — 'Dial' algorithm for integer costs

  • "line_metric" — O(n log n) for 1D assignment problems

  • "bruteforce" — Exact enumeration for tiny problems (n <= 8)

Advanced solvers:

  • "csa" — 'Goldberg-Kennedy' cost-scaling, often fastest for medium-large

  • "gabow_tarjan" — 'Gabow-Tarjan' bit-scaling with complementary slackness O(n³ log C)

  • "cycle_cancel" — Cycle-canceling with 'Karp' algorithm

  • "csflow" — Cost-scaling network flow

  • "network_simplex" — 'Network simplex' with spanning tree representation

  • "orlin" — 'Orlin-Ahuja' scaling O(sqrt(n) * m * log(nC))

  • "push_relabel" — 'Push-relabel' max-flow based solver

  • "ramshaw_tarjan" — 'Ramshaw-Tarjan', optimized for rectangular matrices (n != m)

auction_eps

Optional numeric epsilon for the 'Auction'/'Auction-GS' methods. If NULL, an internal default (e.g., 1e-9) is used.

eps

Deprecated. Use auction_eps. If provided and auction_eps is NULL, its value is used for auction_eps.

Details

method = "auto" selects an algorithm based on problem size/shape and data characteristics.

Benchmarks show 'Auction-scaled' and 'JV' are 100-1500x faster than 'Hungarian' at n=500.

Value

An object of class lap_solve_result, a list whose elements include match and total_cost.

See Also

Examples

cost <- matrix(c(4,2,5, 3,3,6, 7,5,4), nrow = 3, byrow = TRUE)
res  <- assignment(cost)
res$match; res$total_cost
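
For a matrix this small, the optimum can be checked independently by enumerating every assignment in base R. This is only a verification sketch (the permutations() helper is written here for illustration and is not a couplr function); it does not reflect how any of the listed solvers work.

```r
cost <- matrix(c(4,2,5, 3,3,6, 7,5,4), nrow = 3, byrow = TRUE)

# All orderings of a vector, generated recursively (tiny inputs only).
permutations <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (k in seq_along(v)) {
    for (rest in permutations(v[-k])) {
      out[[length(out) + 1]] <- c(v[k], rest)
    }
  }
  out
}

perms  <- permutations(1:3)
# Total cost of assigning row i to column p[i] for each permutation p.
totals <- vapply(perms, function(p) sum(cost[cbind(1:3, p)]), numeric(1))

best <- perms[[which.min(totals)]]
best          # 2 1 3
min(totals)   # 9
```

Enumeration grows factorially with the matrix size, which is why it is only practical for very small problems.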


Solve assignment problem and return dual variables

Description

Solves the linear assignment problem and returns dual potentials (u, v) in addition to the optimal matching. The dual variables provide an optimality certificate and enable sensitivity analysis.

Usage

assignment_duals(cost, maximize = FALSE)

Arguments

cost

Numeric matrix; rows = tasks, columns = agents. NA or Inf entries are treated as forbidden assignments.

maximize

Logical; if TRUE, maximizes the total cost instead of minimizing.

Details

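For a minimization problem, the dual variables satisfy the complementary slackness conditions:

  • u[i] + v[j] <= cost[i, j] for every allowed pair (i, j)

  • u[i] + v[j] = cost[i, j] for every assigned pair (i, j)

This implies that sum(u) + sum(v) = total_cost (strong duality).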

Applications of dual variables:

Value

A list with class "assignment_duals_result" whose elements include match, u, v, and total_cost.

See Also

assignment() for standard assignment without duals

Examples

cost <- matrix(c(4, 2, 5, 3, 3, 6, 7, 5, 4), nrow = 3, byrow = TRUE)
result <- assignment_duals(cost)

# Check optimality: u + v should equal cost for assigned pairs
for (i in 1:3) {
  j <- result$match[i]
  cat(sprintf("Row %d -> Col %d: u + v = %.2f, cost = %.2f\n",
              i, j, result$u[i] + result$v[j], cost[i, j]))
}

# Verify strong duality
cat("sum(u) + sum(v) =", sum(result$u) + sum(result$v), "\n")
cat("total_cost =", result$total_cost, "\n")

# Reduced costs (how much each cost must decrease for that pair to enter the solution)
reduced <- outer(result$u, result$v, "+")
reduced_cost <- cost - reduced
print(round(reduced_cost, 2))
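
The certificate idea can also be tried without the package. In the base R sketch below, the potentials u and v are hand-picked for this particular matrix (an assumption for illustration; assignment_duals() may return different but equally valid values), and the checks are the standard ones for a minimization problem.

```r
cost  <- matrix(c(4, 2, 5, 3, 3, 6, 7, 5, 4), nrow = 3, byrow = TRUE)
match <- c(2, 1, 3)        # optimal assignment for this matrix

# Hand-picked dual potentials (not package output).
u <- c(2, 3, 4)
v <- c(0, 0, 0)

reduced <- cost - outer(u, v, "+")

# Dual feasibility: no reduced cost is negative.
feasible <- all(reduced >= 0)

# Complementary slackness: reduced cost is zero on every assigned pair.
tight <- all(reduced[cbind(1:3, match)] == 0)

# Strong duality: the dual objective equals the primal total cost.
primal <- sum(cost[cbind(1:3, match)])
dual   <- sum(u) + sum(v)
```

When all three checks pass, the assignment is provably optimal, because no other assignment can cost less than the dual objective.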


Generic Augment Function

Description

S3 generic for augmenting model results with original data.

Usage

augment(x, ...)

Arguments

x

An object to augment

...

Additional arguments passed to methods

Value

Augmented data (depends on method)


Augment Matching Results with Original Data (broom-style)

Description

S3 method for augmenting matching results following the broom package conventions. This is a thin wrapper around join_matched() with sensible defaults for quick exploration.

Usage

## S3 method for class 'matching_result'
augment(x, left, right, ...)

Arguments

x

A matching_result object

left

The original left dataset

right

The original right dataset

...

Additional arguments passed to join_matched()

Details

This method follows the augment() convention from the broom package, making it easy to integrate couplr into tidymodels workflows. It's equivalent to calling join_matched() with default parameters.

If the broom package is not loaded, you can use couplr::augment() to access this function.

Value

A tibble with matched pairs and original data (see join_matched())

Examples

left <- data.frame(
  id = 1:5,
  treatment = 1,
  age = c(25, 30, 35, 40, 45)
)

right <- data.frame(
  id = 6:10,
  treatment = 0,
  age = c(24, 29, 36, 41, 44)
)

result <- match_couples(left, right, vars = "age")
couplr::augment(result, left, right)


Automatically encode categorical variables

Description

Converts categorical variables to numeric representations suitable for matching. Currently supports binary variables (0/1) and ordered factors.

Usage

auto_encode_categorical(left, right, var)

Arguments

left

Data frame of left units

right

Data frame of right units

var

Variable name to encode

Value

List with encoded left and right columns, plus encoding metadata
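
A hedged sketch of what such an encoding might look like in base R. The helper encode_sketch() is hypothetical, and the specific choices (integer level codes for ordered factors, 0/1 by sorted label for binary variables) are assumptions rather than documented couplr behaviour.

```r
# Hypothetical encoder for one variable shared by left and right.
encode_sketch <- function(left_col, right_col) {
  if (is.ordered(left_col)) {
    # Ordered factor: use the position of each level as its numeric value.
    lev <- levels(left_col)
    to_num <- function(x) as.integer(factor(as.character(x), levels = lev))
  } else {
    # Binary variable: map the two observed labels to 0 and 1.
    lev <- sort(unique(c(as.character(left_col), as.character(right_col))))
    if (length(lev) != 2) stop("only binary or ordered variables are handled here")
    to_num <- function(x) as.integer(as.character(x) == lev[2])
  }
  list(left = to_num(left_col), right = to_num(right_col), levels = lev)
}

size <- factor(c("small", "large", "medium"),
               levels = c("small", "medium", "large"), ordered = TRUE)
enc_ordered <- encode_sketch(size, size[c(2, 2)])

enc_binary <- encode_sketch(c("no", "yes", "no"), c("yes", "yes"))
```

Sharing one set of levels between the two sides is the important part: the same label must map to the same number in left and right.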


Balance Diagnostics for Matched Pairs

Description

Computes comprehensive balance statistics comparing the distribution of matching variables between left and right units in the matched sample.

Usage

balance_diagnostics(
  result,
  left,
  right,
  vars = NULL,
  left_id = "id",
  right_id = "id"
)

Arguments

result

A matching result object from match_couples() or greedy_couples()

left

Data frame of left units

right

Data frame of right units

vars

Character vector of variable names to check balance for. Defaults to the variables used in matching (if available in result).

left_id

Character, name of ID column in left data (default: "id")

right_id

Character, name of ID column in right data (default: "id")

Details

This function computes several balance metrics:

Standardized Difference: The difference in means divided by the pooled standard deviation. Absolute values below 0.1 indicate excellent balance; values from 0.1 to 0.25 indicate good balance.

Variance Ratio: The ratio of standard deviations (left/right). Values close to 1 are ideal.

KS Statistic: Kolmogorov-Smirnov test statistic comparing distributions. Lower values indicate more similar distributions.

Overall Metrics include mean absolute standardized difference across all variables, proportion of variables with large imbalance (|std diff| > 0.25), and maximum standardized difference.
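
The per-variable metrics above can be approximated in base R for a single variable. The helper balance_sketch() is hypothetical; in particular, the pooled standard deviation is taken here as sqrt((var_left + var_right) / 2), which is one common convention and an assumption about the package's formula.

```r
# Hypothetical per-variable balance summary.
balance_sketch <- function(left_vals, right_vals) {
  pooled_sd <- sqrt((var(left_vals) + var(right_vals)) / 2)
  list(
    std_diff  = (mean(left_vals) - mean(right_vals)) / pooled_sd,
    var_ratio = sd(left_vals) / sd(right_vals),   # ratio of SDs, as described above
    ks        = unname(suppressWarnings(ks.test(left_vals, right_vals))$statistic)
  )
}

left_age  <- c(25, 30, 35, 40, 45)
right_age <- c(24, 29, 36, 41, 44)

b <- balance_sketch(left_age, right_age)

# Flag the variable using the thresholds quoted above.
balance_label <- if (abs(b$std_diff) < 0.1) {
  "excellent"
} else if (abs(b$std_diff) <= 0.25) {
  "good"
} else {
  "large imbalance"
}
```

For these two age vectors the means differ by only 0.2 years, so the standardized difference falls well inside the "excellent" band.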

Value

An S3 object of class balance_diagnostics containing:

var_stats

Tibble with per-variable balance statistics

overall

List with overall balance metrics

pairs

Tibble of matched pairs with variables

n_matched

Number of matched pairs

n_unmatched_left

Number of unmatched left units

n_unmatched_right

Number of unmatched right units

method

Matching method used

has_blocks

Whether blocking was used

block_stats

Per-block statistics (if blocking used)

Examples

# Create sample data
set.seed(123)
left <- data.frame(
  id = 1:10,
  age = rnorm(10, 45, 10),
  income = rnorm(10, 50000, 15000)
)
right <- data.frame(
  id = 11:30,
  age = rnorm(20, 47, 10),
  income = rnorm(20, 52000, 15000)
)

# Match
result <- match_couples(left, right, vars = c("age", "income"))

# Get balance diagnostics
balance <- balance_diagnostics(result, left, right, vars = c("age", "income"))
print(balance)

# Get balance table
balance_table(balance)


Create Balance Table

Description

Formats balance diagnostics into a clean table for display or export.

Usage

balance_table(balance, digits = 3)

Arguments

balance

A balance_diagnostics object from balance_diagnostics()

digits

Number of decimal places for rounding (default: 3)

Value

A tibble with formatted balance statistics


Solve the Bottleneck Assignment Problem

Description

Finds an assignment that minimizes (or maximizes) the maximum edge cost in a perfect matching. Unlike standard LAP which minimizes the sum of costs, BAP minimizes the maximum (bottleneck) cost.

Usage

bottleneck_assignment(cost, maximize = FALSE)

Arguments

cost

Numeric matrix; rows = tasks, columns = agents. NA or Inf entries are treated as forbidden assignments.

maximize

Logical; if TRUE, maximizes the minimum edge cost instead of minimizing the maximum (maximin objective). Default is FALSE (minimax).

Details

The Bottleneck Assignment Problem (BAP) is a variant of the Linear Assignment Problem where instead of minimizing the sum of assignment costs, we minimize the maximum cost among all assignments (minimax objective).

Algorithm: Uses binary search on the sorted unique costs combined with 'Hopcroft-Karp' bipartite matching to find the minimum threshold that allows a perfect matching.

Complexity: O(E * sqrt(V) * log(unique costs)) where E = edges, V = vertices.
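
The threshold search can be sketched in base R. To keep the example short, a simple augmenting-path matcher (Kuhn's algorithm) is swapped in for Hopcroft-Karp, so this illustrates the idea rather than the package's implementation; both helper names are hypothetical.

```r
# Does the bipartite graph given by a logical matrix match every row?
has_perfect_matching <- function(allowed) {
  col_owner <- integer(ncol(allowed))     # 0 = column still free
  visited   <- logical(ncol(allowed))
  try_row <- function(i) {
    for (j in which(allowed[i, ])) {
      if (!visited[j]) {
        visited[j] <<- TRUE
        # Take a free column, or re-route the row that currently owns it.
        if (col_owner[j] == 0L || try_row(col_owner[j])) {
          col_owner[j] <<- i
          return(TRUE)
        }
      }
    }
    FALSE
  }
  for (i in seq_len(nrow(allowed))) {
    visited[] <- FALSE
    if (!try_row(i)) return(FALSE)
  }
  TRUE
}

# Binary search for the smallest threshold that still allows a matching.
bottleneck_sketch <- function(cost) {
  vals <- sort(unique(cost[is.finite(cost)]))
  if (length(vals) == 0) stop("no finite costs")
  feasible <- function(t) has_perfect_matching(!is.na(cost) & cost <= t)
  if (!feasible(vals[length(vals)])) stop("no perfect matching exists")
  lo <- 1L
  hi <- length(vals)
  while (lo < hi) {
    mid <- (lo + hi) %/% 2L
    if (feasible(vals[mid])) hi <- mid else lo <- mid + 1L
  }
  vals[lo]
}

cost <- matrix(c(1, 5, 3,
                 2, 4, 6,
                 7, 1, 2), nrow = 3, byrow = TRUE)
bottleneck_sketch(cost)   # 3
```

Binary search works because feasibility is monotone: if a perfect matching exists using only edges of cost at most t, it also exists for any larger threshold.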

Applications:

Value

A list with class "bottleneck_result" whose elements include bottleneck, the bottleneck value of the optimal assignment.

See Also

assignment() for standard LAP (sum objective), lap_solve() for tidy LAP interface

Examples

# Simple example: minimize max cost
cost <- matrix(c(1, 5, 3,
                 2, 4, 6,
                 7, 1, 2), nrow = 3, byrow = TRUE)
result <- bottleneck_assignment(cost)
result$bottleneck  # Maximum edge cost in optimal assignment

# Maximize minimum (fair allocation)
profits <- matrix(c(10, 5, 8,
                    6, 12, 4,
                    3, 7, 11), nrow = 3, byrow = TRUE)
result <- bottleneck_assignment(profits, maximize = TRUE)
result$bottleneck  # Minimum profit among all assignments

# With forbidden assignments
cost <- matrix(c(1, NA, 3,
                 2, 4, Inf,
                 5, 1, 2), nrow = 3, byrow = TRUE)
result <- bottleneck_assignment(cost)


Build cost matrix for matching

Description

This is the main entry point for distance computation.

Usage

build_cost_matrix(
  left,
  right,
  vars,
  distance = "euclidean",
  weights = NULL,
  scale = FALSE
)

Value

Numeric matrix of distances with optional scaling/weights applied.


Calculate Variable-Level Balance Statistics

Description

Calculate Variable-Level Balance Statistics

Usage

calculate_var_balance(left_vals, right_vals, var_name)

Arguments

left_vals

Numeric vector of values from left group

right_vals

Numeric vector of values from right group

var_name

Character, name of the variable

Value

List with balance statistics for this variable


Check if parallel processing is available

Description

Check if parallel processing is available

Usage

can_parallelize()

Value

Logical indicating if future package is available


Check cost distribution for problems

Description

Examines the distance matrix for common issues and provides helpful warnings.

Usage

check_cost_distribution(cost_matrix, threshold_zero = 1e-10, warn = TRUE)

Arguments

cost_matrix

Numeric matrix of distances

threshold_zero

Threshold for considering distance "zero" (default: 1e-10)

warn

If TRUE, issue warnings for problems found

Value

List with diagnostic information


Check if full matching was achieved

Description

Check if full matching was achieved

Usage

check_full_matching(result)

Value

No return value; throws error if unmatched units exist.


Check variable health for matching

Description

Analyzes variables for common problems that can affect matching quality: constant columns, high missingness, extreme skewness, and outliers.

Usage

check_variable_health(
  left,
  right,
  vars,
  high_missingness_threshold = 0.5,
  low_variance_threshold = 1e-06
)

Arguments

left

Data frame of left units

right

Data frame of right units

vars

Character vector of variable names to check

high_missingness_threshold

Threshold for high missingness warning (default: 0.5)

low_variance_threshold

Threshold for nearly-constant variables (default: 1e-6)
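
To make the two thresholds concrete, here is a base R sketch for a single numeric vector. The helper health_sketch() is hypothetical, and whether the package compares strictly or inclusively against each threshold is an assumption.

```r
# Hypothetical single-variable health check using the two thresholds.
health_sketch <- function(x,
                          high_missingness_threshold = 0.5,
                          low_variance_threshold = 1e-06) {
  missing_frac <- mean(is.na(x))
  observed     <- x[!is.na(x)]
  # With fewer than two observed values the variance is treated as zero.
  variance     <- if (length(observed) > 1) var(observed) else 0
  list(
    missing_frac     = missing_frac,
    high_missingness = missing_frac > high_missingness_threshold,
    low_variance     = variance < low_variance_threshold
  )
}

mostly_missing <- health_sketch(c(1, NA, NA, NA))   # 75% missing
constant       <- health_sketch(rep(3, 10))         # zero variance
healthy        <- health_sketch(c(1, 2, 3, 4))
```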

Value

A list with class "variable_health" containing:


Compute pairwise distance matrix

Description

Compute pairwise distance matrix

Usage

compute_distance_matrix(left_mat, right_mat, distance = "euclidean")

Value

Numeric matrix of pairwise distances (n_left x n_right).
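
For the default "euclidean" metric, an n_left x n_right distance matrix can be computed in base R as sketched below. The helper pairwise_euclidean() is hypothetical and only illustrates the shape of the result; the package's compiled implementation may differ.

```r
# Hypothetical helper: Euclidean distance between every left and right row.
pairwise_euclidean <- function(left_mat, right_mat) {
  # |a - b|^2 = |a|^2 + |b|^2 - 2 * a.b, computed for all pairs at once.
  sq <- outer(rowSums(left_mat^2), rowSums(right_mat^2), "+") -
    2 * tcrossprod(left_mat, right_mat)
  sqrt(pmax(sq, 0))   # clamp tiny negative values caused by rounding
}

left_mat  <- rbind(c(0, 0), c(3, 4))
right_mat <- rbind(c(0, 0), c(6, 8), c(3, 0))

d <- pairwise_euclidean(left_mat, right_mat)
dim(d)   # 2 3
```

Rows of the result correspond to left units and columns to right units, which matches the rows = tasks, columns = agents layout that assignment() expects.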


Compute and Cache Distance Matrix for Reuse

Description

Precomputes a distance matrix between left and right datasets, allowing it to be reused across multiple matching operations with different constraints. This is particularly useful when exploring different matching parameters (max_distance, calipers, methods) without recomputing distances.

Usage

compute_distances(
  left,
  right,
  vars,
  distance = "euclidean",
  weights = NULL,
  scale = FALSE,
  auto_scale = FALSE,
  left_id = "id",
  right_id = "id",
  block_id = NULL
)

Arguments

left

Left dataset (data frame)

right

Right dataset (data frame)

vars

Character vector of variable names to use for distance computation

distance

Distance metric (default: "euclidean")

weights

Optional numeric vector of variable weights

scale

Scaling method: FALSE, "standardize", "range", or "robust"

auto_scale

Apply automatic preprocessing (default: FALSE)

left_id

Name of ID column in left (default: "id")

right_id

Name of ID column in right (default: "id")

block_id

Optional block ID column name for blocked matching

Details

This function computes distances once and stores them in a reusable object. The resulting distance_object can be passed to match_couples() or greedy_couples() instead of providing datasets and variables.

Benefits:

The distance_object stores the original datasets, allowing downstream functions like join_matched() to work seamlessly.

Value

An S3 object of class "distance_object" containing:

Examples

# Compute distances once
left <- data.frame(id = 1:5, age = c(25, 30, 35, 40, 45), income = c(45, 52, 48, 61, 55) * 1000)
right <- data.frame(id = 6:10, age = c(24, 29, 36, 41, 44), income = c(46, 51, 47, 60, 54) * 1000)

dist_obj <- compute_distances(
  left, right,
  vars = c("age", "income"),
  scale = "standardize"
)

# Reuse for different matching strategies
result1 <- match_couples(dist_obj, max_distance = 0.5)
result2 <- match_couples(dist_obj, max_distance = 1.0)
result3 <- greedy_couples(dist_obj, strategy = "sorted")

# All use the same precomputed distances


Count valid pairs in cost matrix

Description

Count valid pairs in cost matrix

Usage

count_valid_pairs(cost_matrix)

Value

Integer count of valid (non-forbidden) pairs.


Get a themed emoji

Description

Get a themed emoji

Usage

couplr_emoji(
  type = c("error", "warning", "info", "success", "heart", "broken", "sparkles",
    "search", "chart", "warning_sign", "stop", "check")
)

Value

Character string with the emoji (or empty string if emoji disabled).


Info message with emoji

Description

Info message with emoji

Usage

couplr_inform(...)

Value

No return value, called for side effects (issues a message).


Couplr message helpers with emoji and humor

Description

Light, fun error/warning messages inspired by testthat, themed around coupling and matching. Makes errors less intimidating and more memorable.


Stop with a fun, themed error message

Description

Stop with a fun, themed error message

Usage

couplr_stop(..., call. = FALSE)

Value

No return value, throws an error.


Success message with emoji

Description

Success message with emoji

Usage

couplr_success(...)

Value

No return value, called for side effects (issues a message).


Warn with a fun, themed warning message

Description

Warn with a fun, themed warning message

Usage

couplr_warn(..., call. = FALSE)

Value

No return value, called for side effects (issues a warning).


Detect and validate blocking

Description

Detect and validate blocking

Usage

detect_blocking(left, right, block_id, ignore_blocks)

Value

List with use_blocking (logical) and block_col (character or NULL).


Diagnose distance matrix and suggest fixes

Description

Comprehensive diagnostics for a distance matrix with actionable suggestions.

Usage

diagnose_distance_matrix(
  cost_matrix,
  left = NULL,
  right = NULL,
  vars = NULL,
  warn = TRUE
)

Arguments

cost_matrix

Numeric matrix of distances

left

Left dataset (for variable checking)

right

Right dataset (for variable checking)

vars

Variables used for matching

warn

If TRUE, issue warnings

Value

List with diagnostic results and suggestions


Invalid parameter error

Description

Invalid parameter error

Usage

err_invalid_param(param, value, expected)

Value

No return value, throws an error.


Missing data error

Description

Missing data error

Usage

err_missing_data(dataset = "left")

Value

No return value, throws an error.


Missing variables error

Description

Missing variables error

Usage

err_missing_vars(vars, dataset = "left")

Value

No return value, throws an error.


All pairs forbidden error

Description

All pairs forbidden error

Usage

err_no_valid_pairs(reason = NULL)

Value

No return value, throws an error.


Example cost matrices for assignment problems

Description

Small example datasets for demonstrating couplr functionality across different assignment problem types: square, rectangular, sparse, and binary.

Usage

example_costs

Format

A list containing four example cost matrices:

simple_3x3

A 3x3 cost matrix with costs ranging from 2-7. Optimal assignment: row 1 -> col 2 (cost 2), row 2 -> col 1 (cost 3), row 3 -> col 3 (cost 4). Total optimal cost: 9.

rectangular_3x5

A 3x5 rectangular cost matrix demonstrating assignment when rows < columns. Each of 3 rows is assigned to one of 5 columns; 2 columns remain unassigned. Costs range 1-6.

sparse_with_na

A 3x3 matrix with NA values indicating forbidden assignments. Use this to test algorithms' handling of constraints. Positions (1,3), (2,2), and (3,1) are forbidden.

binary_costs

A 3x3 matrix with binary (0/1) costs, suitable for testing the HK01 algorithm. Diagonal entries are 0 (preferred), off-diagonal entries are 1 (penalty).

Details

These matrices are designed to test different aspects of LAP solvers:

simple_3x3: Basic functionality test. Any correct solver should find total cost = 9.

rectangular_3x5: Tests handling of non-square problems. The optimal solution assigns all 3 rows with minimum total cost.

sparse_with_na: Tests constraint handling. Algorithms must avoid NA positions while finding an optimal assignment among valid entries.

binary_costs: Tests specialized binary cost algorithms. The optimal assignment uses all diagonal entries (total cost = 0).

See Also

lap_solve, example_df

Examples

# Simple 3x3 assignment
result <- lap_solve(example_costs$simple_3x3)
print(result)
# Optimal: sources 1,2,3 -> targets 2,1,3 with cost 9

# Rectangular problem (3 sources, 5 targets)
result <- lap_solve(example_costs$rectangular_3x5)
print(result)
# All 3 sources assigned; 2 targets unassigned

# Sparse problem with forbidden assignments
result <- lap_solve(example_costs$sparse_with_na)
print(result)
# Avoids NA positions

# Binary costs - test HK01 algorithm
result <- lap_solve(example_costs$binary_costs, method = "hk01")
print(result)
# Finds diagonal assignment (cost = 0)


Example assignment problem data frame

Description

A tidy data frame representation of assignment problems, suitable for use with grouped workflows and batch solving. Contains two independent 3x3 assignment problems in long format.

Usage

example_df

Format

A tibble with 18 rows and 4 columns:

sim

Simulation/problem identifier. Integer with values 1 or 2, distinguishing two independent assignment problems. Use with group_by(sim) for grouped solving.

source

Source node index. Integer 1-3 representing the row (source) in each 3x3 cost matrix.

target

Target node index. Integer 1-3 representing the column (target) in each 3x3 cost matrix.

cost

Cost of assigning source to target. Numeric values ranging from 1-7. Each source-target pair has exactly one cost entry.

Details

This dataset demonstrates couplr's data frame interface for LAP solving. The long format (one row per source-target pair) is converted internally to a cost matrix for solving.

Simulation 1: Costs from example_costs$simple_3x3

Simulation 2: Different cost structure
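
The long-to-matrix conversion mentioned above can be sketched in base R. The data frame below is built by hand with the 3x3 costs from the assignment() example; it is illustrative and is not read from example_df, and the package's internal conversion may differ.

```r
# One row per source-target pair, as in the long format described above.
long <- data.frame(
  source = rep(1:3, each = 3),
  target = rep(1:3, times = 3),
  cost   = c(4, 2, 5, 3, 3, 6, 7, 5, 4)
)

# Start from an all-NA matrix so that pairs absent from the long table
# remain NA, the value that marks a forbidden assignment.
cost <- matrix(NA_real_, nrow = max(long$source), ncol = max(long$target))
cost[cbind(long$source, long$target)] <- long$cost
cost
```

With grouped data the same step would be applied once per group, giving one cost matrix per problem.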

See Also

lap_solve, lap_solve_batch, example_costs

Examples

library(dplyr)

# Solve both problems with grouped workflow
example_df |>
  group_by(sim) |>
  lap_solve(source, target, cost)

# Batch solving for efficiency
example_df |>
  group_by(sim) |>
  lap_solve_batch(source, target, cost)

# Inspect the data structure
example_df |>
  group_by(sim) |>
  summarise(
    n_pairs = n(),
    min_cost = min(cost),
    max_cost = max(cost)
  )


Extract and standardize IDs from data frames

Description

Extract and standardize IDs from data frames

Usage

extract_ids(df, prefix = "id")

Value

Character vector of IDs.


Extract matching variables from data frame

Description

Extract matching variables from data frame

Usage

extract_matching_vars(df, vars)

Value

Numeric matrix of matching variables.


Filter blocks based on size and balance criteria

Description

Filter blocks based on size and balance criteria

Usage

filter_blocks(
  left,
  right,
  min_left,
  min_right,
  drop_imbalanced,
  imbalance_threshold
)

Value

List with filtered left/right data frames and dropped block info.


Standardize block ID column name

Description

Standardize block ID column name

Usage

get_block_id_column(df)

Value

Character string with column name, or NULL if not found.


Extract method used from assignment result

Description

Extract method used from assignment result

Usage

get_method_used(x)

Arguments

x

An assignment result object

Value

Character string indicating method used


Extract total cost from assignment result

Description

Extract total cost from assignment result

Usage

get_total_cost(x)

Arguments

x

An assignment result object

Value

Numeric total cost


Greedy match blocks in parallel

Description

Greedy match blocks in parallel

Usage

greedy_blocks_parallel(
  blocks,
  left,
  right,
  left_ids,
  right_ids,
  block_col,
  vars,
  distance,
  weights,
  scale,
  max_distance,
  calipers,
  strategy,
  parallel = FALSE
)

Arguments

blocks

Vector of block IDs

left

Left dataset with block_col

right

Right dataset with block_col

left_ids

IDs from left

right_ids

IDs from right

block_col

Name of blocking column

vars

Variables for matching

distance

Distance metric

weights

Variable weights

scale

Scaling method

max_distance

Maximum distance

calipers

Caliper constraints

strategy

Greedy strategy

parallel

Whether to use parallel processing

Value

List with combined results from all blocks


Fast approximate matching using greedy algorithm

Description

Performs fast one-to-one matching using greedy strategies. Does not guarantee optimal total distance but is much faster than match_couples() for large datasets. Supports blocking, distance constraints, and various distance metrics.

Usage

greedy_couples(
  left,
  right = NULL,
  vars = NULL,
  distance = "euclidean",
  weights = NULL,
  scale = FALSE,
  auto_scale = FALSE,
  max_distance = Inf,
  calipers = NULL,
  block_id = NULL,
  ignore_blocks = FALSE,
  require_full_matching = FALSE,
  strategy = c("row_best", "sorted", "pq"),
  return_unmatched = TRUE,
  return_diagnostics = FALSE,
  parallel = FALSE,
  check_costs = TRUE
)

Arguments

left

Data frame of "left" units (e.g., treated, cases)

right

Data frame of "right" units (e.g., control, controls)

vars

Variable names to use for distance computation

distance

Distance metric: "euclidean", "manhattan", "mahalanobis", or a custom function

weights

Optional named vector of variable weights

scale

Scaling method: FALSE (none), "standardize", "range", or "robust"

auto_scale

If TRUE, automatically check variable health and select scaling method (default: FALSE)

max_distance

Maximum allowed distance (pairs exceeding this are forbidden)

calipers

Named list of per-variable maximum absolute differences

block_id

Column name containing block IDs (for stratified matching)

ignore_blocks

If TRUE, ignore block_id even if present

require_full_matching

If TRUE, error if any units remain unmatched

strategy

Greedy strategy:

  • "row_best": For each row, find best available column (default)

  • "sorted": Sort all pairs by distance, greedily assign

  • "pq": Use priority queue (good for very large problems)

return_unmatched

Include unmatched units in output

return_diagnostics

Include detailed diagnostics in output

parallel

Enable parallel processing for blocked matching. Requires 'future' and 'future.apply' packages. Can be:

  • FALSE: Sequential processing (default)

  • TRUE: Auto-configure parallel backend

  • Character: Specify future plan (e.g., "multisession", "multicore")

check_costs

If TRUE, check distance distribution for potential problems and provide helpful warnings before matching (default: TRUE)

Details

Greedy strategies do not guarantee optimal total distance but are much faster:

Use greedy_couples when:

Value

A list with class "matching_result" (same structure as match_couples)

Examples

# Basic greedy matching (seeded so the simulated data are reproducible)
set.seed(42)
left <- data.frame(id = 1:100, x = rnorm(100))
right <- data.frame(id = 101:200, x = rnorm(100))
result <- greedy_couples(left, right, vars = "x")

# Compare to optimal
result_opt <- match_couples(left, right, vars = "x")
result_greedy <- greedy_couples(left, right, vars = "x")
result_greedy$info$total_distance / result_opt$info$total_distance  # Quality ratio


Greedy matching with blocking

Description

Greedy matching with blocking

Usage

greedy_couples_blocked(
  left,
  right,
  left_ids,
  right_ids,
  block_col,
  vars,
  distance,
  weights,
  scale,
  max_distance,
  calipers,
  strategy,
  parallel = FALSE
)

Value

List with pairs tibble and matching info.


Greedy Matching from Precomputed Distance Object

Description

Internal function to handle greedy matching when a distance_object is provided

Usage

greedy_couples_from_distance(
  dist_obj,
  max_distance = Inf,
  calipers = NULL,
  ignore_blocks = FALSE,
  require_full_matching = FALSE,
  strategy = "row_best",
  return_unmatched = TRUE,
  return_diagnostics = FALSE
)

Value

A matching_result object with pairs, info, and optional diagnostics.


Greedy matching without blocking

Description

Greedy matching without blocking

Usage

greedy_couples_single(
  left,
  right,
  left_ids,
  right_ids,
  vars,
  distance,
  weights,
  scale,
  max_distance,
  calipers,
  strategy
)

Value

List with pairs tibble and matching info.


Re-export of dplyr::group_by

Description

Re-export of dplyr::group_by

Value

See group_by.


Check if data frame has blocking information

Description

Check if data frame has blocking information

Usage

has_blocks(df)

Value

Logical indicating whether data frame has block ID column.


Check if any valid pairs exist

Description

Check if any valid pairs exist

Usage

has_valid_pairs(cost_matrix)

Value

Logical indicating whether any valid pairs exist.


Hospital staff scheduling example dataset

Description

A comprehensive example dataset for demonstrating couplr functionality across vignettes. Contains hospital staff scheduling data with nurses, shifts, costs, and preference scores suitable for assignment problems, as well as nurse characteristics for matching workflows.

Usage

hospital_staff

Format

A list containing eight related datasets:

basic_costs

A 10x10 numeric cost matrix for assigning 10 nurses to 10 shifts. Values range from approximately 1-15, where lower values indicate better fit (less overtime, matches skills, respects preferences). Use with lap_solve() for basic assignment.

preferences

A 10x10 numeric preference matrix on a 0-10 scale, where higher values indicate stronger nurse preference for a shift. Use with lap_solve(..., maximize = TRUE) to optimize preferences rather than minimize costs.

schedule_df

A tibble with 100 rows (10 nurses x 10 shifts) in long format for data frame workflows:

nurse_id

Integer 1-10. Unique identifier for each nurse.

shift_id

Integer 1-10. Unique identifier for each shift.

cost

Numeric. Assignment cost (same values as basic_costs).

preference

Numeric 0-10. Nurse preference score.

skill_match

Integer 0/1. Binary indicator: 1 if nurse skills match shift requirements, 0 otherwise.

nurses

A tibble with 10 rows describing nurse characteristics:

nurse_id

Integer 1-10. Links to schedule_df and basic_costs rows.

experience_years

Numeric 1-20. Years of nursing experience.

department

Character. Primary department: "ICU", "ER", "General", or "Pediatrics".

shift_preference

Character. Preferred shift type: "day", "evening", or "night".

certification_level

Integer 1-3. Certification level where 3 is highest (e.g., 1=RN, 2=BSN, 3=MSN).

shifts

A tibble with 10 rows describing shift requirements:

shift_id

Integer 1-10. Links to schedule_df and basic_costs cols.

department

Character. Department needing coverage.

shift_type

Character. Shift type: "day", "evening", or "night".

min_experience

Numeric. Minimum years of experience required.

min_certification

Integer 1-3. Minimum certification level.

weekly_df

A tibble for batch solving with 500 rows (5 days x 10 nurses x 10 shifts):

day

Character. Day of week: "Mon", "Tue", "Wed", "Thu", "Fri".

nurse_id

Integer 1-10. Nurse identifier.

shift_id

Integer 1-10. Shift identifier.

cost

Numeric. Daily assignment cost (varies by day).

preference

Numeric 0-10. Daily preference score.

Use with group_by(day) for solving each day's schedule.

nurses_extended

A tibble with 200 nurses for matching examples, representing a treatment group (e.g., full-time nurses):

nurse_id

Integer 1-200. Unique identifier.

age

Numeric 22-65. Nurse age in years.

experience_years

Numeric 0-40. Years of nursing experience.

hourly_rate

Numeric 25-75. Hourly wage in dollars.

department

Character. Primary department assignment.

certification_level

Integer 1-3. Certification level.

is_fulltime

Logical. TRUE for full-time status.

controls_extended

A tibble with 300 potential control nurses (e.g., part-time or registry nurses) for matching. Same structure as nurses_extended. Designed to have systematic differences from nurses_extended (older, less experience on average) to demonstrate matching's ability to create comparable groups.

Details

This dataset is used throughout the couplr documentation to provide a consistent, realistic example that evolves in complexity. It supports three use cases: (1) basic LAP solving with cost matrices, (2) batch solving across multiple days, and (3) matching workflows comparing nurse groups.

The dataset is designed to demonstrate progressively complex scenarios:

Basic LAP (vignette("getting-started")):

Algorithm comparison (vignette("algorithms")):

Matching workflows (vignette("matching-workflows")):

See Also

lap_solve for basic assignment solving, lap_solve_batch for batch solving, match_couples for matching workflows, vignette("getting-started") for an introductory tutorial

Examples

# Basic assignment: assign nurses to shifts minimizing cost
lap_solve(hospital_staff$basic_costs)

# Maximize preferences instead
lap_solve(hospital_staff$preferences, maximize = TRUE)

# Data frame workflow
library(dplyr)
hospital_staff$schedule_df |>
  lap_solve(nurse_id, shift_id, cost)

# Batch solve weekly schedule
hospital_staff$weekly_df |>
  group_by(day) |>
  lap_solve(nurse_id, shift_id, cost)

# Matching workflow: match full-time to part-time nurses
match_couples(
  left = hospital_staff$nurses_extended,
  right = hospital_staff$controls_extended,
  vars = c("age", "experience_years", "certification_level"),
  auto_scale = TRUE
)


Low match rate info

Description

Low match rate info

Usage

info_low_match_rate(n_matched, n_left, pct)

Value

No return value, called for side effects (issues a message or warning).


Check if Object is a Distance Object

Description

Check if Object is a Distance Object

Usage

is_distance_object(x)

Arguments

x

Object to check

Value

Logical: TRUE if x is a distance_object

Examples

left <- data.frame(id = 1:3, x = c(1, 2, 3))
right <- data.frame(id = 4:6, x = c(1.1, 2.1, 3.1))
dist_obj <- compute_distances(left, right, vars = "x")
is_distance_object(dist_obj)  # TRUE
is_distance_object(list())    # FALSE


Check if object is a batch assignment result

Description

Check if object is a batch assignment result

Usage

is_lap_solve_batch_result(x)

Arguments

x

Object to test

Value

Logical indicating if x is a batch assignment result


Check if object is a k-best assignment result

Description

Check if object is a k-best assignment result

Usage

is_lap_solve_kbest_result(x)

Arguments

x

Object to test

Value

Logical indicating if x is a k-best assignment result


Check if object is an assignment result

Description

Check if object is an assignment result

Usage

is_lap_solve_result(x)

Arguments

x

Object to test

Value

Logical indicating if x is an assignment result


Join Matched Pairs with Original Data

Description

Creates an analysis-ready dataset by joining matched pairs with variables from the original left and right datasets. This eliminates the need for manual joins and provides a convenient format for downstream analysis.

Usage

join_matched(
  result,
  left,
  right,
  left_vars = NULL,
  right_vars = NULL,
  left_id = "id",
  right_id = "id",
  suffix = c("_left", "_right"),
  include_distance = TRUE,
  include_pair_id = TRUE,
  include_block_id = TRUE
)

Arguments

result

A matching_result object from match_couples() or greedy_couples()

left

The original left dataset

right

The original right dataset

left_vars

Character vector of variable names to include from left. If NULL (default), includes all variables except the ID column.

right_vars

Character vector of variable names to include from right. If NULL (default), includes all variables except the ID column.

left_id

Name of the ID column in left dataset (default: "id")

right_id

Name of the ID column in right dataset (default: "id")

suffix

Character vector of length 2 specifying suffixes for left and right variables (default: c("_left", "_right"))

include_distance

Include the matching distance in output (default: TRUE)

include_pair_id

Include pair_id column (default: TRUE)

include_block_id

Include block_id if blocking was used (default: TRUE)

Details

This function simplifies the common workflow of joining matched pairs with original data. Instead of manually merging result$pairs with left and right datasets, join_matched() handles the joins automatically and applies consistent naming conventions.

When variables appear in both left and right datasets, suffixes are appended to distinguish them (e.g., "age_left" and "age_right"). This makes it easy to compute differences or use both values in models.
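
For comparison, the manual workflow that join_matched() replaces might look like the base-R sketch below. The pairs table and its column names (left_id, right_id, distance) are assumptions made for illustration, and add_suffix is a hypothetical helper.

```r
left  <- data.frame(id = 1:3, age = c(25, 30, 35))
right <- data.frame(id = 4:6, age = c(24, 31, 36))

# Hypothetical pairs table; the column names are assumptions for illustration.
pairs <- data.frame(left_id = 1:3, right_id = 4:6, distance = c(1, 1, 1))

# Rename all non-ID columns with a suffix so left and right stay distinguishable.
add_suffix <- function(df, id, suffix) {
  keep <- setdiff(names(df), id)
  names(df)[match(keep, names(df))] <- paste0(keep, suffix)
  df
}
left_s  <- add_suffix(left,  "id", "_left")
right_s <- add_suffix(right, "id", "_right")

# Two joins: pairs -> left attributes, then -> right attributes.
joined <- merge(pairs,  left_s,  by.x = "left_id",  by.y = "id")
joined <- merge(joined, right_s, by.x = "right_id", by.y = "id")

# Suffixed columns make within-pair differences easy to compute.
joined$age_diff <- joined$age_left - joined$age_right
joined
```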

Value

A tibble with one row per matched pair, containing:

Examples

# Basic usage
left <- data.frame(
  id = 1:5,
  treatment = 1,
  age = c(25, 30, 35, 40, 45),
  income = c(45000, 52000, 48000, 61000, 55000)
)

right <- data.frame(
  id = 6:10,
  treatment = 0,
  age = c(24, 29, 36, 41, 44),
  income = c(46000, 51500, 47500, 60000, 54000)
)

result <- match_couples(left, right, vars = c("age", "income"))
matched_data <- join_matched(result, left, right)
head(matched_data)

# Specify which variables to include
matched_data <- join_matched(
  result, left, right,
  left_vars = c("treatment", "age", "income"),
  right_vars = c("age", "income"),
  suffix = c("_treated", "_control")
)

# Without distance or pair_id
matched_data <- join_matched(
  result, left, right,
  include_distance = FALSE,
  include_pair_id = FALSE
)


Solve linear assignment problems

Description

Provides a tidy interface for solving the linear assignment problem with the 'Hungarian', 'Jonker-Volgenant', and other algorithms (see method). Supports rectangular matrices, NA/Inf masking, and data frame inputs.

Usage

lap_solve(
  x,
  source = NULL,
  target = NULL,
  cost = NULL,
  maximize = FALSE,
  method = "auto",
  forbidden = NA
)

Arguments

x

Cost matrix, data frame, or tibble. If a data frame/tibble, must include columns specified by source, target, and cost.

source

Column name for source/row indices (if x is a data frame)

target

Column name for target/column indices (if x is a data frame)

cost

Column name for costs (if x is a data frame)

maximize

Logical; if TRUE, maximizes total cost instead of minimizing (default: FALSE)

method

Algorithm to use. One of:

  • "auto" (default): automatically selects best algorithm

  • "jv": 'Jonker-Volgenant' algorithm (general purpose, fast)

  • "hungarian": Classic 'Hungarian' algorithm

  • "auction": 'Bertsekas' auction algorithm (good for large dense problems)

  • "sap": Sparse assignment (good for sparse/rectangular problems)

  • "hk01": 'Hopcroft-Karp' for binary/uniform costs

forbidden

Value to mark forbidden assignments (default: NA). Can also use Inf.
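
The effect of maximize and forbidden can be checked on a tiny problem by brute force. The sketch below is only an illustration under stated assumptions: solve_tiny is a hypothetical helper that enumerates all 3x3 assignments, it handles NA as the forbidden marker only, and it says nothing about how lap_solve() is implemented.

```r
# Tiny brute-force solver used only for illustration (hypothetical helper).
solve_tiny <- function(cost, maximize = FALSE) {
  perms <- list(c(1, 2, 3), c(1, 3, 2), c(2, 1, 3),
                c(2, 3, 1), c(3, 1, 2), c(3, 2, 1))
  work <- if (maximize) -cost else cost   # maximizing = minimizing the negation
  work[is.na(work)] <- Inf                # NA marks a forbidden pair
  totals <- vapply(perms, function(p) sum(work[cbind(1:3, p)]), numeric(1))
  best <- perms[[which.min(totals)]]      # row i is assigned to column best[i]
  list(assignment = best, total_cost = sum(cost[cbind(1:3, best)]))
}

cost <- matrix(c(4, 2, 5, 3, 3, 6, 7, 5, 4), nrow = 3)
unconstrained <- solve_tiny(cost)
maximized     <- solve_tiny(cost, maximize = TRUE)

cost[1, 2] <- NA                          # forbid row 1 -> column 2
constrained <- solve_tiny(cost)
```

Forbidding a pair that the unconstrained optimum uses forces the solver onto the next-best feasible assignment, so the total cost can only stay the same or increase.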

Value

A tibble with columns:

Examples

# Matrix input
cost <- matrix(c(4, 2, 5, 3, 3, 6, 7, 5, 4), nrow = 3)
lap_solve(cost)

# Data frame input
library(dplyr)
df <- tibble(
  source = rep(1:3, each = 3),
  target = rep(1:3, times = 3),
  cost = c(4, 2, 5, 3, 3, 6, 7, 5, 4)
)
lap_solve(df, source, target, cost)

# With NA masking (forbidden assignments)
cost[1, 3] <- NA
lap_solve(cost)

# Grouped data frames
df <- tibble(
  sim = rep(1:2, each = 9),
  source = rep(1:3, times = 6),
  target = rep(1:3, each = 3, times = 2),
  cost = runif(18, 1, 10)
)
df |> group_by(sim) |> lap_solve(source, target, cost)


Solve multiple assignment problems efficiently

Description

Solve many independent assignment problems at once. Supports lists of matrices, 3D arrays, or grouped data frames. Optional parallel execution via n_threads.

Usage

lap_solve_batch(
  x,
  source = NULL,
  target = NULL,
  cost = NULL,
  maximize = FALSE,
  method = "auto",
  n_threads = 1,
  forbidden = NA
)

Arguments

x

One of: List of cost matrices, 3D array, or grouped data frame

source

Column name for source indices (if x is a grouped data frame)

target

Column name for target indices (if x is a grouped data frame)

cost

Column name for costs (if x is a grouped data frame)

maximize

Logical; if TRUE, maximizes total cost (default: FALSE)

method

Algorithm to use (default: "auto"). See lap_solve for options.

n_threads

Number of threads for parallel execution (default: 1). Set to NULL to use all available cores.

forbidden

Value to mark forbidden assignments (default: NA)

Value

A tibble with columns:

Examples

# List of matrices
costs <- list(
  matrix(c(1, 2, 3, 4), 2, 2),
  matrix(c(5, 6, 7, 8), 2, 2)
)
lap_solve_batch(costs)

# 3D array
arr <- array(runif(2 * 2 * 10), dim = c(2, 2, 10))
lap_solve_batch(arr)

# Grouped data frame
library(dplyr)
df <- tibble(
  sim = rep(1:5, each = 9),
  source = rep(1:3, times = 15),
  target = rep(1:3, each = 3, times = 5),
  cost = runif(45, 1, 10)
)
df |> group_by(sim) |> lap_solve_batch(source, target, cost)

# Parallel execution (requires n_threads > 1)
lap_solve_batch(costs, n_threads = 2)


Find k-best optimal assignments

Description

Returns the top k optimal (or near-optimal) assignments using the 'Murty' algorithm. Useful for exploring alternative optimal solutions or finding robust assignments.
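
To make "k best" concrete, the sketch below ranks every assignment of a 3x3 cost matrix by total cost. It is a brute-force enumeration, not the 'Murty' algorithm, and is only feasible for very small problems; the output column names are assumptions for illustration.

```r
# Brute-force ranking of all assignments of a 3x3 cost matrix.
# This is NOT Murty's algorithm; it only illustrates what "k best" means.
cost <- matrix(c(4, 2, 5, 3, 3, 6, 7, 5, 4), nrow = 3)
perms <- list(c(1, 2, 3), c(1, 3, 2), c(2, 1, 3),
              c(2, 3, 1), c(3, 1, 2), c(3, 2, 1))

# Total cost when row i is assigned to column p[i]
totals <- vapply(perms, function(p) sum(cost[cbind(1:3, p)]), numeric(1))
ranked <- order(totals)

k <- 3
kbest <- data.frame(
  rank       = seq_len(k),
  assignment = vapply(perms[ranked[1:k]], paste, character(1), collapse = "-"),
  total_cost = totals[ranked[1:k]]
)
kbest
```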

Usage

lap_solve_kbest(
  x,
  k = 3,
  source = NULL,
  target = NULL,
  cost = NULL,
  maximize = FALSE,
  method = "murty",
  single_method = "jv",
  forbidden = NA
)

Arguments

x

Cost matrix, data frame, or tibble. If a data frame/tibble, must include columns specified by source, target, and cost.

k

Number of best solutions to return (default: 3)

source

Column name for source/row indices (if x is a data frame)

target

Column name for target/column indices (if x is a data frame)

cost

Column name for costs (if x is a data frame)

maximize

Logical; if TRUE, finds k-best maximizing assignments (default: FALSE)

method

Algorithm used to enumerate the k best solutions (default: "murty"). Future versions may support additional methods.

single_method

Algorithm used for solving each node in the search tree (default: "jv")

forbidden

Value to mark forbidden assignments (default: NA)

Value

A tibble with columns:

Examples

# Matrix input - find 5 best solutions
cost <- matrix(c(4, 2, 5, 3, 3, 6, 7, 5, 4), nrow = 3)
lap_solve_kbest(cost, k = 5)

# Data frame input
library(dplyr)
df <- tibble(
  source = rep(1:3, each = 3),
  target = rep(1:3, times = 3),
  cost = c(4, 2, 5, 3, 3, 6, 7, 5, 4)
)
lap_solve_kbest(df, k = 3, source, target, cost)

# With maximization
lap_solve_kbest(cost, k = 3, maximize = TRUE)


Solve 1-D Line Assignment Problem

Description

Solves the linear assignment problem when both sources and targets are ordered points on a line. Uses O(n*m) dynamic programming for rectangular problems and O(n log n) sort-and-match for square problems.

Usage

lap_solve_line_metric(x, y, cost = "L1", maximize = FALSE)

Arguments

x

Numeric vector of source positions (will be sorted internally)

y

Numeric vector of target positions (will be sorted internally)

cost

Cost function for distance. Either:

  • "L1" (default): absolute distance ('Manhattan' distance)

  • "L2": squared distance (squared 'Euclidean' distance)

Can also use aliases: "abs", "manhattan" for L1; "sq", "squared", "quadratic" for L2

maximize

Logical; if TRUE, maximizes total cost instead of minimizing (default: FALSE)

Details

This is a specialized solver that exploits the structure of 1-dimensional assignment problems where costs depend only on the distance between points on a line. It is much faster than general LAP solvers for this special case.

The algorithm works as follows:

Square case (n == m): Both vectors are sorted and matched in order: x[1] -> y[1], x[2] -> y[2], etc. This is optimal for the supported L1 and L2 costs and, more generally, whenever the cost is a convex function of the distance between points.
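
The sort-and-match rule for the square case can be checked against brute force on a small input. This is an illustrative base-R sketch with the L1 cost, not the package's code.

```r
# Square case sketch: sort both vectors and pair them in order.
x <- c(5.1, 1.5, 3.2)
y <- c(3.0, 5.5, 2.0)
ox <- order(x)
oy <- order(y)
pairing <- data.frame(x_index = ox, y_index = oy,
                      cost = abs(x[ox] - y[oy]))   # L1 cost per pair
total_sorted <- sum(pairing$cost)

# Brute-force check over all 3! assignments (feasible only for tiny n).
perms <- list(c(1, 2, 3), c(1, 3, 2), c(2, 1, 3),
              c(2, 3, 1), c(3, 1, 2), c(3, 2, 1))
total_best <- min(vapply(perms, function(p) sum(abs(x - y[p])), numeric(1)))
```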

Rectangular case (n < m): Uses dynamic programming to find the optimal assignment that matches all n sources to a subset of the m targets, minimizing total distance. The DP recurrence is:

dp[i][j] = min(dp[i][j-1], dp[i-1][j-1] + cost(x[i], y[j]))

This finds the minimum cost to match the first i sources to the first j targets.
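
The recurrence can be written out directly in base R. The sketch below assumes n <= m and the L1 cost, returns only the optimal total cost, and uses a hypothetical helper name (line_dp_cost); it is meant to illustrate the recurrence rather than reproduce the package's solver.

```r
# Sketch of the DP recurrence for n <= m with L1 cost (hypothetical helper).
line_dp_cost <- function(x, y) {
  x <- sort(x)
  y <- sort(y)
  n <- length(x)
  m <- length(y)
  stopifnot(n <= m)
  # dp[i + 1, j + 1] holds the minimum cost of matching the first i sources
  # to the first j targets; Inf marks infeasible states (i > j).
  dp <- matrix(Inf, nrow = n + 1, ncol = m + 1)
  dp[1, ] <- 0                               # zero sources cost nothing
  for (i in seq_len(n)) {
    for (j in i:m) {
      skip <- dp[i + 1, j]                   # leave target j unused
      take <- dp[i, j] + abs(x[i] - y[j])    # match source i to target j
      dp[i + 1, j + 1] <- min(skip, take)
    }
  }
  dp[n + 1, m + 1]
}

rect_cost   <- line_dp_cost(c(1.0, 3.0, 5.0), c(0.5, 2.0, 3.5, 4.5, 6.0))
square_cost <- line_dp_cost(c(1.5, 3.2, 5.1), c(2.0, 3.0, 5.5))
```

The two nested loops visit each (i, j) state once, which is where the O(n*m) cost of the rectangular case comes from.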

Complexity:

Value

A list with components:

Examples

# Square case: equal number of sources and targets
x <- c(1.5, 3.2, 5.1)
y <- c(2.0, 3.0, 5.5)
result <- lap_solve_line_metric(x, y, cost = "L1")
print(result)

# Rectangular case: more targets than sources
x <- c(1.0, 3.0, 5.0)
y <- c(0.5, 2.0, 3.5, 4.5, 6.0)
result <- lap_solve_line_metric(x, y, cost = "L2")
print(result)

# With unsorted inputs (will be sorted internally)
x <- c(5.0, 1.0, 3.0)
y <- c(4.5, 0.5, 6.0, 2.0, 3.5)
result <- lap_solve_line_metric(x, y, cost = "L1")
print(result)


Mark forbidden pairs

Description

Generic function to mark specific pairs as forbidden.

Usage

mark_forbidden_pairs(cost_matrix, forbidden_indices)

Value

Modified cost matrix with forbidden pairs marked.


Match blocks in parallel

Description

Match blocks in parallel

Usage

match_blocks_parallel(
  blocks,
  left,
  right,
  left_ids,
  right_ids,
  block_col,
  vars,
  distance,
  weights,
  scale,
  max_distance,
  calipers,
  method,
  parallel = FALSE
)

Arguments

blocks

Vector of block IDs

left

Left dataset with block_col

right

Right dataset with block_col

left_ids

IDs from left

right_ids

IDs from right

block_col

Name of blocking column

vars

Variables for matching

distance

Distance metric

weights

Variable weights

scale

Scaling method

max_distance

Maximum distance

calipers

Caliper constraints

method

LAP method

parallel

Whether to use parallel processing

Value

List with combined results from all blocks


Optimal matching using linear assignment

Description

Performs optimal one-to-one matching between two datasets using linear assignment problem (LAP) solvers. Supports blocking, distance constraints, and various distance metrics.

Usage

match_couples(
  left,
  right = NULL,
  vars = NULL,
  distance = "euclidean",
  weights = NULL,
  scale = FALSE,
  auto_scale = FALSE,
  max_distance = Inf,
  calipers = NULL,
  block_id = NULL,
  ignore_blocks = FALSE,
  require_full_matching = FALSE,
  method = "auto",
  return_unmatched = TRUE,
  return_diagnostics = FALSE,
  parallel = FALSE,
  check_costs = TRUE
)

Arguments

left

Data frame of "left" units (e.g., treated, cases)

right

Data frame of "right" units (e.g., controls)

vars

Variable names to use for distance computation

distance

Distance metric: "euclidean", "manhattan", "mahalanobis", or a custom function

weights

Optional named vector of variable weights

scale

Scaling method: FALSE (none), "standardize", "range", or "robust"

auto_scale

If TRUE, automatically check variable health and select scaling method (default: FALSE)

max_distance

Maximum allowed distance (pairs exceeding this are forbidden)

calipers

Named list of per-variable maximum absolute differences

block_id

Column name containing block IDs (for stratified matching)

ignore_blocks

If TRUE, ignore block_id even if present

require_full_matching

If TRUE, error if any units remain unmatched

method

LAP solver: "auto", "hungarian", "jv", "gabow_tarjan", etc.

return_unmatched

Include unmatched units in output

return_diagnostics

Include detailed diagnostics in output

parallel

Enable parallel processing for blocked matching. Requires 'future' and 'future.apply' packages. Can be:

  • FALSE: Sequential processing (default)

  • TRUE: Auto-configure parallel backend

  • Character: Specify future plan (e.g., "multisession", "multicore")

check_costs

If TRUE, check distance distribution for potential problems and provide helpful warnings before matching (default: TRUE)

Details

This function finds the matching that minimizes total distance among all feasible matchings, subject to constraints. Use greedy_couples() for faster approximate matching on large datasets.

Value

A list with class "matching_result" containing:

Examples

# Basic matching
left <- data.frame(id = 1:5, x = c(1, 2, 3, 4, 5), y = c(2, 4, 6, 8, 10))
right <- data.frame(id = 6:10, x = c(1.1, 2.2, 3.1, 4.2, 5.1), y = c(2.1, 4.1, 6.2, 8.1, 10.1))
result <- match_couples(left, right, vars = c("x", "y"))
print(result$pairs)

# With constraints
result <- match_couples(left, right, vars = c("x", "y"),
                        max_distance = 1,
                        calipers = list(x = 0.5))

# With blocking
left$region <- c("A", "A", "B", "B", "B")
right$region <- c("A", "A", "B", "B", "B")
blocks <- matchmaker(left, right, block_type = "group", block_by = "region")
result <- match_couples(blocks$left, blocks$right, vars = c("x", "y"))


Match with blocking (multiple problems)

Description

Match with blocking (multiple problems)

Usage

match_couples_blocked(
  left,
  right,
  left_ids,
  right_ids,
  block_col,
  vars,
  distance,
  weights,
  scale,
  max_distance,
  calipers,
  method,
  parallel = FALSE
)

Value

List with pairs tibble and matching info.


Match from Precomputed Distance Object

Description

Internal function to handle matching when a distance_object is provided

Usage

match_couples_from_distance(
  dist_obj,
  max_distance = Inf,
  calipers = NULL,
  ignore_blocks = FALSE,
  require_full_matching = FALSE,
  method = "auto",
  return_unmatched = TRUE,
  return_diagnostics = FALSE,
  check_costs = TRUE
)

Value

A matching_result object with pairs, info, and optional diagnostics.


Match without blocking (single problem)

Description

Match without blocking (single problem)

Usage

match_couples_single(
  left,
  right,
  left_ids,
  right_ids,
  vars,
  distance,
  weights,
  scale,
  max_distance,
  calipers,
  method,
  check_costs = TRUE
)

Value

List with pairs tibble and matching info.


Create blocks for stratified matching

Description

Constructs blocks (strata) for matching, using either grouping variables or clustering algorithms. Returns the input data frames with block IDs assigned, along with block summary statistics.

Usage

matchmaker(
  left,
  right,
  block_type = c("none", "group", "cluster"),
  block_by = NULL,
  block_vars = NULL,
  block_method = "kmeans",
  n_blocks = NULL,
  min_left = 1,
  min_right = 1,
  drop_imbalanced = FALSE,
  imbalance_threshold = Inf,
  return_dropped = TRUE,
  ...
)

Arguments

left

Data frame of "left" units (e.g., treated, cases)

right

Data frame of "right" units (e.g., controls)

block_type

Type of blocking to use:

  • "none": No blocking (default)

  • "group": Block by existing categorical variable(s)

  • "cluster": Block using clustering algorithm

block_by

Variable name(s) for grouping (if block_type = "group")

block_vars

Variable names for clustering (if block_type = "cluster")

block_method

Clustering method (if block_type = "cluster"):

  • "kmeans": K-means clustering

  • "hclust": Hierarchical clustering

n_blocks

Target number of blocks (for clustering)

min_left

Minimum number of left units per block

min_right

Minimum number of right units per block

drop_imbalanced

Drop blocks with extreme imbalance

imbalance_threshold

Maximum allowed |n_left - n_right| / max(n_left, n_right)

return_dropped

Include dropped blocks in output

...

Additional arguments passed to clustering function

Details

This function does NOT perform matching - it only creates the block structure. Use match_couples() or greedy_couples() to perform matching within blocks.
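
As a rough illustration of what a block structure and the imbalance_threshold formula involve, the base-R sketch below counts units per block and computes |n_left - n_right| / max(n_left, n_right). The summary column names are assumptions for illustration and need not match the block summary that matchmaker() returns.

```r
left  <- data.frame(id = 1:6,  region = c("A", "A", "A", "A", "B", "B"))
right <- data.frame(id = 7:12, region = c("A", "A", "A", "B", "B", "B"))

# Count left and right units in every block that appears in either dataset.
blocks  <- sort(union(left$region, right$region))
n_left  <- as.integer(table(factor(left$region,  levels = blocks)))
n_right <- as.integer(table(factor(right$region, levels = blocks)))

block_summary <- data.frame(
  block_id  = blocks,
  n_left    = n_left,
  n_right   = n_right,
  imbalance = abs(n_left - n_right) / pmax(n_left, n_right)
)

# Blocks that a threshold of 0.3 would flag as too imbalanced
flagged <- block_summary$block_id[block_summary$imbalance > 0.3]
```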

Value

A list with class "matchmaker_result" containing:

Examples

# Group blocking
left <- data.frame(id = 1:10, region = rep(c("A", "B"), each = 5), x = rnorm(10))
right <- data.frame(id = 11:20, region = rep(c("A", "B"), each = 5), x = rnorm(10))
blocks <- matchmaker(left, right, block_type = "group", block_by = "region")
print(blocks$block_summary)

# Clustering
blocks <- matchmaker(left, right, block_type = "cluster",
                     block_vars = "x", n_blocks = 3)


Parallel lapply using future

Description

Parallel lapply using future

Usage

parallel_lapply(X, FUN, ..., parallel = FALSE)

Arguments

X

Vector to iterate over

FUN

Function to apply

...

Additional arguments to FUN

parallel

Whether parallel processing is enabled

Value

List of results
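
A wrapper of this kind might be sketched as follows. This is an assumption about the general pattern, not the package's source: the name parallel_lapply_sketch is hypothetical, and choosing a backend with future::plan() is left to the caller.

```r
# Hypothetical sketch: use future.apply when requested and available,
# otherwise fall back to a plain sequential lapply().
parallel_lapply_sketch <- function(X, FUN, ..., parallel = FALSE) {
  use_future <- !isFALSE(parallel) &&
    requireNamespace("future.apply", quietly = TRUE)
  if (use_future) {
    future.apply::future_lapply(X, FUN, ...)
  } else {
    lapply(X, FUN, ...)
  }
}

squares <- parallel_lapply_sketch(1:3, function(i) i^2)
```

Falling back to lapply() keeps the calling code identical whether or not the optional packages are installed.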


Pixel-level image morphing (final frame only)

Description

Computes optimal pixel assignment from A to B and returns the final transported frame (without intermediate animation frames).

Usage

pixel_morph(
  imgA,
  imgB,
  n_frames = 16L,
  mode = c("color_walk", "exact", "recursive"),
  lap_method = "jv",
  maximize = FALSE,
  quantize_bits = 5L,
  downscale_steps = 0L,
  alpha = 1,
  beta = 0,
  patch_size = 1L,
  upscale = 1,
  show = interactive()
)

Arguments

imgA

Source image (file path or magick image object)

imgB

Target image (file path or magick image object)

n_frames

Internal parameter for rendering (default: 16)

mode

Assignment algorithm: "color_walk" (default), "exact", or "recursive"

lap_method

LAP solver method (default: "jv")

maximize

Logical, maximize instead of minimize cost (default: FALSE)

quantize_bits

Color quantization for "color_walk" mode (default: 5)

downscale_steps

Number of 2x reductions before computing assignment (default: 0)

alpha

Weight for color distance in cost function (default: 1)

beta

Weight for spatial distance in cost function (default: 0)

patch_size

Tile size for tiled modes (default: 1)

upscale

Post-rendering upscaling factor (default: 1)

show

Logical, display result in viewer (default: interactive())

Details

Transport-Only Semantics

This function returns a SHARP, pixel-perfect transport of A's pixels to positions determined by the assignment to B.

Key Points:

See pixel_morph_animate for detailed explanation of assignment vs rendering semantics.

Permutation Warnings

Assignment is guaranteed to be a bijection (permutation) ONLY when:

With downscaling or tiled modes, assignment may have:

If assignment is not a bijection (due to downscaling or tiling), a warning will be issued. The result may contain:

For guaranteed pixel-perfect results, use:

  pixel_morph(A, B, mode = "exact", downscale_steps = 0)

Value

magick image object of the final transported frame

See Also

pixel_morph_animate for animated version

Examples

if (requireNamespace("magick", quietly = TRUE)) {
  imgA <- system.file("extdata/icons/circleA_40.png", package = "couplr")
  imgB <- system.file("extdata/icons/circleB_40.png", package = "couplr")
  if (nzchar(imgA) && nzchar(imgB)) {
    result <- pixel_morph(imgA, imgB, n_frames = 4, show = FALSE)
  }
}


Pixel-level image morphing (animation)

Description

Creates an animated morph by computing optimal pixel assignment from image A to image B, then rendering intermediate frames showing the transport.

Usage

pixel_morph_animate(
  imgA,
  imgB,
  n_frames = 16L,
  fps = 10L,
  format = c("gif", "webp", "mp4"),
  outfile = NULL,
  show = interactive(),
  mode = c("color_walk", "exact", "recursive"),
  lap_method = "jv",
  maximize = FALSE,
  quantize_bits = 5L,
  downscale_steps = 0L,
  alpha = 1,
  beta = 0,
  patch_size = 1L,
  upscale = 1
)

Arguments

imgA

Source image (file path or magick image object)

imgB

Target image (file path or magick image object)

n_frames

Integer number of animation frames (default: 16)

fps

Frames per second for playback (default: 10)

format

Output format: "gif", "webp", or "mp4"

outfile

Optional output file path

show

Logical, display animation in viewer (default: interactive())

mode

Assignment algorithm: "color_walk" (default), "exact", or "recursive"

lap_method

LAP solver method (default: "jv")

maximize

Logical, maximize instead of minimize cost (default: FALSE)

quantize_bits

Color quantization for "color_walk" mode (default: 5)

downscale_steps

Number of 2x reductions before computing assignment (default: 0)

alpha

Weight for color distance in cost function (default: 1)

beta

Weight for spatial distance in cost function (default: 0)

patch_size

Tile size for tiled modes (default: 1)

upscale

Post-rendering upscaling factor (default: 1)

Details

Assignment vs Rendering Semantics

CRITICAL: This function has two separate phases with different semantics:

Phase 1 - Assignment Computation:

The assignment is computed by minimizing:

  cost(i,j) = alpha * color_distance(A[i], B[j]) + 
              beta * spatial_distance(pos_i, pos_j)

This means B's COLORS influence which pixels from A map to which positions.
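
The cost formula can be made concrete on two tiny 2x2 images. In this sketch the pixel layout (one row per pixel), the use of Euclidean distance for both terms, and the helper names pixels, pair_dist, and morph_cost are all assumptions for illustration.

```r
# Two tiny 2x2 "images" stored as one row per pixel: position (x, y) and RGB.
# Euclidean distances are an assumption made for this sketch.
pixels <- function(cols) {
  data.frame(x = c(1, 2, 1, 2), y = c(1, 1, 2, 2),
             r = cols[, 1], g = cols[, 2], b = cols[, 3])
}
A <- pixels(rbind(c(1, 0, 0), c(0, 1, 0), c(0, 0, 1), c(1, 1, 1)))
B <- pixels(rbind(c(0, 0, 1), c(1, 1, 1), c(1, 0, 0), c(0, 1, 0)))

# Euclidean distance between every row of P and every row of Q
pair_dist <- function(P, Q) {
  n <- nrow(P)
  as.matrix(dist(rbind(P, Q)))[seq_len(n), n + seq_len(nrow(Q)), drop = FALSE]
}

# cost(i, j) = alpha * color distance + beta * spatial distance
morph_cost <- function(A, B, alpha = 1, beta = 0) {
  color   <- pair_dist(A[, c("r", "g", "b")], B[, c("r", "g", "b")])
  spatial <- pair_dist(A[, c("x", "y")],      B[, c("x", "y")])
  alpha * color + beta * spatial
}

cost_color   <- morph_cost(A, B, alpha = 1, beta = 0)  # zero where colors agree
cost_spatial <- morph_cost(A, B, alpha = 0, beta = 1)  # zero on the diagonal
```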

Phase 2 - Rendering (Transport-Only):

The renderer uses ONLY A's colors:

Result: You get A's colors rearranged to match B's geometry/layout.
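
Transport-only rendering of the final frame amounts to moving A's colors to their assigned positions. The sketch below assumes that assignment[i] gives the target position of pixel i of A; that direction is an assumption for illustration.

```r
colors_A   <- c("red", "green", "blue", "white")
assignment <- c(3L, 4L, 1L, 2L)   # pixel i of A moves to position assignment[i]

# Final frame: every position holds one of A's colors, none of B's.
final <- character(length(colors_A))
final[assignment] <- colors_A
final
```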

What This Means

Parameter Guidance

For pure spatial rearrangement (ignore B's colors in assignment):

  pixel_morph_animate(A, B, alpha = 0, beta = 1)

For color-similarity matching (default):

  pixel_morph_animate(A, B, alpha = 1, beta = 0)

For hybrid (color + spatial):

  pixel_morph_animate(A, B, alpha = 1, beta = 0.2)

Permutation Guarantees

Assignment is guaranteed to be a bijection (permutation) ONLY when:

With downscaling or tiled modes, assignment may have:

A warning is issued if overlaps/holes are detected in the final frame.
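
Whether an assignment is a bijection can be checked by counting how many source pixels land on each target position. The helper below is a hypothetical sketch of such a check, not the package's internal test.

```r
# Hypothetical helper: classify an assignment vector of target positions.
check_assignment <- function(assignment, n_pixels) {
  counts <- tabulate(assignment, nbins = n_pixels)
  list(
    is_bijection = length(assignment) == n_pixels && all(counts == 1L),
    overlaps     = which(counts > 1L),   # positions receiving several pixels
    holes        = which(counts == 0L)   # positions receiving no pixel
  )
}

ok  <- check_assignment(c(3L, 1L, 4L, 2L), n_pixels = 4)
bad <- check_assignment(c(3L, 1L, 3L, 2L), n_pixels = 4)
```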

Value

Invisibly returns a list with animation object and metadata:

animation

magick animation object

width

Image width in pixels

height

Image height in pixels

assignment

Integer vector of 1-based assignment indices (R convention)

n_pixels

Total number of pixels

mode

Mode used for matching

upscale

Upscaling factor applied

Examples

if (requireNamespace("magick", quietly = TRUE)) {
  imgA <- system.file("extdata/icons/circleA_40.png", package = "couplr")
  imgB <- system.file("extdata/icons/circleB_40.png", package = "couplr")
  if (nzchar(imgA) && nzchar(imgB)) {
    outfile <- tempfile(fileext = ".gif")
    pixel_morph_animate(imgA, imgB, outfile = outfile, n_frames = 4, show = FALSE)
  }
}


Plot method for balance diagnostics

Description

Produces a Love plot (dot plot) of standardized differences.
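
The quantity shown on a Love plot can be sketched in base R. The pooled-SD form of the standardized difference used here is a common convention and an assumption; the package's exact formula may differ, and std_diff is a hypothetical helper.

```r
# Standardized mean difference per variable, using a pooled SD.
# The pooled-SD formula is a common convention and an assumption here.
std_diff <- function(left, right, vars) {
  vapply(vars, function(v) {
    pooled_sd <- sqrt((var(left[[v]]) + var(right[[v]])) / 2)
    (mean(left[[v]]) - mean(right[[v]])) / pooled_sd
  }, numeric(1))
}

left  <- data.frame(age = c(25, 30, 35, 40, 45),
                    income = c(45, 52, 48, 61, 55))
right <- data.frame(age = c(24, 29, 36, 41, 44),
                    income = c(46, 51.5, 47.5, 60, 54))

smd <- std_diff(left, right, c("age", "income"))
exceeds <- abs(smd) > 0.1   # variables beyond the default threshold line
```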

Usage

## S3 method for class 'balance_diagnostics'
plot(x, type = c("love", "histogram", "variance"), threshold = 0.1, ...)

Arguments

x

A balance_diagnostics object

type

Type of plot: "love" (default), "histogram", or "variance"

threshold

Threshold line for standardized differences (default: 0.1)

...

Additional arguments passed to plotting functions

Value

The balance_diagnostics object (invisibly)


Plot method for matching results

Description

Produces a histogram of pairwise distances from a matching result.

Usage

## S3 method for class 'matching_result'
plot(x, type = c("histogram", "density", "ecdf"), ...)

Arguments

x

A matching_result object

type

Type of plot: "histogram" (default), "density", or "ecdf"

...

Additional arguments passed to plotting functions

Value

The matching_result object (invisibly)


Preprocess matching variables with automatic checks and scaling

Description

Main preprocessing function that orchestrates variable health checks, categorical encoding, and automatic scaling selection.

Usage

preprocess_matching_vars(
  left,
  right,
  vars,
  auto_scale = TRUE,
  scale_method = "auto",
  check_health = TRUE,
  remove_problematic = TRUE,
  verbose = TRUE
)

Arguments

left

Data frame of left units

right

Data frame of right units

vars

Character vector of variable names

auto_scale

Logical, whether to perform automatic preprocessing (default: TRUE)

scale_method

Scaling method: "auto", "standardize", "range", "robust", or FALSE

check_health

Logical, whether to check variable health (default: TRUE)

remove_problematic

Logical, automatically exclude constant/all-NA variables (default: TRUE)

verbose

Logical, whether to print warnings (default: TRUE)
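
The three named scaling methods are not defined on this page. The sketch below shows one common textbook definition of each, purely as an assumption: z-scores for "standardize", min-max for "range", and median/MAD for "robust". scale_sketch() is a hypothetical helper and is not the package's code.

```r
# Hypothetical definitions; the package's formulas may differ.
scale_sketch <- function(x, method = c("standardize", "range", "robust")) {
  method <- match.arg(method)
  switch(method,
    standardize = (x - mean(x)) / sd(x),               # mean 0, sd 1
    range       = (x - min(x)) / (max(x) - min(x)),    # mapped to [0, 1]
    robust      = (x - median(x)) / mad(x)             # median 0, MAD-scaled
  )
}

# One extreme value dominates the mean and sd but not the median and MAD
income <- c(20, 22, 25, 27, 30, 400)
z_std    <- scale_sketch(income, "standardize")
z_range  <- scale_sketch(income, "range")
z_robust <- scale_sketch(income, "robust")
```

Under these assumed definitions, the robust version keeps the five ordinary values well separated, whereas the standardized and range versions compress them because of the single outlier.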

Value

A list with class "preprocessing_result" containing:


Print Method for Balance Diagnostics

Description

Print Method for Balance Diagnostics

Usage

## S3 method for class 'balance_diagnostics'
print(x, ...)

Arguments

x

A balance_diagnostics object

...

Additional arguments (ignored)

Value

Invisibly returns the input object x.


Print Method for Distance Objects

Description

Print Method for Distance Objects

Usage

## S3 method for class 'distance_object'
print(x, ...)

Arguments

x

A distance_object

...

Additional arguments (ignored)

Value

Invisibly returns the input object x.


Print method for batch assignment results

Description

Prints a summary and the table of results for a batch of assignment problems solved with lap_solve_batch().

Usage

## S3 method for class 'lap_solve_batch_result'
print(x, ...)

Arguments

x

A lap_solve_batch_result object.

...

Additional arguments passed to print(). Currently ignored.

Value

Invisibly returns the input object x.


Print method for k-best assignment results

Description

Print method for k-best assignment results

Usage

## S3 method for class 'lap_solve_kbest_result'
print(x, ...)

Arguments

x

A lap_solve_kbest_result.

...

Additional arguments passed to print(). Ignored.

Value

Invisibly returns the input object x.


Print method for assignment results

Description

Nicely prints a lap_solve_result object, including the assignments, total cost, and method used.

Usage

## S3 method for class 'lap_solve_result'
print(x, ...)

Arguments

x

A lap_solve_result object.

...

Additional arguments passed to print(). Currently ignored.

Value

Invisibly returns the input object x.


Print method for matching results

Description

Print method for matching results

Usage

## S3 method for class 'matching_result'
print(x, ...)

Arguments

x

A matching_result object

...

Additional arguments (ignored)

Value

Invisibly returns the input object x.


Print method for matchmaker results

Description

Print method for matchmaker results

Usage

## S3 method for class 'matchmaker_result'
print(x, ...)

Arguments

x

A matchmaker_result object

...

Additional arguments (ignored)

Value

Invisibly returns the input object x.


Print method for preprocessing result

Description

Print method for preprocessing result

Usage

## S3 method for class 'preprocessing_result'
print(x, ...)

Arguments

x

A preprocessing_result object

...

Additional arguments (ignored)

Value

Invisibly returns the input object x.


Print method for variable health

Description

Print method for variable health

Usage

## S3 method for class 'variable_health'
print(x, ...)

Arguments

x

A variable_health object

...

Additional arguments (ignored)

Value

Invisibly returns the input object x.


Restore original parallel plan

Description

Restore original parallel plan

Usage

restore_parallel(parallel_state)

Arguments

parallel_state

State from setup_parallel()

Value

No return value, called for side effects (restores parallel plan).


Setup parallel processing with future

Description

Setup parallel processing with future

Usage

setup_parallel(parallel = FALSE, n_workers = NULL)

Arguments

parallel

Logical or plan specification

n_workers

Number of workers (NULL for auto-detect)

Value

A list containing the original plan and a flag indicating whether parallelization was set up.


'Sinkhorn-Knopp' optimal transport solver

Description

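Compute an entropy-regularized optimal transport plan using the 'Sinkhorn-Knopp' algorithm. Unlike other LAP solvers that return a hard 1-to-1 assignment, this returns a soft assignment: a transport plan whose row and column sums match the given marginals (a scaled doubly stochastic matrix when the problem is square and the marginals are uniform).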

Usage

sinkhorn(
  cost,
  lambda = 10,
  tol = 1e-09,
  max_iter = 1000,
  r_weights = NULL,
  c_weights = NULL
)

Arguments

cost

Numeric matrix of transport costs. NA or Inf entries are treated as very high cost (effectively forbidden).

lambda

Regularization parameter (default 10). Higher values produce sharper (more deterministic) transport plans; lower values produce smoother distributions. Typical range: 1-100.

tol

Convergence tolerance (default 1e-9).

max_iter

Maximum iterations (default 1000).

r_weights

Optional numeric vector of row marginals (source distribution). Default is uniform. Will be normalized to sum to 1.

c_weights

Optional numeric vector of column marginals (target distribution). Default is uniform. Will be normalized to sum to 1.

Details

The 'Sinkhorn-Knopp' algorithm solves the entropy-regularized optimal transport problem:

P^* = \arg\min_P \langle C, P \rangle - \frac{1}{\lambda} H(P)

subject to row sums = r_weights and column sums = c_weights.

The entropy term H(P) encourages spread in the transport plan, and 1/lambda sets its weight. As lambda -> Inf, the solution approaches that of the standard (unregularized) optimal transport problem.

Key differences from standard LAP solvers:

Use sinkhorn_to_assignment() to round the soft assignment to a hard matching.
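
For readers who want to see the iteration behind the objective above, here is a minimal base R sketch of Sinkhorn-Knopp scaling. It is not the package's implementation: sinkhorn_sketch() is a hypothetical name, the stopping rule (row-marginal error) is an assumption, and there is no handling of NA or Inf costs.

```r
# Minimal Sinkhorn-Knopp sketch in base R (illustration only).
sinkhorn_sketch <- function(cost, lambda = 10, tol = 1e-9, max_iter = 1000,
                            r = rep(1, nrow(cost)), cw = rep(1, ncol(cost))) {
  r  <- r / sum(r)     # row marginals, normalized to sum to 1
  cw <- cw / sum(cw)   # column marginals, normalized to sum to 1
  # Gibbs kernel; subtracting min(cost) only rescales K and limits underflow
  K <- exp(-lambda * (cost - min(cost)))
  u <- rep(1, nrow(K))
  v <- rep(1, ncol(K))
  for (iter in seq_len(max_iter)) {
    u <- r / as.vector(K %*% v)             # rescale rows
    v <- cw / as.vector(crossprod(K, u))    # rescale columns
    P <- outer(u, v) * K                    # current plan: diag(u) K diag(v)
    if (max(abs(rowSums(P) - r)) < tol) break
  }
  P
}

cost <- matrix(c(1, 2, 3,
                 2, 1, 2,
                 3, 2, 1), nrow = 3, byrow = TRUE)
P <- sinkhorn_sketch(cost, lambda = 1)
round(P, 3)
```

The example uses a small lambda on purpose: with large lambda the kernel entries become extremely small and this plain iteration can need far more than max_iter steps to reach a tight tolerance.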

Value

A list with elements:

References

Cuturi, M. (2013). 'Sinkhorn Distances': Lightspeed Computation of Optimal Transport. Advances in Neural Information Processing Systems, 26.

See Also

assignment() for hard 1-to-1 matching, sinkhorn_to_assignment() to round soft assignments.

Examples

cost <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), nrow = 3, byrow = TRUE)

# Soft assignment with default parameters
result <- sinkhorn(cost)
print(round(result$transport_plan, 3))

# Sharper assignment (higher lambda)
result_sharp <- sinkhorn(cost, lambda = 50)
print(round(result_sharp$transport_plan, 3))

# With custom marginals (more mass from row 1)
result_weighted <- sinkhorn(cost, r_weights = c(0.5, 0.25, 0.25))
print(round(result_weighted$transport_plan, 3))

# Round to hard assignment
hard_match <- sinkhorn_to_assignment(result)
print(hard_match)


Round 'Sinkhorn' transport plan to hard assignment

Description

Convert a soft transport plan from sinkhorn() to a hard 1-to-1 assignment using greedy rounding.

Usage

sinkhorn_to_assignment(result)

Arguments

result

Either a result from sinkhorn() or a transport plan matrix.

Details

Greedy rounding iteratively assigns each row to its most probable column, ensuring no column is assigned twice. This may not give the globally optimal hard assignment; for that, solve a linear assignment problem that maximizes the total transported mass, for example by passing the negated transport plan as the cost matrix to assignment().
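
The greedy rounding idea can be sketched in a few lines of base R. greedy_round() is a hypothetical helper; processing rows in their natural order and breaking ties by the first maximum are assumptions, and the package may order rows differently.

```r
# Hypothetical row-by-row greedy rounding (illustration only).
greedy_round <- function(plan) {
  stopifnot(is.matrix(plan), nrow(plan) <= ncol(plan))
  assigned <- integer(nrow(plan))
  free <- rep(TRUE, ncol(plan))
  for (i in seq_len(nrow(plan))) {
    masked <- ifelse(free, plan[i, ], -Inf)  # hide columns already taken
    j <- which.max(masked)                   # most probable free column
    assigned[i] <- j
    free[j] <- FALSE
  }
  assigned
}

# Toy soft plan: rows 1 and 2 both put most of their mass on column 1
plan <- matrix(c(0.30, 0.02, 0.01,
                 0.25, 0.05, 0.03,
                 0.02, 0.01, 0.31), nrow = 3, byrow = TRUE)
hard <- greedy_round(plan)
```

Row 1 takes column 1 first, so row 2 falls back to its best remaining column, which shows why the result depends on processing order and need not be globally optimal.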

Value

Integer vector of column assignments (1-based), same format as assignment().

See Also

sinkhorn()

Examples

cost <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), nrow = 3, byrow = TRUE)
result <- sinkhorn(cost, lambda = 20)
hard_match <- sinkhorn_to_assignment(result)
print(hard_match)


Calculate Standardized Difference

Description

Computes the standardized mean difference between two groups. This is a key metric for assessing balance in matched samples.

Usage

standardized_difference(x1, x2, pooled = TRUE)

Arguments

x1

Numeric vector for group 1

x2

Numeric vector for group 2

pooled

Logical, if TRUE use pooled standard deviation (default), if FALSE use group 1 standard deviation

Details

Standardized difference = (mean1 - mean2) / pooled_sd, where pooled_sd = sqrt((sd1^2 + sd2^2) / 2). With pooled = FALSE, the denominator is sd1, the standard deviation of group 1.

Common thresholds: less than 0.1 is excellent balance, 0.1-0.25 is good balance, 0.25-0.5 is acceptable balance, and greater than 0.5 is poor balance.
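
The formula above is easy to reproduce in base R. The sketch below is an illustrative re-implementation, not the package source; std_diff_sketch() is a hypothetical name and missing values are not handled.

```r
# Illustrative re-implementation of the documented formula.
std_diff_sketch <- function(x1, x2, pooled = TRUE) {
  # Pooled SD averages the two group variances; otherwise use group 1's SD
  s <- if (pooled) sqrt((var(x1) + var(x2)) / 2) else sd(x1)
  (mean(x1) - mean(x2)) / s
}

group1 <- c(25, 30, 35, 40, 45)
group2 <- c(24, 29, 36, 41, 44)

d_pooled   <- std_diff_sketch(group1, group2)
d_unpooled <- std_diff_sketch(group1, group2, pooled = FALSE)
```

For these two groups the means differ by 0.2 while the spread is about 8, so the standardized difference falls well below the 0.1 threshold.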

Value

Numeric value representing the standardized difference


Perfect balance success message

Description

Perfect balance success message

Usage

success_good_balance(mean_std_diff)

Value

No return value, called for side effects (issues a message).


Suggest scaling method based on variable characteristics

Description

Analyzes variable distributions and suggests appropriate scaling methods.

Usage

suggest_scaling(left, right, vars)

Arguments

left

Data frame of left units

right

Data frame of right units

vars

Character vector of variable names

Value

A character string with the suggested scaling method: "standardize", "range", "robust", or "none"


Summarize block structure

Description

Summarize block structure

Usage

summarize_blocks(left, right, block_vars = NULL)

Value

Tibble with block_id, n_left, n_right, and optional variable means.


Summary method for balance diagnostics

Description

Summary method for balance diagnostics

Usage

## S3 method for class 'balance_diagnostics'
summary(object, ...)

Arguments

object

A balance_diagnostics object

...

Additional arguments (ignored)

Value

A list containing summary statistics (invisibly)


Summary Method for Distance Objects

Description

Summary Method for Distance Objects

Usage

## S3 method for class 'distance_object'
summary(object, ...)

Arguments

object

A distance_object

...

Additional arguments (ignored)

Value

Invisibly returns the input object.


Get summary of k-best results

Description

Extract summary information from k-best assignment results.

Usage

## S3 method for class 'lap_solve_kbest_result'
summary(object, ...)

Arguments

object

An object of class lap_solve_kbest_result.

...

Additional arguments (unused).

Value

A tibble with one row per solution containing:


Summary method for matching results

Description

Summary method for matching results

Usage

## S3 method for class 'matching_result'
summary(object, ...)

Arguments

object

A matching_result object

...

Additional arguments (ignored)

Value

A list containing summary statistics (invisibly)


Update Constraints on Distance Object

Description

Apply new constraints to a precomputed distance object without recomputing the underlying distances. This is useful for exploring different constraint scenarios quickly.

Usage

update_constraints(dist_obj, max_distance = Inf, calipers = NULL)

Arguments

dist_obj

A distance_object from compute_distances()

max_distance

Maximum allowed distance (pairs with distance > max_distance become Inf)

calipers

Named list of per-variable calipers

Details

This function creates a new distance_object with modified constraints applied to the cost matrix. The original distance_object is not modified.

Constraints:

The function returns a new object rather than modifying in place, following R's copy-on-modify semantics.
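
The max_distance rule described under Arguments can be illustrated on a plain numeric matrix. This is a sketch only: apply_max_distance() is a hypothetical helper, and the absolute age difference used as the distance is an assumption, not necessarily what compute_distances() produces.

```r
# Sketch of the max_distance rule on a plain matrix (illustration only).
apply_max_distance <- function(cost_matrix, max_distance = Inf) {
  constrained <- cost_matrix                      # modify a copy only
  constrained[constrained > max_distance] <- Inf  # forbid distant pairs
  constrained
}

left_age  <- c(25, 30, 35, 40, 45)
right_age <- c(24, 29, 36, 41, 44)
cost <- abs(outer(left_age, right_age, "-"))      # assumed distance: |age gap|

constrained <- apply_max_distance(cost, max_distance = 2)
```

Because the function returns a modified copy, cost keeps all of its finite entries and different thresholds can be tried against the same base matrix.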

Value

A new distance_object with updated cost_matrix

Examples

left <- data.frame(id = 1:5, age = c(25, 30, 35, 40, 45))
right <- data.frame(id = 6:10, age = c(24, 29, 36, 41, 44))
dist_obj <- compute_distances(left, right, vars = "age")

# Apply constraints
constrained <- update_constraints(dist_obj, max_distance = 2)
result <- match_couples(constrained)


Check if emoji should be used

Description

Check if emoji should be used

Usage

use_emoji()

Value

Logical indicating whether emoji should be used.


Validate calipers parameter

Description

Validate calipers parameter

Usage

validate_calipers(calipers, vars)

Value

Validated calipers (list or named numeric), or NULL if none.


Validate and prepare cost data

Description

Internal helper that ensures a numeric, non-empty cost matrix.

Usage

validate_cost_data(x, forbidden = NA)

Arguments

x

Cost matrix or data frame

forbidden

Value representing forbidden assignments (use NA or Inf)

Value

Numeric cost matrix


Validate matching inputs

Description

Validate matching inputs

Usage

validate_matching_inputs(left, right, vars = NULL)

Value

Invisibly returns TRUE if validation passes; otherwise throws an error.


Validate weights parameter

Description

Validate weights parameter

Usage

validate_weights(weights, vars)

Value

Numeric vector of validated weights.


All distances identical warning

Description

All distances identical warning

Usage

warn_constant_distance(value)

Value

No return value, called for side effects (issues a warning).


Constant variable warning

Description

Constant variable warning

Usage

warn_constant_var(var)

Value

No return value, called for side effects (issues a warning).


Extreme cost ratio warning

Description

Extreme cost ratio warning

Usage

warn_extreme_costs(p95, p99, ratio, problem_vars = NULL)

Value

No return value, called for side effects (issues a warning).


Many forbidden pairs warning

Description

Many forbidden pairs warning

Usage

warn_many_forbidden(pct_forbidden, n_valid, n_left)

Value

No return value, called for side effects (issues a warning).


Too many zeros warning

Description

Too many zeros warning

Usage

warn_many_zeros(pct, n_zeros)

Value

No return value, called for side effects (issues a warning).


Parallel package missing warning (reuse from matching_parallel.R)

Description

Parallel package missing warning (reuse from matching_parallel.R)

Usage

warn_parallel_unavailable()

Value

No return value, called for side effects (issues a warning).


High distance matches warning

Description

High distance matches warning

Usage

warn_poor_quality(pct_poor, threshold)

Value

No return value, called for side effects (issues a warning).
