This vignette serves a different purpose than the rest of the couplr
documentation: it emphasizes understanding over doing.
If you’re looking to solve a matching problem today, start with
vignette("getting-started") or
vignette("matching-workflows"). Come here when you want to
understand why these algorithms work and when
approximations are appropriate.
Audience: Advanced users, researchers, algorithm developers, curious minds
Prerequisites: familiarity with lap_solve() (see vignette("getting-started")).
Time to complete: 45-60 minutes (conceptual reading)
| Vignette | Focus | Difficulty |
|---|---|---|
| Getting Started | Basic LAP solving, API introduction | Beginner |
| Algorithms | Mathematical foundations, solver selection | Intermediate |
| Matching Workflows | Production matching pipelines | Intermediate |
| Pixel Morphing | Scientific applications, approximations | Advanced |
You are here: Pixel Morphing (Advanced)
Pixels provide an ideal testbed for understanding assignment problems:
The same algorithms that morph images smoothly also track particles in physics, align molecules in chemistry, and match vegetation plots in ecology.
| Domain | Entities | Features | Spatial |
|---|---|---|---|
| Ecology | Vegetation plots | Species composition | Geographic location |
| Physics | Particles | Intensity, size | Predicted position |
| Chemistry | Atoms | Element type | 3D coordinates |
| Images | Pixels | RGB color | (x, y) position |
Given two sets of entities \(A = \{a_1, \ldots, a_n\}\) and \(B = \{b_1, \ldots, b_n\}\), find the optimal one-to-one correspondence by minimizing
\[ \min_{\pi} \sum_{i=1}^{n} c_{i,\pi(i)} \]
where the cost combines feature similarity and spatial proximity:
\[ c_{ij} = \alpha \, d_{\text{feature}}(a_i, b_j) + \beta \, d_{\text{spatial}}(\mathbf{x}_i, \mathbf{x}_j). \]
Feature distance \(d_{\text{feature}}\): domain-specific similarity
Spatial distance \(d_{\text{spatial}}\): physical proximity
Weights \(\alpha, \beta \ge 0\) balance feature matching vs. spatial coherence.
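As a concrete sketch of this construction, the R code below builds the combined cost matrix for two small synthetic point sets and solves the assignment. The cross_dist() helper is defined here purely for illustration (it is not a couplr function); the only couplr call is lap_solve(), which, as elsewhere in this vignette, we assume accepts a dense cost matrix and returns the matched column index for each row.

# Minimal sketch: combined feature + spatial cost, then one LAP call.
# cross_dist() is a local helper, not part of couplr.
library(couplr)

set.seed(1)
n <- 50
feat_A <- matrix(runif(n * 3), n)  # features, e.g. RGB in [0, 1]
feat_B <- matrix(runif(n * 3), n)
pos_A  <- matrix(runif(n * 2), n)  # (x, y) positions in [0, 1]^2
pos_B  <- matrix(runif(n * 2), n)

# Pairwise Euclidean distances between rows of X and rows of Y
cross_dist <- function(X, Y) {
  sq <- outer(rowSums(X^2), rowSums(Y^2), "+") - 2 * X %*% t(Y)
  sqrt(pmax(sq, 0))  # clip tiny floating-point negatives
}

alpha <- 0.5
beta  <- 0.5
cost  <- alpha * cross_dist(feat_A, feat_B) +
         beta  * cross_dist(pos_A, pos_B)

assignment <- lap_solve(cost)  # optimal one-to-one matching

Later sketches in this vignette reuse cross_dist() and this synthetic data.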
Exact solution: solve the full \(n \times n\) LAP.
Real applications, however, often involve many thousands of entities, and the exact \(O(n^3)\) solution quickly becomes impractical.
We therefore need approximations that are much faster but still produce high-quality matchings.
To make the abstract ideas concrete, we visualize them using image morphing, where the entities are pixels, the features are RGB colors, and the positions are \((x, y)\) coordinates.
We first show the static input images (all at \(80 \times 80\) for display), then the animated morphs produced by different matching strategies.
The first pair consists of real photographs, the second of simple geometric shapes. Internally, all matching is computed on logical \(40 \times 40\) grids; we then upscale to \(80 \times 80\) purely for clearer display.
The exact pixel morph uses a full LAP solution on a \(1600 \times 1600\) cost matrix. For each pair of pixels \((i, j)\) we compute
\[ c_{ij} = \alpha \,\lVert \text{RGB}_i^A - \text{RGB}_j^B \rVert_2 + \beta \,\lVert (x_i, y_i) - (x_j, y_j) \rVert_2, \]
where color distances are normalized to \([0, \sqrt{3}]\) (RGB in \([0,1]\)) and spatial distances to \([0,1]\) using the image diagonal.
This yields an optimal one-to-one assignment of pixels. The resulting animations are smooth and artifact-free but require solving the full LAP.
In the feature quantization morph, similar colors are grouped, and groups are matched rather than individual pixels. Colors move as coherent “bands,” preserving global color structure but losing fine-grained per-pixel detail.
The hierarchical morph first matches large patches, then refines within patches. The motion is locally coherent and scales well to large problems, at the price of potentially missing some globally optimal cross-patch matches.
We now describe the three approximation strategies in detail. The animations above correspond directly to these methods.
Core idea: reduce problem size by grouping entities with similar features, then match groups.
Map continuous feature space to a finite palette
\[ \text{quantize}: \mathbb{R}^d \to \{1, \ldots, k\}, \]
where \(k \ll n\) (for example \(k \approx 64\) for \(n = 1600\)).
Form groups \[ G_A^{(c)} = \{ i : \text{quantize}(f_i) = c \} \] and similarly for \(B\).
Solve a \(k \times k\) LAP between palette entries with costs
\[ c'_{ij} = \alpha \, d(p_i, p_j) + \beta \, d(\bar{\mathbf{x}}_i, \bar{\mathbf{x}}_j), \]
where \(p_i\) is the palette color and \(\bar{\mathbf{x}}_i\) the centroid position of group \(i\).
Every entity in \(G_A^{(c)}\) is assigned according to the group-to-group match.
For example, with \(n = 1600\) (a \(40 \times 40\) image) and \(k = 64\) you get
\[ \left(\frac{1600}{64}\right)^3 = 25^3 \approx 15\,000 \]
times fewer LAP operations.
Advantages: a dramatic reduction in LAP size, while preserving global color and feature structure.
Disadvantages: fine-grained per-pixel detail is lost, since all members of a group share the same group-level match.
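Continuing the synthetic example from earlier (cross_dist(), feat_A, pos_A, and so on), here is a sketch of steps 1-3 using base R's kmeans() as the quantizer. This is illustrative only; couplr's actual grouping may differ, and we again assume lap_solve() returns an index vector, as the pseudocode in this vignette does.

# Sketch of feature quantization: k-means palette, then a k x k LAP.
# k = 8 here only because the toy data has n = 50 points (k = 64 in the text).
k <- 8
km_A <- kmeans(feat_A, centers = k, nstart = 5)
km_B <- kmeans(feat_B, centers = k, nstart = 5)

# Spatial centroid of each group's members
cent_A <- apply(pos_A, 2, function(v) tapply(v, km_A$cluster, mean))
cent_B <- apply(pos_B, 2, function(v) tapply(v, km_B$cluster, mean))

# k x k LAP over palette colors and group centroids
cost_k <- alpha * cross_dist(km_A$centers, km_B$centers) +
          beta  * cross_dist(cent_A, cent_B)
group_match <- lap_solve(cost_k)

# Step 4: every member of group c in A follows its group's match
target_group <- group_match[km_A$cluster]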
The corresponding GIFs are the color walk morphs shown earlier.
Core idea: split the domain into smaller subproblems by spatial partitioning, solve subproblems, and combine.
Divide the domain into \(m \times m\) patches (for example \(m = 4\) so you get \(16\) patches). Denote the subset of entities of \(A\) in patch \(k\) by
\[ P_A^{(k)} = \{ a_i : \mathbf{x}_i \in \text{Patch}_k \}. \]
Form patch representatives: centroid position and mean features per patch. Solve an \(m^2 \times m^2\) LAP between patches, with costs defined using the same feature and spatial distances but now at patch level.
Within each matched patch pair \((P_A^{(k)}, P_B^{(l)})\), recurse: split the patch again if it is still large, or solve a small exact LAP between its members once it falls below a size threshold.
Concatenate assignments from all leaf subproblems to obtain the global matching.
With \(d\) levels of decomposition (each level splitting into four patches), the work can be made close to \(O(n \log n)\) in practice, compared to \(O(n^3)\) for a single full LAP. Intuitively, the LAPs near the leaves are very small, and the costly large LAP is replaced by a series of much smaller ones.
Advantages: locally coherent motion and near-\(O(n \log n)\) scaling to large problems.
Disadvantages: globally optimal matches that cross patch boundaries can be missed.
// Pseudocode for hierarchical LAP matching
FUNCTION match_hierarchical(region_A, region_B, threshold, level):
// Base case: region small enough for exact LAP
IF size(region_A) <= threshold THEN
cost ← compute_cost_matrix(region_A, region_B, α, β)
RETURN lap_solve(cost)
END IF
// Divide into 2×2 spatial grid (4 patches)
patches_A ← spatial_partition(region_A, grid = 2×2)
patches_B ← spatial_partition(region_B, grid = 2×2)
// Compute patch representatives on both sides
FOR each patch p in patches_A DO
centroid_A[p] ← mean(positions in p)
mean_feature_A[p] ← mean(features in p)
END FOR
FOR each patch p in patches_B DO
centroid_B[p] ← mean(positions in p)
mean_feature_B[p] ← mean(features in p)
END FOR
// Match patches using 4×4 LAP
patch_cost ← matrix(4, 4)
FOR i = 1 TO 4 DO
FOR j = 1 TO 4 DO
patch_cost[i, j] ← α·distance(mean_feature_A[i], mean_feature_B[j]) +
β·distance(centroid_A[i], centroid_B[j])
END FOR
END FOR
patch_assignment ← lap_solve(patch_cost)
// Recursively solve within matched patches
assignments ← []
FOR i = 1 TO 4 DO
j ← patch_assignment[i]
sub_assignment ← match_hierarchical(
patches_A[i],
patches_B[j],
threshold,
level + 1
)
assignments ← append(assignments, sub_assignment)
END FOR
RETURN concatenate(assignments)
END FUNCTION
The couplr implementation adds pragmatic details such as
normalization of color and spatial distances, conversion between \((x, y)\) coordinates and raster indexing,
and handling remainder patches when the grid does not divide evenly.
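For readers who prefer runnable code, here is a condensed R translation of the recursion under strong simplifying assumptions: median splits yield non-empty quadrants on both sides, matched patches contain equal numbers of points (so every local LAP is square), and lap_solve() returns an index vector. It reuses cross_dist() from the first sketch; couplr's real implementation handles the edge cases this sketch ignores.

# Condensed sketch of the recursion above (simplifying assumptions:
# non-empty quadrants, equal patch sizes, square local LAPs).
match_hier <- function(idx_A, idx_B, pos_A, pos_B, feat_A, feat_B,
                       alpha, beta, threshold = 64) {
  if (length(idx_A) <= threshold) {
    # Base case: exact LAP on this patch pair
    cost <- alpha * cross_dist(feat_A[idx_A, , drop = FALSE],
                               feat_B[idx_B, , drop = FALSE]) +
            beta  * cross_dist(pos_A[idx_A, , drop = FALSE],
                               pos_B[idx_B, , drop = FALSE])
    m <- lap_solve(cost)
    return(cbind(A = idx_A, B = idx_B[m]))
  }
  # Median split into a 2 x 2 grid of quadrants
  quad <- function(idx, pos) {
    q <- 1 + (pos[idx, 1] > median(pos[idx, 1])) +
         2 * (pos[idx, 2] > median(pos[idx, 2]))
    split(idx, q)
  }
  pa <- quad(idx_A, pos_A)
  pb <- quad(idx_B, pos_B)
  # Patch representatives: centroid position and mean feature
  reps <- function(patches, pos, feat) list(
    cent = t(sapply(patches, function(ix) colMeans(pos[ix, , drop = FALSE]))),
    feat = t(sapply(patches, function(ix) colMeans(feat[ix, , drop = FALSE]))))
  ra <- reps(pa, pos_A, feat_A)
  rb <- reps(pb, pos_B, feat_B)
  # Small LAP between patch representatives
  pm <- lap_solve(alpha * cross_dist(ra$feat, rb$feat) +
                  beta  * cross_dist(ra$cent, rb$cent))
  # Recurse within matched patch pairs and concatenate
  do.call(rbind, lapply(seq_along(pa), function(i)
    match_hier(pa[[i]], pb[[pm[i]]], pos_A, pos_B, feat_A, feat_B,
               alpha, beta, threshold)))
}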
Core idea: solve the LAP on a coarse grid, then lift/upscale the assignment to the full-resolution grid.
Reduce spatial resolution by a factor \(s\) (for example \(s = 2\)):
\[ A' = \text{downsample}(A, s), \qquad B' = \text{downsample}(B, s). \]
Now \(A'\) and \(B'\) each have \(n' = n / s^2\) entities.
Compute an exact LAP solution on the \(n' \times n'\) problem:
\[ \pi' = \arg\min_{\pi'} \sum_{i=1}^{n'} c'_{i,\pi'(i)}. \]
Map the low-resolution assignment back to full resolution:
\[ \pi(i) = \text{upscale}\!\bigl(\pi'(\text{coarse\_index}(i)), s\bigr), \]
where each full-resolution entity inherits the assignment of its coarse cell.
For \(s = 2\) this gives a \(64\times\) reduction in LAP work.
Advantages: an \(s^6\) reduction in LAP work for downsampling factor \(s\), with very little implementation effort.
Disadvantages: only moderate quality; fine spatial detail is lost because every entity in a coarse cell inherits the same assignment.
In practice, resolution reduction is most useful as a crude initialization step for very large problems (\(n > 100\,000\)).
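The index bookkeeping is the only fiddly part. Here is a sketch for \(s = 2\) on the \(40 \times 40\) grids used in this vignette; the commented lines mark where the coarse LAP solution would be lifted, and distributing fine pixels within each matched cell is the detail a real implementation must add.

# Coarse-to-fine index bookkeeping for resolution reduction (s = 2)
s <- 2
w <- 40; h <- 40          # full-resolution grid, as in the examples
wc <- w / s; hc <- h / s  # coarse grid (20 x 20)

# Coarse cell index of each full-resolution pixel (row-major)
coarse_index <- function(x, y) ((y - 1) %/% s) * wc + (x - 1) %/% s + 1

xy   <- expand.grid(x = 1:w, y = 1:h)
cell <- coarse_index(xy$x, xy$y)   # length w*h, values in 1:(wc*hc)

# With pi_coarse <- lap_solve(cost_coarse) on the 400 x 400 problem,
# each fine pixel is routed to the target cell of its coarse cell:
# target_cell <- pi_coarse[cell]
# Distributing fine pixels within target_cell (e.g. preserving the
# within-cell offset) completes the upscale step.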
| Approach | Speedup (vs. exact) | Quality | Best for |
|---|---|---|---|
| Exact LAP | \(1\times\) | Optimal | \(n \le 1000\) |
| Feature quantization | \((n/k)^3\) | Good global structure | Distinct feature groups |
| Hierarchical | \(\approx n^{3/2}\) | Good local structure | Large \(n\), strong spatial structure |
| Resolution reduction | \(s^6\) | Moderate | Very large \(n\), rough initialization |
Practical rules of thumb: use the exact LAP when \(n \le 1000\); prefer feature quantization when entities form distinct feature groups; prefer hierarchical decomposition for large \(n\) with strong spatial structure; treat resolution reduction as a rough initialization for very large problems.
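These rules can be encoded as a tiny dispatcher, purely for illustration (this is not a couplr function):

# Illustrative dispatcher for the rules of thumb above
choose_strategy <- function(n, distinct_groups = FALSE, spatial = TRUE) {
  if (n <= 1000) return("exact LAP")
  if (distinct_groups) return("feature quantization")
  if (spatial) return("hierarchical decomposition")
  "resolution reduction (rough initialization only)"
}
choose_strategy(1600, distinct_groups = TRUE)  # "feature quantization"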
We now spell out the exact LAP-based morph more concretely.
We again use the cost
\[ c_{ij} = \alpha \,\lVert \text{RGB}_i^A - \text{RGB}_j^B \rVert_2 + \beta \,\lVert (x_i, y_i) - (x_j, y_j) \rVert_2. \]
The algorithm:
// Pseudocode for exact pixel matching
// Step 1: Compute full cost matrix (normalized)
n_pixels ← height × width
cost ← matrix(0, n_pixels, n_pixels)
FOR i = 1 TO n_pixels DO
FOR j = 1 TO n_pixels DO
// RGB color distance; dividing 0-255 channels by 255 rescales them
// to [0, 1], so the distance lies in [0, sqrt(3)] as in the formula
color_dist ← sqrt((R_A[i] - R_B[j])^2 +
(G_A[i] - G_B[j])^2 +
(B_A[i] - B_B[j])^2) / 255
// Spatial distance (normalized to [0, 1] by diagonal)
spatial_dist ← sqrt((x_A[i] - x_B[j])^2 +
(y_A[i] - y_B[j])^2) / diagonal_length
// Combined cost
cost[i, j] ← α · color_dist + β · spatial_dist
END FOR
END FOR
// Step 2: Solve with Jonker-Volgenant
assignment ← lap_solve(cost, method = "jv")
// Step 3: Generate morph frames by linear interpolation
FOR frame_idx = 1 TO n_frames DO
t ← (frame_idx - 1) / (n_frames - 1) // Time parameter spanning [0, 1]
FOR pixel_i = 1 TO n_pixels DO
j ← assignment[pixel_i] // Matched target pixel
// Interpolate position
x_new[pixel_i] ← (1 - t) · x_A[pixel_i] + t · x_B[j]
y_new[pixel_i] ← (1 - t) · y_A[pixel_i] + t · y_B[j]
// Keep source color (transport-only, no blending)
RGB_new[pixel_i] ← RGB_A[pixel_i]
END FOR
frames[frame_idx] ← render(x_new, y_new, RGB_new)
END FOR
The couplr implementation additionally handles indexing and raster layout, and shows or saves the resulting GIFs.
Approximate performance: on typical hardware, the exact LAP remains practical up to roughly \(100 \times 100\) images (10,000 pixels).
We now return from pixel morphs to the scientific settings that motivated them.
Problem: match \(n\) vegetation plots surveyed at time \(t\) to \(n\) plots at time \(t + \Delta t\) to track community dynamics.
Feature distance: Bray-Curtis dissimilarity between species abundance vectors
\[ d_{\text{BC}}(a, b) = \frac{\sum_s \lvert a_s - b_s \rvert} {\sum_s (a_s + b_s)}, \]
where \(a_s, b_s\) are abundances of species \(s\) in plots \(a\) and \(b\).
Spatial distance: geographic distance (e.g. in kilometers) between plot centers.
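A direct R translation of the Bray-Curtis formula, with a guard for the empty-plot case the formula leaves undefined (for full plot-by-plot distance matrices, the vegan package's vegdist(method = "bray") computes the same quantity):

# Bray-Curtis dissimilarity between two abundance vectors
bray_curtis <- function(a, b) {
  stopifnot(length(a) == length(b))
  total <- sum(a + b)
  if (total == 0) return(0)  # two empty plots: treat as identical
  sum(abs(a - b)) / total
}
bray_curtis(c(10, 0, 5), c(6, 2, 5))  # 6 / 28 ≈ 0.214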
Exact solution for small studies (\(n < 100\)):
// Pseudocode for ecological plot matching
FOR i = 1 TO n_plots_t DO
FOR j = 1 TO n_plots_tplus DO
// Bray-Curtis dissimilarity for species composition
numerator ← sum over species s of |abundance_t[i, s] - abundance_tplus[j, s]|
denominator ← sum over species s of (abundance_t[i, s] + abundance_tplus[j, s])
bc_distance ← numerator / denominator
// Geographic distance (kilometers)
geo_distance ← sqrt((x_t[i] - x_tplus[j])^2 +
(y_t[i] - y_tplus[j])^2)
// Combined cost (α = 0.7 emphasizes species composition)
cost[i, j] ← 0.7 · bc_distance + 0.3 · (geo_distance / max_distance)
END FOR
END FOR
plot_correspondence ← lap_solve(cost)
For large studies (\(n > 1000\)) a hierarchical approach by region is more practical:
// Hierarchical decomposition by geographic region
// 1. Divide landscape into spatial grid (e.g. 10 km × 10 km cells)
regions_t ← spatial_partition(plots_t, grid_size = 10 km)
regions_tplus ← spatial_partition(plots_tplus, grid_size = 10 km)
// 2. Compute region representatives
FOR each region r DO
mean_composition[r] ← average species vector across plots in r
centroid[r] ← geographic center of r
END FOR
// 3. Match regions (small LAP: ~100 regions)
region_cost ← compute_cost(mean_composition, centroids, α = 0.7, β = 0.3)
region_assignment ← lap_solve(region_cost)
// 4. Within matched regions, solve plot-level LAP
full_assignment ← []
FOR r = 1 TO n_regions DO
r_matched ← region_assignment[r]
plots_A ← plots in regions_t[r]
plots_B ← plots in regions_tplus[r_matched]
// Local LAP (smaller problem, e.g. 50 × 50)
cost_local ← compute_plot_cost(plots_A, plots_B, α = 0.7, β = 0.3)
local_assignment ← lap_solve(cost_local)
full_assignment ← append(full_assignment, local_assignment)
END FOR
RETURN full_assignment
This allows tracking individual plot trajectories across time, distinguishing stable communities, successional trends, and invasion fronts.
Problem: track \(n\) particles between frame \(t\) and \(t + \Delta t\) in experimental video.
Feature distance: differences in intensity, size, or shape.
Spatial distance: displacement relative to predicted motion:
\[ d_{\text{spatial}}(i, j) = \bigl\| \mathbf{x}_i + \mathbf{v}_i \Delta t - \mathbf{x}_j \bigr\|_2, \]
where \(\mathbf{v}_i\) is the estimated velocity from previous frames.
We also impose a maximum displacement \(d_{\max}\) beyond which matches are physically implausible.
Exact solution (moderate \(n\)):
// Pseudocode for particle tracking with velocity prediction
// Initialize cost matrix as forbidden everywhere
cost ← matrix(Inf, n_particles_t, n_particles_tplus)
FOR i = 1 TO n_particles_t DO
// Predict position using previous velocity
x_predicted ← x_t[i] + v_x_t[i] · Δt
y_predicted ← y_t[i] + v_y_t[i] · Δt
FOR j = 1 TO n_particles_tplus DO
// Distance from predicted position
dx ← x_predicted - x_tplus[j]
dy ← y_predicted - y_tplus[j]
spatial_distance ← sqrt(dx^2 + dy^2)
// Only consider physically plausible matches
IF spatial_distance <= max_displacement THEN
// Feature similarity (intensity, size, etc.)
feature_distance ← |intensity_t[i] - intensity_tplus[j]|
// Combined cost
cost[i, j] ← α · feature_distance + β · spatial_distance
END IF
END FOR
END FOR
// Solve assignment (Inf entries are forbidden)
particle_tracks ← lap_solve(cost)
// Update velocities from assignments
FOR i = 1 TO n_particles_t DO
j ← particle_tracks[i]
velocity_new[i] ← (position_tplus[j] - position_t[i]) / Δt
END FOR
For dense tracking (\(n > 5000\)), we can first cluster particles:
// Two-stage: clustering then local matching
// Stage 1: spatial clustering
clusters_t ← spatial_cluster(particles_t, radius = 2 · pixel_size)
clusters_tplus ← spatial_cluster(particles_tplus, radius = 2 · pixel_size)
// Compute cluster representatives
FOR each cluster c DO
centroid[c] ← mean position of particles in c
mean_intensity[c] ← mean intensity
mean_velocity[c] ← mean velocity (if available)
END FOR
// Match clusters
cluster_cost ← compute_cluster_similarity(clusters_t, clusters_tplus)
cluster_tracks ← lap_solve(cluster_cost)
// Stage 2: within matched clusters, track individual particles
full_tracks ← []
FOR c = 1 TO n_clusters DO
c_matched ← cluster_tracks[c]
particles_A ← particles in clusters_t[c]
particles_B ← particles in clusters_tplus[c_matched]
cost_local ← compute_particle_distance(
particles_A, particles_B,
max_displacement = 5,
α = 0.3,
β = 0.7
)
local_tracks ← lap_solve(cost_local)
full_tracks ← append(full_tracks, local_tracks)
END FOR
RETURN full_tracks
This yields efficient and robust trajectories even for very dense particle fields.
Problem: align two conformations of the same molecule (e.g. a protein) with \(n\) atoms to compute RMSD and analyze structural change.
Feature distance: strict element matching
\[ d_{\text{element}}(i, j) = \begin{cases} 0, & \text{if } \text{element}_i = \text{element}_j, \\ \infty, & \text{otherwise.} \end{cases} \]
Spatial distance: 3D Euclidean distance between atomic coordinates.
Exact LAP for small molecules:
// Pseudocode for molecular conformation alignment
n_atoms ← number of atoms in molecule
cost ← matrix(0, n_atoms, n_atoms)
FOR i = 1 TO n_atoms DO
FOR j = 1 TO n_atoms DO
// Enforce strict element type matching
IF element_type_A[i] ≠ element_type_B[j] THEN
cost[i, j] ← Inf
ELSE
dx ← x_A[i] - x_B[j]
dy ← y_A[i] - y_B[j]
dz ← z_A[i] - z_B[j]
cost[i, j] ← sqrt(dx^2 + dy^2 + dz^2)
END IF
END FOR
END FOR
// Solve alignment
alignment ← lap_solve(cost)
// Compute RMSD
sum_sq_dist ← 0
FOR i = 1 TO n_atoms DO
j ← alignment[i]
sum_sq_dist ← sum_sq_dist + cost[i, j]^2
END FOR
rmsd ← sqrt(sum_sq_dist / n_atoms)
For large biomolecules, we again use a hierarchical strategy, this time by secondary structure elements (helices, sheets, loops, etc.), aligning segments first and then atoms within matched segments.
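A sketch of that segment-first strategy, reusing cross_dist() from earlier. The per-atom segment labels are hypothetical inputs (e.g. parsed from structure annotations), element-type constraints are omitted for brevity, and matched segments are assumed to contain equal numbers of atoms:

# Sketch: match secondary-structure segments first, then atoms within.
# segment_A, segment_B are per-atom labels (hypothetical inputs).
align_by_segment <- function(coords_A, coords_B, segment_A, segment_B) {
  seg_A <- split(seq_len(nrow(coords_A)), segment_A)
  seg_B <- split(seq_len(nrow(coords_B)), segment_B)
  centroids <- function(segs, xyz)
    t(sapply(segs, function(ix) colMeans(xyz[ix, , drop = FALSE])))
  # Segment-level LAP on centroid distances
  seg_match <- lap_solve(cross_dist(centroids(seg_A, coords_A),
                                    centroids(seg_B, coords_B)))
  # Atom-level LAPs within matched segments, concatenated
  do.call(rbind, lapply(seq_along(seg_A), function(k) {
    ia <- seg_A[[k]]
    ib <- seg_B[[seg_match[k]]]
    m  <- lap_solve(cross_dist(coords_A[ia, , drop = FALSE],
                               coords_B[ib, , drop = FALSE]))
    cbind(A = ia, B = ib[m])
  }))
}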
The morphing examples use default settings, but you can customize the number of frames and speed:
# From inst/scripts/generate_examples.R
generate_morph <- function(assignment, pixels_A, pixels_B,
n_frames = 30, # number of frames
frame_delay = 0.1) # delay between frames (seconds)
{
frames <- lapply(seq(0, 1, length.out = n_frames), function(t) {
interpolate_frame(t, assignment, pixels_A, pixels_B)
})
save_gif(frames, delay = frame_delay)
}

Total animation duration is n_frames * frame_delay seconds.
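For example, doubling the frame count while halving the delay keeps the same three-second duration but produces smoother motion:

# 60 frames at 0.05 s each: still 3 seconds, but smoother
generate_morph(assignment, pixels_A, pixels_B,
               n_frames = 60, frame_delay = 0.05)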
The full morphing implementation is provided in inst/scripts/generate_examples.R.
The matching problems discussed here are discrete instances of optimal transport.
The original Monge formulation (1781) seeks a transport map \(T: A \to B\) minimizing
\[ \int_A c(\mathbf{x}, T(\mathbf{x})) \,\mathrm{d}\mu(\mathbf{x}). \]
Kantorovich (1942) relaxed this to a transport plan \(\gamma\) on \(A \times B\):
\[ \min_{\gamma} \int_{A \times B} c(\mathbf{x}, \mathbf{y}) \,\mathrm{d}\gamma(\mathbf{x}, \mathbf{y}) \]
subject to marginal constraints on \(\gamma\).
For discrete uniform distributions with \(n\) points in \(A\) and \(B\) we obtain exactly the linear assignment problem:
\[ \min_{\pi \in S_n} \sum_{i=1}^n c_{i,\pi(i)}, \]
which is what couplr solves efficiently.
With \(c_{ij} = d(\mathbf{x}_i, \mathbf{x}_j)\) (often Euclidean distance), the optimal cost defines the \(1\)‑Wasserstein distance between the two empirical uniform distributions:
\[ W_1(\mu, \nu) = \frac{1}{n} \min_{\pi \in S_n} \sum_{i=1}^n c_{i,\pi(i)}, \]
where the factor \(1/n\) reflects that each point carries mass \(1/n\).
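As a sketch, the empirical \(W_1\) between two point clouds takes one lap_solve() call (reusing cross_dist() from earlier, and again assuming an index-vector result):

# Empirical 1-Wasserstein distance between two uniform point clouds
set.seed(42)
m <- 100
X <- matrix(rnorm(m * 2), m)   # sample of mu
Y <- matrix(rnorm(m * 2), m)   # sample of nu

cost   <- cross_dist(X, Y)     # Euclidean ground cost
pi_hat <- lap_solve(cost)
W1 <- mean(cost[cbind(seq_len(m), pi_hat)])  # (1/n) x sum of matched costs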
This appears in all three domains discussed above:
Ecology - distances between species-composition distributions
Physics - transport of particle ensembles between frames
Chemistry - comparison of molecular conformations
vignette("algorithms").The approximation strategies in this vignette become relevant when working with couplr’s practical matching functions.
| Strategy | couplr Implementation | When to Use |
|---|---|---|
| Exact LAP | match_couples(method = "jv") | n < 3,000 |
| Feature quantization | Implicit via scale = "robust" | Reduces effective feature space |
| Hierarchical | matchmaker(block_type = "cluster") | n > 3,000, use blocking |
| Resolution reduction | Custom code | Very large n |
For n < 3,000: Use match_couples() with exact algorithms.

For 3,000 < n < 10,000: Use blocking to create smaller subproblems:

blocks <- matchmaker(left, right, block_type = "cluster", n_blocks = 10)
result <- match_couples(blocks$left, blocks$right, vars = vars, block_id = "block_id")

For n > 10,000: Use greedy matching; a hedged sketch follows.
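greedy_couples() is documented in this package (?greedy_couples); the call below assumes it mirrors the match_couples() interface, so verify the real signature before relying on it.

# Assumed to mirror match_couples(); verify against ?greedy_couples
result <- greedy_couples(left, right, vars = vars)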
For n > 50,000: Combine strategies (blocking + greedy within blocks), or implement custom approximations using the techniques in this vignette.
Each approximation trades accuracy for speed. Know the failure modes:
| Strategy | Works Well When | Fails When |
|---|---|---|
| Feature quantization | Features cluster naturally | Continuous features, fine distinctions matter |
| Hierarchical | Spatial locality is meaningful | Optimal matches cross boundaries |
| Resolution reduction | Coarse structure suffices | Fine detail matters |
This vignette explored optimal matching through the lens of pixel morphing and scientific applications.
Key Ideas:
The same algorithms that morph images smoothly also track particles
in physics, align molecules in chemistry, and match vegetation plots in
ecology. Together, the methods in couplr let you move
between exact optimal matchings and principled approximations, depending
on problem size and accuracy requirements.
vignette("getting-started") - Basic LAP solvingvignette("algorithms") - Mathematical foundationsvignette("matching-workflows") - Production matching
pipelines?lap_solve, ?match_couples,
?greedy_couplesThese binaries (installable software) and packages are in development.