The Shell Game: conceptual framework

The Shell Game

Swapping the variable without changing its name or the formula.

Same column name.
Different underlying quantity.

That’s the shell game.

What the Shell Actually Is

The shell isn’t geography.
The shell is the variable label.

If this were just geometric mismatch, we would have:

But because the estimate silently shifts, what we’re actually doing is:

Imputing population via a proxy and continuing to treat it as observed.

This is how the distortion compounds.

The Critical Shift

We’re no longer propagating data.
We’re propagating assumptions.

Assumptions are sticky. They don’t decay; they accumulate.

When you apply standard administrative crosswalks in succession, each step amplifies the prior steps’ assumptions and allocation choices, and adds its own. In this case study, shellgame uses the HUD crosswalk with TOT_RATIO, which assumes that residential address distributions are a valid proxy for population and allocates accordingly. The resulting estimates retain the population label even though they no longer represent directly observed population quantities. As a result, successive users inherit increasingly imputed values while treating them as empirical measurements.
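A minimal sketch of one allocation step, under stated assumptions: the ZCTA and ZIP identifiers, the population count, and the ratio values are hypothetical toy data, and `allocate` is an illustrative helper, not a function from shellgame. It shows how the label survives while the quantity changes.

```python
# Hypothetical toy data: one ZCTA with an observed population, linked to
# two ZIPs by an address-based share (standing in for TOT_RATIO).
observed = {"zcta_A": 1000}              # directly observed count
tot_ratio = {
    ("zcta_A", "zip_1"): 0.7,            # share of addresses, not of people
    ("zcta_A", "zip_2"): 0.3,
}

def allocate(values, ratios):
    """Spread each source value over its targets in proportion to the ratio."""
    out = {}
    for (src, dst), share in ratios.items():
        out[dst] = out.get(dst, 0.0) + values[src] * share
    return out

# Still called "population", but each value is now count * address share:
# an imputation that only looks like a measurement.
zip_population = allocate(observed, tot_ratio)
print(zip_population)
```

The total is preserved, which is exactly why the swap is easy to miss: nothing in the output signals that the per-ZIP values were never observed.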

The Transformation Chain

I am going to quantify mismatch at each hop, nothing more, nothing less.

ZCTA → ZIP → COUNTY

Pre-allocation

Some Census tabulation areas are being asked to stand in for multiple postal service geographies, with no guidance on how population should be divided.

This is the exact moment analysts silently switch from “joining data” to “inventing rules.”

At this stage, we:

have not:

  • applied TOT_RATIO
  • averaged anything
  • redistributed population

have only:

  • expanded relationships

Yet already:

  • The unit count has changed
  • The geography has fragmented
  • The analytical surface has shifted
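The pre-allocation effect can be sketched with a bare lookup table. The rows below are hypothetical, and no ratio column is involved; the point is only that expanding relationships already changes the unit count.

```python
# Hypothetical lookup rows (ZCTA, ZIP). No ratio or weight is used anywhere.
lookup = [
    ("zcta_A", "zip_1"),
    ("zcta_A", "zip_2"),
    ("zcta_B", "zip_3"),
]

zctas = {zcta for zcta, _ in lookup}
zips = {zip_code for _, zip_code in lookup}

# Group targets per ZCTA to find the ones that fan out to several ZIPs.
fan_out = {}
for zcta, zip_code in lookup:
    fan_out.setdefault(zcta, set()).add(zip_code)

# These ZCTAs have no stated rule for dividing their population.
ambiguous = sorted(z for z, targets in fan_out.items() if len(targets) > 1)

print(len(zctas), len(zips), ambiguous)  # 2 3 ['zcta_A']
```

Any value attached to `zcta_A` must now be duplicated, split, or dropped, and each of those is a rule the analyst supplies.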

Any downstream “fix” is compensating for damage already done, not refining a neutral process.

First Claim

We can already say, truthfully and precisely:

“Using a lookup-style ZCTA-ZIP association increases the number of spatial units representing a county by 32% (74 ZCTAs to 98 ZIPs), prior to any allocation or weighting.”
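The arithmetic behind the 32% figure, using only the two counts quoted in the claim:

```python
zcta_count = 74   # ZCTAs representing the county before the lookup
zip_count = 98    # ZIPs representing the county after the lookup

# Relative increase in the number of spatial units.
increase = (zip_count - zcta_count) / zcta_count
print(f"{increase:.1%}")  # 32.4%
```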

The Two Hidden Decisions

Decision 1: Membership Definition

Are we defining membership by administrative linkage, or by geometric contact?

  • 74 ZCTAs (relationship-based membership used in ACS)
  • vs 94 ZCTAs (geometric intersection with county boundary)
  • 20 ZCTAs excluded before any transformation begins

This affects the baseline before any allocation occurs.
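As a rough illustration, the membership decision amounts to choosing between two sets. The identifiers below are hypothetical placeholders, and the sketch assumes the relationship-based set is contained in the geometric one, as the counts above suggest.

```python
# Hypothetical ZCTA identifiers under each membership rule.
relationship_based = {"zcta_A", "zcta_B"}        # administrative linkage
geometric = {"zcta_A", "zcta_B", "zcta_C"}       # touches the county boundary

# Units that vanish from the baseline purely because of the rule chosen.
excluded = geometric - relationship_based

print(sorted(excluded))  # ['zcta_C']
```

With the real counts, the same difference is 94 - 74 = 20 ZCTAs.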

Decision 2: Crosswalk Selection

When you write “Used HUD crosswalk 3rd quarter 2024”, what decisions were made?

  • Choice of crosswalk source (HUD vs Census vs commercial)
  • Choice of time period (Q3 2024 vs Q4 2024 vs 2023)
  • Choice of allocation method (TOT_RATIO vs RES_RATIO vs BUS_RATIO)

Each choice produces different results, but they’re rarely acknowledged as methodological decisions.
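A small sketch of how the allocation-method choice alone moves the estimate. The population count and every ratio value here are hypothetical; only the column names TOT_RATIO, RES_RATIO, and BUS_RATIO come from the list above.

```python
zcta_population = 1000  # hypothetical observed count for one ZCTA

# Hypothetical shares of that ZCTA assigned to two ZIPs under each column.
ratios = {
    "TOT_RATIO": {"zip_1": 0.7, "zip_2": 0.3},
    "RES_RATIO": {"zip_1": 0.9, "zip_2": 0.1},
    "BUS_RATIO": {"zip_1": 0.4, "zip_2": 0.6},
}

# Same input, same formula, three different "populations" per ZIP.
estimates = {
    method: {z: zcta_population * share for z, share in shares.items()}
    for method, shares in ratios.items()
}

zip_1_values = [est["zip_1"] for est in estimates.values()]
spread = max(zip_1_values) - min(zip_1_values)
print(round(spread))  # 500
```

In this toy case the estimate for `zip_1` ranges over half the ZCTA's population depending on a choice that usually goes unreported.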

The Result

For Hennepin County, Minnesota:

Input: Population at ZCTA level
Transformation: ZCTA → ZIP → County
Output: Population at each level
Result: delta(Population) = Baseline - Recovered
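The Input, Transformation, Output, Result pattern can be sketched end to end. All identifiers, counts, and ratios below are hypothetical toy values, not Hennepin County figures, and `hop` is an illustrative helper rather than part of shellgame.

```python
# Input: hypothetical observed population at ZCTA level, all in county_X.
baseline_by_zcta = {"zcta_A": 1000, "zcta_B": 500}

# Hypothetical crosswalk shares for each hop: (source, target) -> share.
zcta_to_zip = {
    ("zcta_A", "zip_1"): 0.7,
    ("zcta_A", "zip_2"): 0.3,
    ("zcta_B", "zip_3"): 1.0,
}
zip_to_county = {
    ("zip_1", "county_X"): 1.0,
    ("zip_2", "county_X"): 0.5,   # zip_2 straddles the county line
    ("zip_2", "county_Y"): 0.5,
    ("zip_3", "county_X"): 1.0,
}

def hop(values, ratios):
    """Apply one crosswalk step: weight each source value onto its targets."""
    out = {}
    for (src, dst), share in ratios.items():
        out[dst] = out.get(dst, 0.0) + values.get(src, 0.0) * share
    return out

# Transformation: ZCTA -> ZIP -> County.
by_zip = hop(baseline_by_zcta, zcta_to_zip)
by_county = hop(by_zip, zip_to_county)

# Result: delta(Population) = Baseline - Recovered.
baseline = sum(baseline_by_zcta.values())
recovered = by_county["county_X"]
delta = baseline - recovered
print(round(baseline), round(recovered), round(delta))  # 1500 1350 150
```

The missing 150 did not disappear from the data; it was reassigned to `county_Y` by the second hop's allocation rule.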

Why This Is Agnostic

The shell game happens regardless of:

The transformation is the cause, not the tool or variable.

The shell game happens regardless of who’s shuffling the shells.

What This Package Does

shellgame provides tools to:

  1. Quantify the error at each transformation hop
  2. Reveal where the swap from observed to imputed data occurs
  3. Demonstrate that the error is agnostic to inputs
  4. Document the hidden decisions in the workflow

Use it to audit yourself. Use it to understand what’s really happening when you transform geographic data.

Same column name. Different underlying quantity. That’s the shell game.
