Hurdle Demand Models

The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

Brent Kaplan

2026-03-02

Introduction

The beezdemand package includes functionality for fitting two-part mixed effects hurdle demand models using Template Model Builder (TMB). This approach is particularly useful when:

Zero consumption values are informative (i.e., represent true non-consumption rather than censored data)
You want to model both the probability of consuming and the intensity of consumption simultaneously
Individual-level heterogeneity is important for both parts of the model

When to Use Hurdle Models vs. Standard Models

Scenario	Recommended Approach
Few zeros, zeros are measurement artifacts	`fit_demand_fixed()` or `fit_demand_mixed()`
Many zeros, zeros represent true non-consumption	`fit_demand_hurdle()`
Need to understand factors affecting whether someone consumes	`fit_demand_hurdle()`
Need individual-level estimates of “quitting price”	`fit_demand_hurdle()`

Model Specification

The hurdle model has two parts:

Part I: Binary Model (Probability of Zero Consumption)

\[\text{logit}(\pi_{ij}) = \beta_0 + \beta_1 \cdot \log(\text{price} + \epsilon) + a_i\]

Where:

\(\pi_{ij}\) = probability of zero consumption for subject \(i\) at price \(j\)
\(\beta_0\) = intercept (baseline log-odds of zero consumption)
\(\beta_1\) = slope (effect of log-price on log-odds of zero)
\(\epsilon\) = small constant (default 0.001) for handling zero prices
\(a_i\) = subject-specific random intercept

Part II: Continuous Model (Consumption Given Positive)

With 3 random effects: \[\log(Q_{ij}) = (\log Q_0 + b_i) + k \cdot (\exp(-(\alpha + c_i) \cdot \text{price}) - 1) + \varepsilon_{ij}\]

With 2 random effects: \[\log(Q_{ij}) = (\log Q_0 + b_i) + k \cdot (\exp(-\alpha \cdot \text{price}) - 1) + \varepsilon_{ij}\]

Where:

\(Q_0\) = intensity (consumption at price 0)
\(k\) = scaling parameter for exponential decay
\(\alpha\) = elasticity parameter
\(b_i\) = subject-specific random effect on intensity
\(c_i\) = subject-specific random effect on elasticity (3-RE model only)

Random Effects Structure

The random effects follow a multivariate normal distribution:

2-RE model: \((a_i, b_i) \sim \text{MVN}(0, \Sigma_{2 \times 2})\)
3-RE model: \((a_i, b_i, c_i) \sim \text{MVN}(0, \Sigma_{3 \times 3})\)

Getting Started

Data Requirements

Your data should be in long format with columns for:

Subject identifier
Price
Consumption (including zeros)

library(beezdemand)

# Example data structure
knitr::kable(
    head(apt, 10),
    caption = "Example APT data structure (first 10 rows)"
)

Example APT data structure (first 10 rows)
id	x	y	group
19	0.0	10	a
19	0.5	10	a
19	1.0	10	a
19	1.5	8	a
19	2.0	8	a
19	2.5	8	a
19	3.0	7	a
19	4.0	7	a
19	5.0	7	a
19	6.0	6	a

Basic Model Fitting

# Fit 2-RE model (simpler, faster)
fit2 <- fit_demand_hurdle(
    data = apt,
    y_var = "y",
    x_var = "x",
    id_var = "id",
    random_effects = c("zeros", "q0"), # 2 random effects
    verbose = 0
)

# Fit 3-RE model (more flexible)
fit3 <- fit_demand_hurdle(
    data = apt,
    y_var = "y",
    x_var = "x",
    id_var = "id",
    random_effects = c("zeros", "q0", "alpha"), # 3 random effects
    verbose = 1
)

Interpreting Output

# View summary
summary(fit2)
#> 
#> Two-Part Mixed Effects Hurdle Demand Model
#> ============================================
#> 
#> Call:
#> fit_demand_hurdle(data = apt, y_var = "y", x_var = "x", id_var = "id", 
#>     random_effects = c("zeros", "q0"), verbose = 0)
#> 
#> Convergence: Yes 
#> Number of subjects: 10 
#> Number of observations: 160 
#> Random effects: 2 (zeros, q0) 
#> 
#> Fixed Effects:
#> --------------
#>              Estimate Std. Error t value
#> beta0      -392.08454  146.77234  -2.671
#> beta1       135.79413   50.43547   2.692
#> log_q0        1.93813    0.11712  16.548
#> log_k         0.55505    0.09013   6.158
#> log_alpha    -2.31478    0.16822 -13.760
#> logsigma_a    5.76791    1.48290   3.890
#> logsigma_b   -1.04114    0.22837  -4.559
#> logsigma_e   -1.63184    0.06063 -26.914
#> rho_ab_raw   -0.19191    0.23755  -0.808
#> 
#> Variance Components:
#> --------------------
#>           Estimate  Std. Error
#> alpha       0.0988      0.0166
#> k           1.7420      0.1570
#> var_a  102316.8751 303451.6066
#> var_b       0.1246      0.0569
#> cov_ab    -21.4107     12.6867
#> var_e       0.0382      0.0046
#> 
#> Correlations:
#> -------------
#>        Estimate Std. Error
#> rho_ab  -0.1896      0.229
#> 
#> Model Fit:
#> ----------
#>   Log-likelihood: 2.31
#>   AIC: 13.38
#>   BIC: 41.06
#> 
#> Demand Metrics (Group-Level):
#> -----------------------------
#>   Pmax (price at max expenditure): 20.0000
#>   Omax (max expenditure): 30.9810
#>   Q at Pmax: 1.5491
#>   Elasticity at Pmax: -0.4772
#>   Method: numerical_optimize_observed_domain
#> 
#> Derived Parameters (Individual-Level Summary):
#> ----------------------------------------------
#>   Q0 (Intensity):
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#>   3.622   5.496   7.585   7.365   8.544  11.623 
#>   Alpha:
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#> 0.09879 0.09879 0.09879 0.09879 0.09879 0.09879 
#>   Breakpoint:
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#>   7.591  13.549  16.183  17.986  21.796  34.419 
#>   Pmax:
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#>      20      20      20      20      20      20 
#>   Omax:
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#>   16.15   24.51   33.83   32.85   38.11   51.85

# Extract coefficients
coef(fit2)
#>         beta0         beta1    logsigma_a    logsigma_b    logsigma_e 
#> -392.08454145  135.79413071    5.76791495   -1.04113637   -1.63183778 
#>    rho_ab_raw            Q0         alpha             k 
#>   -0.19191226    6.94572417    0.09878808    1.74202557

# Standardized tidy summaries
tidy(fit2) |> head()
#> # A tibble: 6 × 9
#>   term        estimate std.error statistic  p.value component     estimate_scale
#>   <chr>          <dbl>     <dbl>     <dbl>    <dbl> <chr>         <chr>         
#> 1 beta0      -392.      147.         -2.67 7.55e- 3 zero_probabi… logit         
#> 2 beta1       136.       50.4         2.69 7.09e- 3 zero_probabi… logit         
#> 3 Q0            6.95      0.813       8.54 1.36e-17 consumption   natural       
#> 4 k             1.74      0.157      11.1  1.32e-28 consumption   natural       
#> 5 alpha         0.0988    0.0166      5.94 2.77e- 9 consumption   natural       
#> 6 logsigma_a    5.77      1.48        3.89 1.00e- 4 variance      natural       
#> # ℹ 2 more variables: term_display <chr>, estimate_internal <dbl>
glance(fit2)
#> # A tibble: 1 × 9
#>   model_class   backend  nobs n_subjects n_random_effects converged logLik   AIC
#>   <chr>         <chr>   <int>      <int>            <int> <lgl>      <dbl> <dbl>
#> 1 beezdemand_h… TMB       160         10                2 TRUE        2.31  13.4
#> # ℹ 1 more variable: BIC <dbl>

# Get subject-specific parameters
head(get_subject_pars(fit2))
#>    id       a_i        b_i        Q0      alpha breakpoint Pmax     Omax
#> 1  19 -88.44338  0.5148913 11.623368 0.09878808  34.419426   20 51.84539
#> 2  30 -21.33118 -0.6512299  3.621529 0.09878808  20.997058   20 16.15363
#> 3  38  40.35752 -0.2348862  5.491712 0.09878808  13.330757   20 24.49548
#> 4  60  31.73723  0.1663613  8.202899 0.09878808  14.204505   20 36.58857
#> 5  68  31.19353  0.4228981 10.601806 0.09878808  14.261495   20 47.28877
#> 6 106 116.82473 -0.2317221  5.509116 0.09878808   7.590564   20 24.57311

Diagnostics

Use check_demand_model() and the plotting helpers for quick post-fit checks.

check_demand_model(fit2)
#> 
#> Model Diagnostics
#> ================================================== 
#> Model class: beezdemand_hurdle 
#> 
#> Convergence:
#>   Status: Converged
#> 
#> Random Effects:
#> 
#> Residuals:
#>   Mean: 5.392e-05
#>   SD: 0.1896
#>   Range: [-0.6119, 0.4986]
#>   Outliers: 2 observations
#> 
#> --------------------------------------------------
#> Issues Detected (1):
#>   1. Detected 2 potential outliers (|resid| > 3)
#> 
#> Recommendations:
#>   - Investigate outlying observations
plot_residuals(fit2)$fitted

plot_qq(fit2)

Understanding Results

Fixed Effects

Parameter	Interpretation
`beta0`	Part I intercept: baseline log-odds of zero consumption
`beta1`	Part I slope: change in log-odds per unit increase in log(price)
`logQ0`	Log of intensity parameter (population average)
`k`	Scaling parameter for demand decay
`alpha`	Elasticity parameter (population average for 2-RE, mean for 3-RE)

Subject-Specific Parameters

The subject_pars data frame contains:

Parameter	Description
`a_i`	Random effect for Part I (zeros probability)
`b_i`	Random effect for Part II (intensity)
`c_i`	Random effect for alpha (3-RE model only)
`Q0`	Individual intensity: \(\exp(\log Q_0 + b_i)\)
`alpha`	Individual elasticity: \(\alpha + c_i\) (or just \(\alpha\) for 2-RE)
`breakpoint`	Price where P(zero) = 0.5: \(\exp(-(\beta_0 + a_i) / \beta_1) - \epsilon\)
`Pmax`	Price at maximum expenditure
`Omax`	Maximum expenditure

Model Selection: 2-RE vs 3-RE

When to Use Each

2-RE model: When you believe elasticity (\(\alpha\)) is relatively constant across subjects
3-RE model: When you believe elasticity varies meaningfully between subjects

Likelihood Ratio Test

# Compare nested models
compare_hurdle_models(fit3, fit2)

# Unified model comparison (AIC/BIC + LRT when appropriate)
compare_models(fit3, fit2)

# Output:
# Likelihood Ratio Test
# =====================
#            Model n_RE    LogLik df     AIC     BIC
#    Full (3 RE)    3 -1234.56 12 2493.12 2543.21
# Reduced (2 RE)    2 -1245.78  9 2509.56 2547.89
#
# LR statistic: 22.44
# df: 3
# p-value: 5.2e-05

A significant p-value suggests the 3-RE model provides a better fit.

Visualization

# Population demand curve
plot(fit2, type = "demand")

Population demand curve from 2-RE hurdle model.

# Probability of zero consumption
plot(fit2, type = "probability")

Probability of zero consumption as a function of price.

# Distribution of individual parameters
plot(fit2, type = "parameters")
plot(fit2, type = "parameters", parameters = c("Q0", "alpha", "Pmax"))

# Individual demand curves
plot(fit2, type = "individual", subjects = c("1", "2", "3", "4"))

# Single subject with observed data
plot_subject(fit2, subject_id = "1", show_data = TRUE, show_population = TRUE)

Simulation and Validation

Simulating Data

The simulate_hurdle_data() function generates data from the hurdle model:

# Simulate with default parameters
sim_data <- simulate_hurdle_data(
    n_subjects = 100,
    seed = 123
)

head(sim_data)
#   id    x         y delta       a_i        b_i
# 1  1 0.00 12.345678     0 -0.523456  0.1234567
# 2  1 0.50 11.234567     0 -0.523456  0.1234567
# ...

# Custom parameters
sim_custom <- simulate_hurdle_data(
    n_subjects = 100,
    logQ0 = log(15), # Q0 = 15
    alpha = 0.1, # Lower elasticity
    k = 3, # Higher k (ensures Pmax exists)
    stop_at_zero = FALSE, # All prices for all subjects
    seed = 456
)

Monte Carlo Simulation Studies

The run_hurdle_monte_carlo() function assesses model performance through simulation.

Note: Monte Carlo simulations are computationally intensive and not run during vignette building. The example below shows typical usage and expected output format.

# Run Monte Carlo study
mc_results <- run_hurdle_monte_carlo(
    n_sim = 100, # Number of simulations
    n_subjects = 100, # Subjects per simulation
    n_random_effects = 2, # 2-RE model
    verbose = TRUE,
    seed = 123
)

# View summary
print_mc_summary(mc_results)

# Monte Carlo Simulation Summary
# ==============================
#
# Simulations: 100 attempted, 95 converged (95.0%)
#
#    Parameter True Mean_Est   Bias Rel_Bias% Emp_SE Mean_SE SE_Ratio Coverage_95%  N
#        beta0 -2.0    -2.01  -0.01      0.5   0.12    0.11     0.92         94.7 95
#        beta1  1.0     1.02   0.02      2.0   0.08    0.08     1.00         95.8 95
#        logQ0  2.3     2.29  -0.01     -0.4   0.05    0.05     1.00         94.7 95
#            k  2.0     2.03   0.03      1.5   0.15    0.14     0.93         93.7 95
#        alpha  0.5     0.51   0.01      2.0   0.04    0.04     1.00         95.8 95
# ...

Interpreting Monte Carlo Results

Metric	Target	Interpretation
Bias	~0	Estimates should be unbiased
Relative Bias	< 5%	Acceptable bias relative to true value
SE Ratio	~1.0	Model SEs match empirical variability
Coverage 95%	~95%	CIs should contain true value 95% of time

Integration with beezdemand Workflow

Combining with Other Analyses

# Fit hurdle model
hurdle_fit <- fit_demand_hurdle(data,
    y_var = "y", x_var = "x", id_var = "id",
    random_effects = c("zeros", "q0"), verbose = 0
)

# Extract subject parameters
hurdle_pars <- get_subject_pars(hurdle_fit)

# These can be merged with other analyses
# e.g., correlate with individual characteristics
cor(hurdle_pars$Q0, subject_characteristics$age)
cor(hurdle_pars$breakpoint, subject_characteristics$dependence_score)

Exporting Results

# Subject parameters
write.csv(get_subject_pars(hurdle_fit), "hurdle_subject_parameters.csv")

# Summary statistics
summ <- get_hurdle_param_summary(hurdle_fit)
print(summ)

Technical Details

TMB Backend

The hurdle model is implemented using Template Model Builder (TMB), which provides:

Efficient Laplace approximation for marginal likelihood
Automatic differentiation for fast optimization
Standard errors via the delta method

Control Parameters

fit <- fit_demand_hurdle(
    data,
    y_var = "y",
    x_var = "x",
    id_var = "id",
    epsilon = 0.001, # Constant for log(price + epsilon)
    tmb_control = list(
        max_iter = 200, # Maximum iterations
        eval_max = 1000, # Maximum function evaluations
        trace = 0 # Optimization trace level
    ),
    verbose = 1 # 0 = silent, 1 = progress, 2 = detailed
)

Custom Starting Values

For difficult optimization problems:

custom_starts <- list(
    beta0 = -3.0,
    beta1 = 1.5,
    logQ0 = log(8),
    k = 2.5,
    alpha = 0.1,
    logsigma_a = 0.5,
    logsigma_b = -0.5,
    logsigma_e = -1.0,
    rho_ab_raw = 0
)

fit <- fit_demand_hurdle(data,
    y_var = "y", x_var = "x", id_var = "id",
    start_values = custom_starts, verbose = 1
)

References

Zhao, T., Luo, X., Chu, H., Le, C.T., Epstein, L.H. and Thomas, J.L. (2016), A two-part mixed effects model for cigarette purchase task data. Jrnl Exper Analysis Behavior, 106: 242-253. https://doi.org/10.1002/jeab.228

Hursh, S. R., & Silberberg, A. (2008). Economic demand and essential value. Psychological Review, 115(1), 186-198.

id	x	y	group
19	0.0	10	a
19	0.5	10	a
19	1.0	10	a
19	1.5	8	a
19	2.0	8	a
19	2.5	8	a
19	3.0	7	a
19	4.0	7	a
19	5.0	7	a
19	6.0	6	a

id	x	y	group
19	0.0	10	a
19	0.5	10	a
19	1.0	10	a
19	1.5	8	a
19	2.0	8	a
19	2.5	8	a
19	3.0	7	a
19	4.0	7	a
19	5.0	7	a
19	6.0	6	a

Hurdle Demand Models

Brent Kaplan

2026-03-02

Introduction

When to Use Hurdle Models vs. Standard Models

Model Specification

Part I: Binary Model (Probability of Zero Consumption)

Part II: Continuous Model (Consumption Given Positive)

Random Effects Structure

Getting Started

Data Requirements

Basic Model Fitting

Interpreting Output

Diagnostics

Understanding Results

Fixed Effects

Subject-Specific Parameters

Model Selection: 2-RE vs 3-RE

When to Use Each

Likelihood Ratio Test

Visualization

Simulation and Validation

Simulating Data

Monte Carlo Simulation Studies

Interpreting Monte Carlo Results

Integration with beezdemand Workflow

Combining with Other Analyses

Exporting Results

Technical Details

TMB Backend

Control Parameters

Custom Starting Values

References

See Also

id	x	y	group
19	0.0	10	a
19	0.5	10	a
19	1.0	10	a
19	1.5	8	a
19	2.0	8	a
19	2.5	8	a
19	3.0	7	a
19	4.0	7	a
19	5.0	7	a
19	6.0	6	a