
Get started with mixqr

Kailas Venkitasubramanian

mixqr is an extensible framework for finite mixtures of quantile (and expectile) regressions: at its core it finds hidden subgroups in your data and fits a separate quantile regression in each. This page is a five-minute tour of that core; the Tutorial is the full guide, and the Extensions article covers the expectile/M-quantile families, penalized selection, and non-crossing multi-quantile estimation built on the same platform.
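
For readers new to quantile regression, the following base-R sketch shows the check (pinball) loss that a quantile regression minimises at level tau, and confirms numerically that the constant minimising it is the sample quantile. It does not use mixqr; `check_loss`, `best_constant` and the simulated `y` are illustrative names chosen here, not part of the package.

```r
# Check (pinball) loss: residuals above the fit are weighted by tau,
# residuals below it by (1 - tau).
check_loss <- function(u, tau) u * (tau - (u < 0))

set.seed(1)
y <- rexp(1000)   # skewed toy sample

# Constant that minimises the total check loss at a given tau.
best_constant <- function(tau) {
  optimize(function(q) sum(check_loss(y - q, tau)),
           interval = range(y))$minimum
}

q50 <- best_constant(0.5)   # should sit close to median(y)
q90 <- best_constant(0.9)   # should sit close to quantile(y, 0.9)
```

Replacing the constant with a linear predictor gives a quantile regression; a mixture of such regressions lets each hidden subgroup have its own coefficients.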

library(mixqr)

A two-regime example

The engine data (Brinkman 1981) record the equivalence ratio (richness of the air/fuel mix) and the nitrogen-oxide (NOx) concentration in the exhaust of a test engine. A single regression line fits these data badly, because the observations fall into two regimes.

fit <- mixqr(equivalence ~ nox, data = engine, tau = 0.5, m = 2,
             variance = "stochEM")
fit
#> Mixture of quantile regressions (mixqr)
#>   engine: ald   tau = 0.5   components m = 2   n = 88
#>   converged: TRUE in 19 iterations
#> 
#> Mixing probabilities (pi):
#>  comp1  comp2 
#> 0.5081 0.4919 
#> 
#> Component coefficients (beta):
#>               comp1  comp2
#> (Intercept)  1.2428 0.5568
#> nox         -0.0835 0.0909
#> 
#> logLik = 113.995   AIC = -213.99   BIC = -196.65
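
The information criteria in the printout can be checked by hand. The sketch below assumes the usual definitions, AIC = -2 logLik + 2k and BIC = -2 logLik + k log(n), and backs out the parameter count k from the printed values. The resulting k = 7 is consistent with four regression coefficients, one free mixing probability and two component scales, although that breakdown is an assumption rather than something the printout states.

```r
loglik <- 113.995   # logLik from the printout
n      <- 88        # sample size from the printout
aic    <- -213.99   # AIC from the printout

# AIC = -2 * logLik + 2 * k, so the implied number of free parameters is
k <- round((aic + 2 * loglik) / 2)

# BIC = -2 * logLik + k * log(n)
bic <- -2 * loglik + k * log(n)
round(bic, 2)   # agrees with the printed BIC of -196.65
```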

mixqr() has jointly (i) split the observations into two groups and (ii) estimated a median regression in each. summary() adds standard errors:

summary(fit)
#> Mixture of quantile regressions (mixqr) -- summary
#>   engine: ald   tau = 0.5   m = 2   n = 88
#> 
#> Component 1  (pi = 0.5081):
#>              Estimate   Std.Err z value Pr(>|z|)    
#> (Intercept)  1.242800  0.012233  101.59   <2e-16 ***
#> nox         -0.083498  0.006812  -12.26   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Component 2  (pi = 0.4919):
#>             Estimate Std.Err z value Pr(>|z|)    
#> (Intercept)  0.55682 0.02401  23.191   <2e-16 ***
#> nox          0.09091 0.01044   8.706   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Missing-information fraction (separability): 0.156
#> Responsibility overlap (0 = separated, 1 = overlapping): 0.129
#> 
#> logLik = 113.995 (ALD working likelihood)   AIC = -213.99   BIC = -196.65
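
The `z value` and `Pr(>|z|)` columns read like ordinary Wald statistics. As a sanity check under that assumption, the sketch below recomputes them from the printed estimates and standard errors and adds approximate 95% normal intervals. The numbers are copied from the summary above, so results match the printout only up to rounding.

```r
# Estimates and standard errors copied from the summary printout
est <- c(1.242800, -0.083498, 0.55682, 0.09091)
se  <- c(0.012233,  0.006812, 0.02401, 0.01044)
names(est) <- c("comp1:(Intercept)", "comp1:nox",
                "comp2:(Intercept)", "comp2:nox")

z <- est / se              # Wald statistic: estimate over standard error
p <- 2 * pnorm(-abs(z))    # two-sided p-value from the standard normal

# Approximate 95% Wald intervals
ci <- cbind(lower = est - qnorm(0.975) * se,
            upper = est + qnorm(0.975) * se)

round(z, 2)
```

The two `nox` intervals lie on opposite sides of zero, which reflects the opposite slopes of the two regimes.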

A first picture

A little ggplot2 shows the two recovered regimes and their median lines.

library(ggplot2)

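# Hard-assign each observation to its predicted regime
dat <- transform(engine, regime = factor(predict(fit, type = "class")))
# Evenly spaced nox grid on which to draw the fitted lines
grid <- data.frame(nox = seq(min(engine$nox), max(engine$nox), length.out = 100))
# Fitted median line per component: design matrix times that component's coefficients
lines <- do.call(rbind, lapply(1:2, function(j) {
  data.frame(nox = grid$nox,
             equivalence = cbind(1, grid$nox) %*% fit$beta[, j],
             regime = factor(j))
}))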

ggplot(dat, aes(nox, equivalence, colour = regime)) +
  geom_point(size = 2, alpha = 0.8) +
  geom_line(data = lines, linewidth = 1.1) +
  scale_colour_manual(values = c("#1b6ca8", "#e07b39")) +
  labs(x = "Nitrogen oxides (NOx)", y = "Equivalence ratio",
       title = "Two median regimes recovered by mixqr") +
  theme_minimal(base_size = 12)

Engine data coloured by recovered regime with two median regression lines.
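
The colouring relies on hard class labels. One common way to obtain such labels from a mixture fit is to assign each observation to the component with the largest posterior responsibility. The base-R sketch below illustrates that rule with an asymmetric Laplace (ALD) working density. It is an assumption about the general technique, not a description of what `predict()` does internally. The coefficients and mixing weights are rounded from the printout, while the scale values `sigma` and the three observations are made up for illustration.

```r
# Asymmetric Laplace density with location mu, scale sigma and level tau
dald <- function(y, mu, sigma, tau) {
  u <- (y - mu) / sigma
  tau * (1 - tau) / sigma * exp(-u * (tau - (u < 0)))
}

# Coefficients and mixing weights rounded from the printout
beta <- cbind(comp1 = c(1.2428, -0.0835), comp2 = c(0.5568, 0.0909))
pi_k <- c(0.5081, 0.4919)
tau  <- 0.5
sigma <- c(0.03, 0.03)     # hypothetical scales, not taken from the fit

# Three made-up observations
nox <- c(1.0, 2.0, 3.0)
eq  <- c(1.16, 0.75, 0.99)

mu   <- cbind(1, nox) %*% beta    # component medians, one column per regime
dens <- sapply(1:2, function(j) pi_k[j] * dald(eq, mu[, j], sigma[j], tau))
resp <- dens / rowSums(dens)      # posterior responsibilities (rows sum to 1)
cls  <- max.col(resp, ties.method = "first")   # most probable component
```

Observations lying near one fitted line and far from the other get responsibilities close to 0 or 1, so their labels are unambiguous; points near the crossing of the two lines are the ones where hard assignment is least certain.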

Where to next

citation("mixqr")

References

Brinkman, N. D. 1981. “Ethanol Fuel – a Single-Cylinder Engine Study of Efficiency and Exhaust Emissions.” SAE Transactions 90: 1410–27.
