Quickstart: kernel-regularized least squares with KRLS

The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

Quickstart: kernel-regularized least squares with KRLS

Jens Hainmueller and Chad Hazlett

July 2026

This vignette is the five-minute version: how to fit, inspect, and plot a kernel-regularized least squares (KRLS) regression. For the methodological background and pedagogy, see the explainer on the project site.

Why KRLS?

KRLS estimates a flexible regression surface without assuming linearity, additivity, or pre-specified interactions, and it returns pointwise marginal effects — one partial derivative per observation — with closed-form analytical standard errors. That’s the unique contribution: where OLS gives you one slope per predictor, KRLS gives you a distribution of slopes across the covariate space.

library(KRLS)

A simulated example

Build a dataset where the marginal effect of x1 varies across the covariate space. OLS will average it down to a single slope and miss the heterogeneity entirely.

set.seed(1)
n  <- 200
x1 <- runif(n, -2, 2)
x2 <- runif(n, -2, 2)
# True surface: nonlinear in x1, modulated by x2.
y  <- sin(x1) + 0.5 * x2 * (x1 > 0) + rnorm(n, sd = 0.3)
df <- data.frame(y = y, x1 = x1, x2 = x2)

Fit

The formula interface is new in KRLS 1.2-0:

fit <- krls(y ~ x1 + x2, data = df, print.level = 0)

The matrix interface — krls(X = cbind(x1, x2), y = y) — works exactly the same way and is preserved unchanged from earlier releases.

Tidy output

library(generics)
#> 
#> Attaching package: 'generics'
#> The following objects are masked from 'package:base':
#> 
#>     as.difftime, as.factor, as.ordered, intersect, is.element, setdiff,
#>     setequal, union
tidy(fit)
#>   term  estimate  std.error statistic      p.value  conf.low conf.high
#> 1   x1 0.5184107 0.02599911 19.939556 3.091001e-49 0.4671400 0.5696814
#> 2   x2 0.2611639 0.02853130  9.153593 6.901213e-17 0.2048997 0.3174282
#>          q25    median       q75 binary
#> 1 0.19724353 0.4375873 0.7898796  FALSE
#> 2 0.07743713 0.2845457 0.4579881  FALSE
glance(fit)
#>   nobs n_predictors r.squared  loo_mse    lambda   sigma   eff_df approx
#> 1  200            2 0.8755594 657.6741 0.4049892 2.45606 14.25015   none
#>   nystrom_m inference
#> 1        NA     exact

tidy() returns one row per predictor with the average marginal effect (estimate), its analytical standard error, a 95% confidence interval, and quartiles of the pointwise derivatives.

That last block is the headline diagnostic. The AME of x1 averages \(\sin'(x_1) = \cos(x_1)\) across a symmetric interval and so is close to zero. But the pointwise marginal effects range from roughly −1 to +1 — exactly the heterogeneity the simulation built in. The quartiles tell that story; OLS would not.

glance() is a one-row summary of the fit: nobs, n_predictors, r.squared, leave-one-out MSE, the regularization parameter lambda, the kernel bandwidth sigma, and the effective degrees of freedom from the kernel ridge.

Visualize

autoplot() shows the per-predictor distribution of pointwise marginal effects with the AME overlaid in blue:

library(ggplot2)
autoplot(fit)

The base-graphics plot(fit) is the same idea, with optional “profile” plots that hold one predictor fixed and vary another.

Augment for downstream analysis

augment() joins fitted values, residuals, and pointwise marginal-effect columns back to the original data — useful when you want to feed the results into another model or visualization:

aug <- augment(fit, data = df)
head(aug, 3)
#>             y         x1          x2    .fitted    .resid  .dy_d_x1   .dy_d_x2
#> 1 -0.68353586 -0.9379653 -0.92996717 -0.8343403 0.1508045 0.5742612 0.05434422
#> 2  0.01717233 -0.5115044 -1.12541886 -0.5805310 0.5977033 0.6435138 0.11918619
#> 3  0.79687657  0.2914135  0.06718735  0.4516485 0.3452281 0.9391226 0.35127167

The columns .dy_d_x1 and .dy_d_x2 carry the pointwise derivatives, one row per observation, ready for dplyr / ggplot2 / data.table workflows.

When KRLS is the right tool

You want a flexible regression surface but still need pointwise and average marginal effects for substantive interpretation.
Sample size is moderate (\(n \lesssim 5{,}000\)); KRLS solves an \(n \times n\) kernel system, so memory and time scale as \(O(n^3)\). For very large problems, see the explainer for alternatives.
You can standardize predictors so the Gaussian kernel’s bandwidth is sensible across them.

References

Hainmueller, J., and Hazlett, C. (2014). Kernel regularized least squares: Reducing misspecification bias with a flexible and interpretable machine learning approach. Political Analysis, 22(2), 143-168. doi:10.1093/pan/mpt019
Ferwerda, J., Hainmueller, J., and Hazlett, C. (2017). Kernel-Based Regularized Least Squares in R (KRLS) and Stata. Journal of Statistical Software, 79(3), 1-26.

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.

Quickstart: kernel-regularized least squares with `KRLS`