The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.
Exact KRLS solves a Tikhonov problem on the full \(n \times n\) Gaussian kernel. Time is \(O(n^3)\) and memory is \(O(n^2)\), so the practical ceiling on a laptop is around \(n \approx 5{,}000\). For larger samples the kernel allocation alone becomes prohibitive.
The Nystrom approximation replaces the full kernel with a low-rank representation anchored at \(m\) landmark points. The fitted model is \(\hat f(x) = K(x, Z)\, \alpha\) where \(Z\) are the landmarks and \(\alpha\) is a length-\(m\) coefficient vector solved in \(m\)-dimensional space. Time becomes \(O(n m^2 + m^3)\) and memory \(O(n m)\), so KRLS can now handle sample sizes well past the exact ceiling.
KRLS exposes this as an explicit approximation:
Default landmark count is min(500, n). Inference (when
vcov = TRUE) is conditional approximate – standard
errors condition on the selected landmarks, fixed \(\lambda\), and the low-rank feature map;
they are not equivalent to exact KRLS inference.
We simulate a smooth function in three predictors plus a binary indicator, and compare exact KRLS to Nystrom across modest \(n\).
make_data <- function(n) {
X <- cbind(rnorm(n), rnorm(n), rnorm(n), rbinom(n, 1, 0.4))
colnames(X) <- c("x1", "x2", "x3", "x4")
f <- sin(X[, 1]) + 0.3 * X[, 2] + 0.05 * X[, 3]^2 + 0.5 * X[, 4]
list(X = X, y = f + rnorm(n, sd = 0.3), truth = f)
}For each \(n\) we fit both paths, record wall-clock time, in-sample \(R^2\), and prediction RMSE on a held-out test set drawn from the same distribution. Predictions are evaluated against the true mean function (no noise on the test target) to focus on approximation error.
sizes <- c(500, 1000)
res <- vector("list", length(sizes))
for (i in seq_along(sizes)) {
n <- sizes[i]
tr <- make_data(n)
te <- make_data(200)
# Use the package default landmark count, min(500, n).
m <- min(500L, n)
t_e <- system.time({
fit_e <- krls(tr$X, tr$y, approx = "none",
derivative = FALSE, vcov = FALSE,
print.level = 0)
})["elapsed"]
t_n <- system.time({
fit_n <- krls(tr$X, tr$y, approx = "nystrom",
derivative = FALSE, print.level = 0)
})["elapsed"]
pred_e <- predict(fit_e, newdata = te$X)$fit
pred_n <- predict(fit_n, newdata = te$X)$fit
res[[i]] <- data.frame(
n = n,
m = m,
time_exact = round(as.numeric(t_e), 2),
time_nys = round(as.numeric(t_n), 3),
speedup = round(as.numeric(t_e) / as.numeric(t_n), 1),
rmse_exact = round(sqrt(mean((pred_e - te$truth)^2)), 4),
rmse_nys = round(sqrt(mean((pred_n - te$truth)^2)), 4)
)
}
#> Warning: krls: the following column(s) look categorical but were not listed in cat_columns:
#> x4 (2 unique values)
#> KRLS does NOT autodetect categoricals. Pass these in cat_columns to get the sqrt(0.5)-scaled one-hot encoding (matching kbal / GPSS), or pass cat_columns = character(0) / integer(0) to silence this warning.
#> Warning: krls: the following column(s) look categorical but were not listed in cat_columns:
#> x4 (2 unique values)
#> KRLS does NOT autodetect categoricals. Pass these in cat_columns to get the sqrt(0.5)-scaled one-hot encoding (matching kbal / GPSS), or pass cat_columns = character(0) / integer(0) to silence this warning.
#> Warning: krls: the following column(s) look categorical but were not listed in cat_columns:
#> x4 (2 unique values)
#> KRLS does NOT autodetect categoricals. Pass these in cat_columns to get the sqrt(0.5)-scaled one-hot encoding (matching kbal / GPSS), or pass cat_columns = character(0) / integer(0) to silence this warning.
#> Warning: krls: the following column(s) look categorical but were not listed in cat_columns:
#> x4 (2 unique values)
#> KRLS does NOT autodetect categoricals. Pass these in cat_columns to get the sqrt(0.5)-scaled one-hot encoding (matching kbal / GPSS), or pass cat_columns = character(0) / integer(0) to silence this warning.
do.call(rbind, res)
#> n m time_exact time_nys speedup rmse_exact rmse_nys
#> 1 500 500 0.84 0.568 1.5 0.1119 0.1119
#> 2 1000 500 6.26 1.187 5.3 0.0885 0.0885The Nystrom path stays in milliseconds while exact KRLS grows cubically. Prediction RMSE on held-out data is comparable – Nystrom trades a small amount of approximation error for a large amount of runtime.
Past \(n \approx 2{,}000\) the exact path becomes uncomfortable on a typical laptop. Nystrom keeps going:
n <- 2000
tr <- make_data(n)
te <- make_data(200)
# Default is min(500, n) -- here 500 landmarks at n = 2000.
m <- min(500L, n)
t_n <- system.time({
fit_big <- krls(tr$X, tr$y, approx = "nystrom",
derivative = FALSE, print.level = 0)
})["elapsed"]
#> Warning: krls: the following column(s) look categorical but were not listed in cat_columns:
#> x4 (2 unique values)
#> KRLS does NOT autodetect categoricals. Pass these in cat_columns to get the sqrt(0.5)-scaled one-hot encoding (matching kbal / GPSS), or pass cat_columns = character(0) / integer(0) to silence this warning.
pred_big <- predict(fit_big, newdata = te$X)$fit
sprintf("n = %d, m = %d, time = %.3fs, R2 = %.3f, RMSE = %.3f",
n, m, t_n, fit_big$R2,
sqrt(mean((pred_big - te$truth)^2)))
#> [1] "n = 2000, m = 500, time = 2.436s, R2 = 0.862, RMSE = 0.075"The fit still recovers the truth at held-out RMSE comparable to the exact path on the smaller datasets, in a fraction of a second.
When running sensitivity checks – refitting on the same predictors
with several outcome variants, or perturbing \(y\) – it is convenient to fix the landmark
set so the approximation geometry is held constant. The fitted object
stores landmarks in standardized \(X\)-space; to pass them back through
krls() use get_landmarks(), which
un-standardizes to the original units the function expects:
tr <- make_data(400)
fit <- krls(tr$X, tr$y, approx = "nystrom", nystrom_m = 30,
derivative = FALSE, print.level = 0)
#> Warning: krls: the following column(s) look categorical but were not listed in cat_columns:
#> x4 (2 unique values)
#> KRLS does NOT autodetect categoricals. Pass these in cat_columns to get the sqrt(0.5)-scaled one-hot encoding (matching kbal / GPSS), or pass cat_columns = character(0) / integer(0) to silence this warning.
Z <- get_landmarks(fit) # original X-scale matrix
y_perturbed <- tr$y + rnorm(400, sd = 0.05)
fit2 <- krls(tr$X, y_perturbed, approx = "nystrom",
landmarks = Z, derivative = FALSE, print.level = 0)
#> Warning: krls: the following column(s) look categorical but were not listed in cat_columns:
#> x4 (2 unique values)
#> KRLS does NOT autodetect categoricals. Pass these in cat_columns to get the sqrt(0.5)-scaled one-hot encoding (matching kbal / GPSS), or pass cat_columns = character(0) / integer(0) to silence this warning.
# Same landmarks under the hood
identical(unname(fit$landmarks), unname(fit2$landmarks))
#> [1] TRUEThe landmark coordinates are bit-identical, so any difference between the two fits is attributable to the change in \(y\) alone – not to re-selecting anchor points.
KRLS picks the regularization parameter by minimizing a closed-form objective. Two are available:
lambda_method = "loo" (the default) – leave-one-out
squared error.lambda_method = "gcv" – generalized cross-validation,
\(RSS(\lambda) / (1 -
\mathrm{tr}(S(\lambda))/n)^2\).For the Nystrom path both are evaluated in \(O(nm)\) per candidate \(\lambda\) from the cached SVD, so the choice is essentially free.
tr <- make_data(500)
fit_loo <- krls(tr$X, tr$y, approx = "nystrom", nystrom_m = 25,
derivative = FALSE, lambda_method = "loo",
print.level = 0)
#> Warning: krls: the following column(s) look categorical but were not listed in cat_columns:
#> x4 (2 unique values)
#> KRLS does NOT autodetect categoricals. Pass these in cat_columns to get the sqrt(0.5)-scaled one-hot encoding (matching kbal / GPSS), or pass cat_columns = character(0) / integer(0) to silence this warning.
fit_gcv <- krls(tr$X, tr$y, approx = "nystrom", nystrom_m = 25,
derivative = FALSE, lambda_method = "gcv",
print.level = 0)
#> Warning: krls: the following column(s) look categorical but were not listed in cat_columns:
#> x4 (2 unique values)
#> KRLS does NOT autodetect categoricals. Pass these in cat_columns to get the sqrt(0.5)-scaled one-hot encoding (matching kbal / GPSS), or pass cat_columns = character(0) / integer(0) to silence this warning.
data.frame(
method = c("loo", "gcv"),
lambda = c(fit_loo$lambda, fit_gcv$lambda),
R2 = c(fit_loo$R2, fit_gcv$R2)
)
#> method lambda R2
#> 1 loo 0.1983744 0.8341407
#> 2 gcv 0.3202462 0.8134584LOO and GCV typically pick lambdas within a small multiplicative factor of each other. GCV is sometimes preferred when one or two observations have disproportionate leverage; LOO has more historical use in KRLS applications.
A Nystrom fit is still a krls object, so the standard
accessors work:
predict(fit, newdata = ...) – predictions; pass
se.fit = TRUE for conditional approximate standard errors
when the fit was built with vcov = TRUE.summary(fit) – now prints an Approximation
block reporting m, the landmark method, inference type, the
landmark-kernel condition number, and the count of eigenvalues at the
relative-ridge floor.tidy(fit), glance(fit),
augment(fit) – broom methods are Nystrom-aware.
glance() reports approx,
nystrom_m, inference, and the effective
degrees of freedom.approx = "nystrom".kmeans landmark selection is approximately reproducible
under set.seed() but not bit-stable across R versions; for
strict reproducibility use the default random sampling or supply
landmarks explicitly.These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.