
Double RKHS PLS (rkhs_xy): Theory and Usage

Frédéric Bertrand

Cedric, Cnam, Paris
frederic.bertrand@lecnam.net

2025-11-26

Overview

We implement a double RKHS variant of PLS, in which both the input and the output spaces are endowed with reproducing kernels, yielding training Gram matrices \(K_X\) and \(K_Y\).

We use centered Grams \(\tilde K_X = H K_X H\) and \(\tilde K_Y = H K_Y H\), where \(H = I - \frac{1}{n}\mathbf{1}\mathbf{1}^\top\).
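As a concrete illustration, here is a minimal base-R sketch of this step on toy data; rbf_gram and center_gram are illustrative helper names, not part of the bigPLSR API.

set.seed(1)
X <- matrix(rnorm(40 * 3), 40, 3)   # toy inputs
Y <- cbind(sin(X[, 1]), X[, 2]^2)   # toy responses

rbf_gram <- function(X, Z = X, gamma = 0.5) {
  # ||x - z||^2 expanded as x'x + z'z - 2 x'z, then exponentiated
  d2 <- outer(rowSums(X^2), rowSums(Z^2), "+") - 2 * tcrossprod(X, Z)
  exp(-gamma * d2)
}
center_gram <- function(K) {
  # K_tilde = H K H with H = I - (1/n) 1 1'
  n <- nrow(K)
  H <- diag(n) - matrix(1 / n, n, n)
  H %*% K %*% H
}

Kx   <- rbf_gram(X, gamma = 0.5)   # RBF Gram on the inputs
Ky   <- tcrossprod(Y)              # linear Gram on the outputs
Kx_t <- center_gram(Kx)
Ky_t <- center_gram(Ky)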

Operator and Latent Directions

Following the spirit of Kernel PLS Regression II (IEEE TNNLS, 2019), we avoid explicit square roots and form the SPD surrogate operator \[ \mathcal{M} \, v = (K_X+\lambda_x I)^{-1} \; K_X \; K_Y \; K_X \; (K_X+\lambda_x I)^{-1} \, v, \] with small ridge \(\lambda_x > 0\) for stability. We compute the first \(A\) orthonormal latent directions \(T = [t_1,\dots,t_A]\) via power iteration with Gram–Schmidt orthogonalization on \(\mathcal{M}\).
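A minimal sketch of this step, reusing the centered Grams Kx_t and Ky_t from the sketch above; it illustrates the surrogate-operator power iteration and is not the package's internal implementation.

latent_directions <- function(Kx_t, Ky_t, ncomp, lambda_x = 1e-6, n_iter = 100) {
  n <- nrow(Kx_t)
  R <- solve(Kx_t + lambda_x * diag(n))     # (K_X + lambda_x I)^{-1}
  M <- R %*% Kx_t %*% Ky_t %*% Kx_t %*% R   # SPD surrogate operator
  Tmat <- matrix(0, n, ncomp)
  for (a in seq_len(ncomp)) {
    v <- rnorm(n)
    for (it in seq_len(n_iter)) {
      v <- M %*% v
      if (a > 1) {                          # Gram-Schmidt against earlier directions
        Tprev <- Tmat[, seq_len(a - 1), drop = FALSE]
        v <- v - Tprev %*% crossprod(Tprev, v)
      }
      v <- v / sqrt(sum(v^2))               # normalise
    }
    Tmat[, a] <- v
  }
  Tmat
}

Tmat <- latent_directions(Kx_t, Ky_t, ncomp = 2)
round(crossprod(Tmat), 6)                   # approximately the identity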

We then solve a small regression in the latent space: \[ C = (T^\top T)^{-1} (T^\top \tilde Y), \qquad \tilde Y = Y - \mathbf{1} \bar y^\top, \] and form dual coefficients \[ \alpha \;=\; U \, C, \qquad U \;=\; (K_X+\lambda_x I)^{-1} T, \] so that training predictions satisfy \[ \hat Y \;=\; \tilde K_X \, \alpha + \mathbf{1}\,\bar y^\top . \]
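Continuing the sketch (Tmat, Kx_t and Y as above), the latent regression, the dual coefficients and the fitted values take only a few lines of base R:

y_bar <- colMeans(Y)
Y_t   <- sweep(Y, 2, y_bar)                             # \tilde Y = Y - 1 ybar'
C     <- solve(crossprod(Tmat), crossprod(Tmat, Y_t))   # (T'T)^{-1} T' \tilde Y
U     <- solve(Kx_t + 1e-6 * diag(nrow(Kx_t)), Tmat)    # (K_X + lambda_x I)^{-1} T
alpha <- U %*% C                                        # dual coefficients
Yhat  <- Kx_t %*% alpha + matrix(y_bar, nrow(Kx_t), length(y_bar), byrow = TRUE)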

Centering for Prediction

Given new inputs \(X_*\), define the cross-Gram \[ K_* = K(X_*, X) . \] To apply the training centering to \(K_*\), use \[ \tilde K_* \;=\; K_* \;-\; \mathbf{1}_*\, \bar k_X^\top \;-\; \bar k_* \mathbf{1}^\top \;+\; \mu_X\, \mathbf{1}_* \mathbf{1}^\top, \] where:

- \(\bar k_X = \frac{1}{n} K_X \mathbf{1}\) is the column-mean vector of the (uncentered) training Gram,
- \(\mu_X = \frac{1}{n^2} \mathbf{1}^\top K_X \mathbf{1}\) is its grand mean,
- \(\bar k_* = \frac{1}{n} K_* \mathbf{1}\) is the vector of row means of \(K_*\) (computed at prediction time).

Predictions then follow the familiar dual form: \[ \hat Y_* \;=\; \tilde K_* \, \alpha + \mathbf{1}_* \, \bar y^\top . \]
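The corresponding sketch, assuming the uncentered training Gram Kx, the rbf_gram helper, and alpha and y_bar from the sketches above (the packaged predict() method handles this centering for you):

predict_dual <- function(K_star, Kx, alpha, y_bar) {
  n_star <- nrow(K_star)
  k_bar  <- colMeans(Kx)              # \bar k_X: column means of the training Gram
  mu_x   <- mean(Kx)                  # grand mean \mu_X
  k_star <- rowMeans(K_star)          # \bar k_*: row means of the cross-Gram
  K_t <- K_star -
    matrix(k_bar, n_star, ncol(K_star), byrow = TRUE) -  # 1_* kbar_X'
    matrix(k_star, n_star, ncol(K_star)) +               # kbar_* 1'
    mu_x
  K_t %*% alpha + matrix(y_bar, n_star, length(y_bar), byrow = TRUE)
}

Xnew     <- matrix(rnorm(10 * ncol(X)), 10, ncol(X))
K_star   <- rbf_gram(Xnew, X, gamma = 0.5)
Yhat_new <- predict_dual(K_star, Kx, alpha, y_bar)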

Practical Notes

Minimal Example

library(bigPLSR)
set.seed(42)
n <- 60; p <- 6; m <- 2
X <- matrix(rnorm(n * p), n, p)
Y <- cbind(sin(X[,1]) + 0.4 * X[,2]^2,
           cos(X[,3]) - 0.3 * X[,4]^2) + matrix(rnorm(n*m, sd=.05), n, m)

op <- options(
  bigPLSR.rkhs_xy.kernel_x = "rbf",
  bigPLSR.rkhs_xy.gamma_x  = 0.5,
  bigPLSR.rkhs_xy.kernel_y = "linear",
  bigPLSR.rkhs_xy.lambda_x = 1e-6,
  bigPLSR.rkhs_xy.lambda_y = 1e-6
)
on.exit(options(op), add = TRUE)  # restores options inside a function; at the top level, call options(op) when done

fit <- pls_fit(X, Y, ncomp = 3, algorithm = "rkhs_xy", backend = "arma")
Yhat <- predict(fit, X)
mean((Y - Yhat)^2)
#> [1] 2.619847e-12
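
Out-of-sample prediction uses the same predict() method; the new inputs below are arbitrary, and the comment on the result shape is an expectation rather than vignette output.

Xnew     <- matrix(rnorm(20 * p), 20, p)  # fresh inputs with the same p columns
Yhat_new <- predict(fit, Xnew)            # expected n_new x m matrix of predictions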

References

• Rosipal & Trejo (2001). Kernel Partial Least Squares Regression in Reproducing Kernel Hilbert Space. JMLR 2:97–123. doi:10.5555/944733.944741.
• Kernel PLS Regression II: Kernel Partial Least Squares Regression by Projecting Both Independent and Dependent Variables into Reproducing Kernel Hilbert Space. IEEE TNNLS (2019). doi:10.1109/TNNLS.2019.2932014.
