The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.
The ksm package provides functionalities for the kernel
density estimation for random symmetric positive definite matrices. It
provides C++ wrappers for estimating the density, with additional
functionalities for estimating the optimal bandwidth according to the
least square cross validation (LSCV) and the leave-one-out cross (LCV)
validation criteria. Three kernels are implemented: the Gaussian
(smnorm), log-Gaussian (smlnorm) and Wishart
(Wishart).
set.seed(2025)
library(ksm)
#>
#> Attaching package: 'ksm'
#> The following object is masked from 'package:stats':
#>
#> rWishart
d <- 4L
# Equicorrelation matrix
S <- diag(rep(0.5, d)) + matrix(0.5, d, d)
# Generate simulated data from an inverse Wishart (d by d by n array).
samp <- rinvWishart(n = 100, df = 10, S)
dim(samp)
#> [1] 4 4 100
# Optimize the bandwidth for kernel using leave-one-out cross validation
b <- bandwidth_optim(
x = samp,
kernel = "smlnorm",
criterion = "lcv")
# Evaluate kernel with a new psd matrix data point
newsamp <- rinvWishart(n = 2, df = 10, S = S)
# Kernel density (default to log scale)
kdens_symmat(
x = newsamp,
xs = samp,
kernel = "smlnorm",
b = b)
#> [1] 15.710544 3.460046We can also consider stationary data and exclude observations at lag \(h\) to account for serial dependence. This is illustrated below with a time series of realized variance covariance matrix between two assets over 250 days.
data(realvar, package = "ksm")
dim(realvar)
#> [1] 2 2 250
# Select suitable lag for least square cross validation
h <- ceiling(dim(realvar)[3]^0.25)
# Calculate optimal bandwidth
bopt <- bandwidth_optim(
x = realvar,
kernel = "Wishart",
criterion = "lscv",
h = h)
# Evaluate at multiple values of bandwidth
bseq <- seq(0.5 * bopt, 2 * bopt, length.out = 101)
lscv <- ksm::lscv_kdens_symmat(
x = realvar,
b = bseq,
h = h,
kernel = "Wishart")
with(lscv,
plot(x = b,
y = lscv,
type = "l",
panel.first = {
abline(v = bopt, lty = 2, col = "grey")
},
xlab = "bandwidth",
ylab = "least square cross validation"
)
)We can also make integration in dimensions \(d \in \{2, 3\}\) to evaluate numerically
the performance of kernels on simulated data. The function
integrate_spd performs a transformation of inputs for
numerical integration with routines from the cubature
package. This is useful for checking correct implementation: below, we
check that the density of a Wishart integrates to unity over the space
of positive definite matrices.
Different methods for integration are provided: we recommend in
particular cuhre, suave or
hcubature. The lb argument controls the
smaller value for the eigenvalues, which sometimes returns matrices that
are close to being non-positive definite. If such an error mistake
arises, consider setting lb to a small value.
# Check that Wishart density integrates to unity
(int <- integrate_spd(
dim = 2L,
neval = 1e5L,
f = function(x, S){
dWishart(x, df = 10, S = S, log = FALSE)},
S = diag(2),lb = 0, method = "hcubature"))
#> $integral
#> [1] 0.9999433
#>
#> $error
#> [1] 0.0009986894
#>
#> $neval
#> [1] 30723
#>
#> $returnCode
#> [1] 0
isTRUE(all.equal(int$integral, 1, tol = 2*int$error))
#> [1] TRUEThe main use for the integrate_spd function is to
calculate the integrated squared error for data simulated from some
density. We do this here for different models that generate independent
and identically distributed matrices from a mixture of Wishart
densities. The function here relies on predefined models from functions
simu_rdens and simu_fdens used in the
simulation studies of the paper for the independent and identically
distributed scenario.
# Integrated squared error
ISE <- function(
S,
x,
bandwidth,
model = 1:6,
kernel = c("Wishart", "smlnorm", "smnorm"),
...
) {
model <- match.arg(as.character(model), choices = 1:6)
kernel <- match.arg(kernel)
dim <- ncol(x)
# Compute the squared difference between the kernel and the target density
(
ksm::kdens_symmat(
x = S,
xs = x,
b = bandwidth,
kernel = kernel,
log = FALSE
) -
ksm::simu_fdens(S, model = model, d = dim)
)^2
}
# Calculate numerical integral
d <- 2L
xs <- simu_rdens(n = 100, model = 2, d = d)
b <- bandwidth_optim(
x = xs,
criterion = "lcv",
kernel = "smnorm")
# Computing ISE
ISE(
S = xs[,,1, drop = FALSE],
x = xs,
bandwidth = b,
model = 2,
kernel = "smnorm")
#> [1] 8.525459e-05
# Next integrate over the space of psd matrices
ISE_int <- try(
ksm::integrate_spd(
f = function(S) {
ISE(
S,
x = xs,
kernel = "smnorm", # (only for lcv)
bandwidth = b,
model = 2
)
},
dim = d,
method = "hcubature",
lb = 0,
ub = Inf,
neval = 1e6L
),
silent = TRUE
)
ISE_int
#> $integral
#> [1] 0.002425163
#>
#> $error
#> [1] 2.42419e-06
#>
#> $neval
#> [1] 66561
#>
#> $returnCode
#> [1] 0These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.