The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

The MMD copula package: robust estimation of parametric copula models by MMD minimization

library(MMDCopula)
library(VineCopula)
set.seed(1)

Simulation of a synthetic dataset

We simulate 500 points from a Gaussian copula with parameter \(0.5\). From this data, the parameter of the Gaussian copula is estimated, by Canonical Maximum Likelihood Estimation (MLE) and by Maximum Mean Discrepancy Minimization (MMD).


my_data = BiCopSim(N = 500, family = 1, par = 0.5)

estimator_MLE = BiCopEst(u1 = my_data[,1], u2 = my_data[,2],
                         family = 1, method = "mle")

estimator_MMD = BiCopEstMMD(u1 = my_data[,1], u2 = my_data[,2], family = 1)

print(estimator_MLE)
#> Bivariate copula: Gaussian (par = 0.5, tau = 0.33)
print(estimator_MMD)
#> Bivariate copula: Gaussian (par = 0.49, tau = 0.33)

Both estimators are close to the true value, and the MLE estimator is slightly closer.

Adding contamination

We now add a contamination on 20 points of the dataset that are replaced by outliers in the top-left corner of the unit square \([0 , 0.001] \times [0.999, 1]\).


my_data_contam = my_data

number_outliers = 20
q = 0.001
my_data_contam[1:number_outliers, 1] = runif(n = number_outliers, min = 0, max = q)
my_data_contam[1:number_outliers, 2] = runif(n = number_outliers, min = 1-q, max = 1)

estimator_MLE = BiCopEst(u1 = my_data_contam[,1], u2 = my_data_contam[,2],
                         family = 1, method = "mle")

estimator_MMD = BiCopEstMMD(u1 = my_data_contam[,1], u2 = my_data_contam[,2], family = 1)
print(estimator_MLE)
#> Bivariate copula: Gaussian (par = 0.03, tau = 0.02)
print(estimator_MMD)
#> Bivariate copula: Gaussian (par = 0.46, tau = 0.3)

We can see that now the MLE estimator is very far away from the true value while the MMD estimators is still near the true value (even if it is farther away than in the uncontaminated case).

Constructing confidence intervals for the MMD estimator

We now compare the confidence intervals for the estimated parameter in the contaminated and in the non-contaminated case. These intervals are constructed using either the nonparametric bootstrap or using subsampling.


# With the nonparametric bootstrap
CI_bootstrap = BiCopConfIntMMD(x1 = my_data[,1], x2 = my_data[,2],
                              family = 1, level = 0.95)

# With the subsampling
CI_subsampling = BiCopConfIntMMD(x1 = my_data[,1], x2 = my_data[,2],
                              family = 1, level = 0.95, subsamplingSize = 100)

print(CI_bootstrap$CI.Par)
#>       2.5      97.5 
#> 0.4081262 0.5701880
print(CI_subsampling$CI.Par)
#>       2.5      97.5 
#> 0.3526828 0.7005315

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.