For an application of this estimator, see Vignette “Code - Note 1”.
The MME is described in detail in Section 4.1 of Asenova, Mazo, and Segers (2021). The idea is to find \((\theta_e, e\in E)\) that minimizes the distance between the empirical and the theoretical covariance matrices (a code sketch follows the notation list below): \[\begin{equation} \hat{\theta}^{\mathrm{MM}}_{n,k} = \arg\min_{\theta\in(0,\infty)^{E}} \sum_{u\in U} \| \hat{\Sigma}_{W_u, u}-\Sigma_{W_u,u}(\theta) \|_F^2\, , \end{equation}\] where

- \(n\) is the total number of observations in the sample;
- \(k\) is the number of upper order statistics used in the estimation;
- \(u\) is the node on which we condition through the event \(\{X_u>t\}\);
- \(\| \cdot \|_F\) is the Frobenius norm;
- \(U\subseteq V\) is the set of nodes with observable variables;
- \(W_u\) is a subset of the node set depending on \(u\): typically a neighborhood of \(u\), the set of nodes flow-connected to \(u\), or the intersection of both. Note that the graph induced on \(W_u\) must be connected. Good practice is to compose the sets so that within each subset all parameters are uniquely identifiable; this means that every node in \(W_u\) with a latent variable should be connected to at least three other nodes in the same set \(W_u\);
- \(\hat{\Sigma}_{W_u, u}\) is the non-parametric (empirical) covariance matrix;
- \(\Sigma_{W_u,u}(\theta)\) is the parametric covariance matrix.
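To make the criterion concrete, here is a minimal sketch in R, not the package's implementation; `emp_cov()` and `par_cov()` are hypothetical helpers standing in for the empirical and parametric matrices constructed in the remainder of this section.

```r
## A minimal sketch of the least-squares criterion above; emp_cov(u) and
## par_cov(u, theta) are hypothetical helpers returning the empirical and
## parametric covariance matrices for node u (constructed further below).
mme_objective <- function(theta, U) {
  sum(vapply(U, function(u) {
    D <- emp_cov(u) - par_cov(u, theta)
    sum(D^2)                              # squared Frobenius norm
  }, numeric(1)))
}
## Minimize over theta in (0, Inf)^E, e.g. with box-constrained optim()
## (n_edges is a placeholder for the number of edges |E|):
## theta_hat <- optim(rep(1, n_edges), mme_objective, U = U,
##                    method = "L-BFGS-B", lower = 1e-8)$par
```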
For fixed \(u\) and \(W_u\), the parametric matrix \(\Sigma_{W_u,u}\) is given by \[\begin{equation} \label{eq:hrdist} \big(\Sigma_{W_u,u}(\Lambda)\big)_{ij} = 2(\lambda_{iu}^2 + \lambda_{ju}^2 - \lambda^2_{ij}), \qquad i,j\in W_u\setminus u, \end{equation}\] with \[\begin{equation} \big(\Lambda(\theta)\big)_{ij} = \lambda^2_{ij}(\theta) = \frac{1}{4}\sum_{e \in p(i,j)} \theta_e^2\, , \qquad i,j\in V, \ i \ne j, \end{equation}\] where \(p(i,j)\) denotes the set of edges on the path between nodes \(i\) and \(j\). (See also the parameterization used for block graphs in Vignette “Introduction”.)
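For illustration, a minimal sketch, not the package's code, of \(\Lambda(\theta)\) and \(\Sigma_{W_u,u}(\theta)\) on a small path tree \(1 - 2 - 3\); the edge parameters `theta` are hypothetical values.

```r
## A minimal sketch on the path tree 1 -- 2 -- 3 with edges
## e1 = {1,2}, e2 = {2,3}; theta is a hypothetical parameter vector.
theta <- c(e1 = 0.8, e2 = 1.2)
## Edges on the path p(i, j) between each pair of nodes:
pairs <- list(c(1, 2), c(1, 3), c(2, 3))
edges <- list("e1", c("e1", "e2"), "e2")
## Lambda(theta): lambda^2_ij = (1/4) * sum of theta_e^2 over p(i, j)
lambda2 <- matrix(0, 3, 3)
for (m in seq_along(pairs)) {
  i <- pairs[[m]][1]; j <- pairs[[m]][2]
  lambda2[i, j] <- lambda2[j, i] <- sum(theta[edges[[m]]]^2) / 4
}
## Sigma_{W_u, u}(theta) for u = 1 and W_u = {1, 2, 3}:
u <- 1
W <- setdiff(1:3, u)
Sigma_theta <- matrix(0, length(W), length(W), dimnames = list(W, W))
for (a in seq_along(W)) for (b in seq_along(W)) {
  i <- W[a]; j <- W[b]
  Sigma_theta[a, b] <- 2 * (lambda2[i, u] + lambda2[j, u] - lambda2[i, j])
}
```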
If the sample of the original variables is \(\xi_{v,i},\ v\in U,\ i=1,\ldots, n\), consider the transformation based on the empirical cumulative distribution function \(\hat{F}_{v,n}(x)=\big[\sum_{i=1}^n\mathbb{1}(\xi_{v,i}\leq x)\big]/(n+1)\):
\[\begin{equation*} \hat{X}_{v,i} = \frac{1}{1-\hat{F}_{v,n}(\xi_{v,i})}, \qquad v \in U, \quad i = 1, \ldots, n. \end{equation*}\]
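A minimal sketch of this rank transform in R, using simulated placeholder data in place of the observed sample \(\xi\):

```r
## A minimal sketch of the transform to the standard Pareto scale;
## xi is placeholder data standing in for the observed sample.
set.seed(42)
n  <- 500
xi <- matrix(rexp(n * 3), ncol = 3)      # hypothetical sample, 3 variables
## For continuous data, rank(x)[i] = sum(1(x <= x[i])), so
## Fhat_{v,n}(xi_{v,i}) = rank / (n + 1) and Xhat = 1 / (1 - Fhat)
Xhat <- apply(xi, 2, function(x) 1 / (1 - rank(x) / (n + 1)))
```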
Fix \(u\) and \(W_u\). For a given \(k\in \{1,\ldots, n\}\), consider the set of indices \[ I_{u} = \{i = 1,\ldots,n: \hat{X}_{u,i} > n/k\}. \]
For every \(v\in W_u\setminus u\) and \(i\in I_u\), compute the differences \[\begin{equation} \Delta_{uv,i} = \ln\hat{X}_{v,i}-\ln\hat{X}_{u,i}. \end{equation}\]
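Continuing the sketch above, the exceedance indices \(I_u\) and the differences \(\Delta_{uv,i}\) can be computed as follows; \(u\), \(W_u\) and \(k\) are illustrative choices, not recommendations.

```r
## Continuing the sketch: exceedances of node u and the log-differences.
u <- 1; W <- c(2, 3); k <- 50
I_u   <- which(Xhat[, u] > n / k)                # indices i with Xhat_{u,i} > n/k
Delta <- log(Xhat[I_u, W, drop = FALSE]) -
  log(Xhat[I_u, u])                              # one column per v in W_u \ {u}
```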
The vector of means of these differences is given by \[\begin{equation*} \hat{\mu}_{W_u,u} = \frac{1}{|I_u|}\sum_{i\in I_u}(\Delta_{uv,i}, v\in W_u \setminus u). \end{equation*}\]
The non-parametric covariance matrix \(\hat{\Sigma}_{W_u,u}\) is given by
\[\begin{equation*} \hat{\Sigma}_{W_u,u} = \frac{1}{|I_u|}\sum_{i\in I_u}(\Delta_{uv,i}-\hat{\mu}_{W_u,u}, v\in W_u\setminus u) (\Delta_{uv,i}-\hat{\mu}_{W_u,u}, v\in W_u\setminus u)^\top\, . \end{equation*}\]
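In R, continuing from `Delta` above, \(\hat{\mu}_{W_u,u}\) and \(\hat{\Sigma}_{W_u,u}\) become:

```r
## Continuing the sketch: mean vector and empirical covariance matrix,
## normalized by |I_u| (not |I_u| - 1), matching the display above.
mu_hat    <- colMeans(Delta)
centred   <- sweep(Delta, 2, mu_hat)     # subtract mu_hat from every row
Sigma_hat <- crossprod(centred) / length(I_u)
```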
Estimators of this type for \(\hat{\mu}\) and \(\hat{\Sigma}\) have been suggested in Engelke et al. (2015).
Asenova, Stefka, Gildas Mazo, and Johan Segers. 2021. “Inference on Extremal Dependence in the Domain of Attraction of a Structured Hüsler–Reiss Distribution Motivated by a Markov Tree with Latent Variables.” Extremes. https://doi.org/10.1007/s10687-021-00407-5.
Engelke, S., A. Malinowski, Z. Kabluchko, and M. Schlather. 2015. “Estimation of Hüsler–Reiss Distributions and Brown–Resnick Processes.” Journal of the Royal Statistical Society: Series B 77: 239–65.