DCSmooth

Bastian Schaefer

06.07.2021

1 Introduction

This vignette describes the use of the DCSmooth package and its functions dcs, set.options, surface.dcs, qarma.est and qarma.sim. The package provides tools for nonparametric estimation of the mean surface \(m\) from an observed sample of a function \[y(x, t) = m(x, t) + \varepsilon(x, t).\] The package accompanies Schäfer and Feng (2021). Boundary modification procedures for local polynomial smoothing are considered by Feng and Schäfer (2021). For a more comprehensive and detailed description of the algorithms than the one in section 4, please refer to these papers.

The DCSmooth package contains the following functions, methods and data sets:

Functions
set.options() Define options for the dcs()-function.
dcs() Nonparametric estimation of the expectation function of a matrix Y. Includes automatic iterative plug-in bandwidth selection.
surface.dcs() Plot the surface from a "dcs"-object.
qarma.sim() Simulate a QARMA-model.
qarma.est() Estimate the parameters of a QARMA-model.
Methods/Generics
summary.dcs() Summary statistics for an object of class "dcs".
print.dcs() Print an object of class "dcs".
plot.dcs() Plot method for a "dcs"-object, returns a contour plot.
residuals.dcs() Returns the residuals of the regression from an "dcs"-object.
print.summary_dcs() Print an object of class "summary_dcs", which inherits from summary.dcs().
print.set_options() Prints an object of class "dcs_options", which inherits from set.options()
Data
y.norm1 A surface with a single Gaussian peak.
y.norm2 A surface with two Gaussian peaks.
y.norm3 A surface with two Gaussian ridges.

2 Details of Functions

2.1 Functions

set.options

This auxiliary function simplifies the settings of the dcs function. An object of class dcs_options is created and should be used as dcs_options-argument in the dcs function.

Arguments of set.options are

  • type Specifies the regression type. Supported methods are kernel regression ("KR") and local polynomial regression ("LP"), which is the default value.
  • kerns A character vector of length 2 stating the identifiers of the kernels to use in each dimension. The first element corresponds to the smoothing conditional on rows, the second to the smoothing conditional on columns. The identifiers are of the form \(X\_k\mu\nu\), where \(X\) indicates the smoothing method to use, either one of M, MW or T. The value \(k\) is the kernel order, \(\mu\) is the smoothness degree and \(\nu\) the derivative estimated by the kernel. For more information on the kernels see section 4.3; a list of available kernels is given in A.1. The default kernels are "MW_220" for both dimensions.
  • drv Derivative \((\nu_x, \nu_t)\) of \(m(x,t)\) to be estimated. Note that \(k \geq \nu + 2\), hence, only the regression surface (\(\nu = 0\) as default value) and the first and second derivatives can be estimated from the currently available kernels.
  • var_est Specifies the model assumption for the errors/innovations \(\varepsilon(x, t)\) in the regression model. Currently available are iid (independently identically distributed, set as default), qarma (spatial ARMA process) and lm (spatial FARIMA process with long memory). Methods for automated order selection are available for the QARMA estimation via qarma_gpac or qarma_bic. For further information see section 4.4.
  • IPI_options Advanced options for tuning the parameters of the iterative plug-in algorithm of the bandwidth selection. These options include 2-element vectors for the inflation parameters (infl_par), the inflation exponents (infl_exp) and a boundary shrinking parameter for stabilized estimation of the necessary derivatives (delta). Another option to further stabilize estimation of derivatives at the boundaries is the use of a constant estimation window at the boundaries setting the boolean variable const_window to TRUE. The default values for the IPI-options depend partly on the regression type selected and are given below.
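As a reading aid for the identifier scheme described under kerns, the following base R sketch splits an identifier such as "MW_220" into its parts and checks the constraint \(k \geq \nu + 2\) stated under drv. The helper parse_kernel_id is hypothetical and not part of DCSmooth; it assumes single-digit values for \(k\), \(\mu\) and \(\nu\), as in all identifiers listed in A.1.

```r
# Hypothetical helper (not part of DCSmooth): split a kernel identifier
# of the form X_k mu nu, e.g. "MW_220", into its components.
parse_kernel_id <- function(id) {
  parts <- strsplit(id, "_", fixed = TRUE)[[1]]
  stopifnot(length(parts) == 2, nchar(parts[2]) == 3)
  digits <- as.integer(strsplit(parts[2], "")[[1]])
  list(method = parts[1], k = digits[1], mu = digits[2], nu = digits[3])
}

kern <- parse_kernel_id("MW_220")
str(kern)

# Constraint from the text: the kernel order must satisfy k >= nu + 2
kern$k >= kern$nu + 2
```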

set.options returns an object of class "dcs_options" including the following values

  • type Inherited from input.
  • kerns Inherited from input.
  • drv Inherited from input.
  • p_order A numeric vector of length 2, computed from drv. It is \(p_k = \nu_k + 1, k = x, t\).
  • var_est Inherited from input.
  • IPI_options Options for the iterative plug-in algorithm for bandwidth selection. If unchanged, values are set conditional on type (see default values for KR and LP below).

Every argument of the set.options function has a default value. Hence, calling set.options() without arguments produces a complete set of options for double conditional smoothing regression in dcs (these are also the default options in dcs if the argument dcs_options is omitted).

Default options for kernel regression (type = "KR") are

#> dcs_options 
#> --------------------------------------- 
#> options for DCS   rows    cols 
#> --------------------------------------- 
#> type: kernel regression 
#> kernels used:        MW_220  MW_220
#> derivative:      0   0
#> variance model:  iid
#> ---------------------------------------
#> IPI options:
#> inflation parameters     2   1
#> inflation exponents  0.5 0.5
#> delta            0.05    0.05
#> constant window width    FALSE
#> ---------------------------------------

Default options for local polynomial regression (type = "LP") are

#> dcs_options 
#> --------------------------------------- 
#> options for DCS   rows    cols 
#> --------------------------------------- 
#> type: kernel regression 
#> kernels used:        MW_220  MW_220
#> derivative:      0   0
#> variance model:  iid
#> ---------------------------------------
#> IPI options:
#> inflation parameters     2   1
#> inflation exponents  0.5 0.5
#> delta            0.05    0.05
#> constant window width    FALSE
#> ---------------------------------------

dcs

The dcs-function serves as main function of the package and includes IPI-bandwidth selection and non-parametric smoothing using the selected bandwidths. This function creates an object of class dcs, which includes the results of the DCS procedure.

Arguments of dcs are

  • Y The matrix of observations to be smoothed via the DCS procedure. This matrix should only contain numeric values and no missing observations. For computational reasons, Y has to have at least three rows and columns, however, for reliable results the size should be larger.
  • dcs_options The options used for the smoothing and bandwidth selection. This should be an object of class dcs_options created by set.options. This argument is optional, if omitted, all options will be set to their default values from the set.options function.
  • h Either a two-value vector of bandwidths or "auto" if bandwidth selection should be employed (the default).
  • ... Further arguments to be passed to the function. These include the covariates X and T, which should be ordered numerical vectors whose lengths match the number of rows of Y (for X) and the number of columns of Y (for T). If var_est = "qarma", the order of the \(QARMA((p_1, p_2), (q_1, q_2))\) model can be set with qarma_order = list(ar = c(1, 1), ma = c(1, 1)). If an order selection procedure is used, the maximum order can be set with order_max = list(ar = c(1, 1), ma = c(1, 1)). All orders should be non-negative integers. Orders that are too large might lead to (heavily) increased computation time.
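The requirements on Y, X and T listed above can be checked before calling dcs. The following is a minimal base R sketch; check_dcs_input is a hypothetical helper written for this illustration and not a function of the package.

```r
# Hypothetical helper (not part of DCSmooth): check the requirements
# stated above for Y and the optional covariates X and T.
check_dcs_input <- function(Y, X = NULL, T = NULL) {
  stopifnot(is.matrix(Y), is.numeric(Y), !anyNA(Y))
  stopifnot(nrow(Y) >= 3, ncol(Y) >= 3)
  if (!is.null(X)) stopifnot(is.numeric(X), length(X) == nrow(Y), !is.unsorted(X))
  if (!is.null(T)) stopifnot(is.numeric(T), length(T) == ncol(Y), !is.unsorted(T))
  invisible(TRUE)
}

Y <- matrix(rnorm(20 * 30), nrow = 20, ncol = 30)
X <- seq(0, 1, length.out = nrow(Y))  # one covariate value per row of Y
T <- seq(0, 1, length.out = ncol(Y))  # one covariate value per column of Y
check_dcs_input(Y, X, T)
```

The check stops with an error if a condition is violated, for example if X is not in ascending order.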

dcs returns an object of class "dcs" including the following values

  • X, T Vectors of covariates inherited from input.
  • Y Matrix of observations inherited from input.
  • M Matrix of smoothed values. If the argument h = "auto" is used in dcs, the bandwidths are optimised via the IPI algorithm; if h is set to fixed values, these bandwidths are used.
  • R Matrix of residuals computed from \(R = Y - M\).
  • h Bandwidth used for smoothing of Y. Either obtained by IPI or given as argument in dcs.
  • c_f The estimated variance factor used in the last iteration of the bandwidth selection algorithm. Is set to NA if no bandwidth selection is used.
  • var_model The model obtained for the innovations \(\varepsilon(x, t)\). The output depends on the model specification used in dcs_options$var_est. For "iid" it contains the estimated standard deviation of the residuals and an indicator for stationarity, which is true by assumption. For "qarma", "qarma_gpac" and "qarma_bic" it contains the estimated coefficient matrices $ar and $ma, the standard deviation $sigma as well as a stationarity indicator $stnry. For "lm" the output is similar to that of QARMA with the addition of the estimated long memory parameter vector $d_vec. Is set to NA if no bandwidth selection is used.
  • dcs_options A dcs_options object containing the options used in the function.
  • iterations An integer reporting the number of iterations of the IPI algorithm. Is set to NA if no bandwidth selection is used.
  • time_used A number reporting the time (in seconds) used for the IPI algorithm (total including all iterations). Is set to NA if no bandwidth selection is used.

surface.dcs

This function is a convenient wrapper for the plotly::plot_ly function of the plotly package, for easy display of the considered surfaces. Direct plotting is available for any object of class "dcs" or any numeric matrix.

Arguments of surface.dcs are

  • Y Either an object of class "dcs", inheriting from a call to dcs or a numeric matrix, which is then directly passed to plotly::plot_ly.
  • plot_choice Only used if Y is an object of class "dcs". Specifies the surface to be plotted: 1 for the original observations, 2 for the smoothed surface and 3 for the residual surface. If plot_choice is omitted and Y is a dcs-object, a choice dialogue with the same options is shown in the console.
  • ... Further arguments to be passed to the plotly::plot_ly function.

surface.dcs returns an object of class "plotly".

qarma.sim

Simulation of a top-left dependent spatial ARMA process (QARMA). This function returns an object of class qarma with attribute "subclass" = "sim". The simulated innovations are created from a normal distribution with specified variance \(\sigma^2\). This function uses a burn-in period for more consistent results.

Arguments of qarma.sim are

  • n_x, n_t The dimension of the resulting matrix of observations, where n_x specifies the number of rows and n_t the number of columns. Initially, a matrix of \(2 n_x \times 2 n_t\) is simulated, of which simulation points with \(i \leq n_x\) or \(j \leq n_t\) are discarded (burn-in period).
  • model A list containing the model parameters to be used in the simulation, of the form list(ar, ma, sigma). The values ar and ma are matrices of size \((p_x + 1) \times (p_t + 1)\) and \((q_x + 1) \times (q_t + 1)\), respectively, containing the coefficients in ascending lag order, so that the upper left entry is equal to 1 (for lag 0 in both dimensions). See the examples in section 3.2. The standard deviation of the iid. innovations with zero mean is sigma, which should be a single positive number.

qarma.sim returns an object of class qarma with attribute "subclass" = "sim" including the following values:

  • Y The matrix of simulated values with size \(n_x \times n_t\) (determined by function arguments n_x, n_t). The matrix \(Y\) is the lower right \(n_x \times n_t\) submatrix of the actually simulated matrix \(Y'\) of size \(2 n_x \times 2 n_t\), which avoids effects from setting the initial values (“burn-in period”).
  • innov The \(n_x \times n_t\) matrix of iid. normally distributed innovations/errors of the QARMA model with zero mean and variance \(\sigma^2\) (determined by the function argument model$sigma). As with Y, the original matrix has size \(2 n_x \times 2 n_t\).
  • model The model used for simulation, inherited from input.
  • stnry Indicator variable (boolean) for stationarity of the process.
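The simulation scheme described above (simulate on a \(2 n_x \times 2 n_t\) grid, then discard the burn-in part) can be sketched in base R as follows. This is not the implementation used by qarma.sim: the function sim_qarma_sketch is hypothetical, it starts from zero initial values, and it assumes the sign convention of the reduced process equation in section 4.4, where the current value is the sum of lagged values weighted by ar and of current and lagged innovations weighted by ma.

```r
# Hypothetical sketch (not the package's implementation) of a top-left
# dependent QARMA simulation with burn-in.
sim_qarma_sketch <- function(n_x, n_t, ar, ma, sigma) {
  N_x <- 2 * n_x
  N_t <- 2 * n_t
  p <- dim(ar) - 1                      # AR order (p_x, p_t)
  q <- dim(ma) - 1                      # MA order (q_x, q_t)
  innov <- matrix(rnorm(N_x * N_t, sd = sigma), N_x, N_t)
  Y <- matrix(0, N_x, N_t)              # zero initial values (assumption)
  ar0 <- ar
  ar0[1, 1] <- 0                        # lag (0, 0) is the current value itself
  start_x <- max(p[1], q[1]) + 1
  start_t <- max(p[2], q[2]) + 1
  for (i in start_x:N_x) {
    for (j in start_t:N_t) {
      # entry [m + 1, n + 1] of the coefficient matrix meets lag (m, n)
      ar_part <- sum(ar0 * Y[i - 0:p[1], j - 0:p[2], drop = FALSE])
      ma_part <- sum(ma * innov[i - 0:q[1], j - 0:q[2], drop = FALSE])
      Y[i, j] <- ar_part + ma_part
    }
  }
  keep_x <- (n_x + 1):N_x               # discard points with i <= n_x
  keep_t <- (n_t + 1):N_t               # discard points with j <= n_t
  list(Y = Y[keep_x, keep_t], innov = innov[keep_x, keep_t])
}

set.seed(42)
ar_mat <- matrix(c(1, 0.4, -0.3, 0.2), nrow = 2, ncol = 2)
ma_mat <- matrix(c(1, 0.2, 0.2, -0.5), nrow = 2, ncol = 2)
sim <- sim_qarma_sketch(20, 30, ar_mat, ma_mat, sigma = 0.5)
dim(sim$Y)
```

With ar and ma both equal to the \(1 \times 1\) matrix 1, the sketch returns the innovations themselves, which is a quick sanity check of the recursion.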

qarma.est

Estimation of a top-left dependent spatial ARMA process (QARMA). This function uses a variant of the Hannan-Rissanen algorithm for estimation of the coefficient matrices of a QARMA process of a given order. It returns an object of class qarma with attribute "subclass" = "est".

Arguments of qarma.est are

  • Y A (demeaned) matrix of observations, which contains only numeric values and no missing observations.
  • model_order A list specifying the order of the QARMA to be estimated. This list should be of the form list(ar = c(1, 1), ma = c(1, 1)). Obviously, all orders should be non-negative integers. A QARMA\(((1,1), (1,1))\) model is estimated by default, if model_order is omitted.

qarma.est returns an object of class qarma with attribute "subclass" = "est" including the following values:

  • Y The matrix of observations inherited from input.
  • innov The matrix of estimated innovations from the Hannan-Rissanen algorithm.
  • model A list of estimated model coefficients containing the matrix ar of autoregressive coefficients, the matrix ma of moving average coefficients as well as the standard deviation of residuals sigma.
  • stnry Indicator variable (boolean) for stationarity of the process.

2.2 Methods

The DCSmooth package contains the following methods

Function Methods/Generics available
set.options() print, summary
dcs() plot, print, print.summary, residuals, summary

2.3 Data

There are three simulated example data sets included in the package. Each data set is a matrix of size \(101 \times 101\) computed on \([0,1]^2\) for the following functions:

# surface.dcs(y.norm1)
# surface.dcs(y.norm2)
# surface.dcs(y.norm3)
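For orientation, a surface of the same size and general shape as y.norm1 can be constructed in base R as shown below. The centre and spread of the peak are assumptions chosen for illustration; they are not the parameters used to generate the package data.

```r
# Illustrative only: centre and spread are assumptions, not the
# parameters behind y.norm1.
n <- 101
x_grid <- seq(0, 1, length.out = n)
t_grid <- seq(0, 1, length.out = n)

gauss_peak <- function(x, t, centre = c(0.5, 0.5), s = 0.1) {
  exp(-((x - centre[1])^2 + (t - centre[2])^2) / (2 * s^2))
}

# Evaluate the function on the full grid: rows follow x, columns follow t
m_peak <- outer(x_grid, t_grid, gauss_peak)
dim(m_peak)
```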

3 Application

The application of the package is demonstrated using the simulated function values y.norm1, which represent a Gaussian peak on \([0,1]^2\) with \(n_x = n_t = 101\) evaluation points. Different models are simulated and estimation using dcs is demonstrated. Default options are not passed explicitly as function arguments; options are only set where they deviate from the defaults.

Due to file size restrictions, the surface.dcs commands in this vignette are commented out. Run the complete code used in section 3 of this vignette by demo("DCS_demo", package = "DCSmooth").

3.1 Application of the DCS with iid. errors

# surface.dcs(y.norm1)

y_iid = y.norm1 + rnorm(101^2)
# surface.dcs(y_iid)

Kernel regression with iid. errors

While local linear regression has some clear advantages over kernel regression, kernel regression is the faster method. Currently, kernel regression is only available for the regression surface (\(\nu_x = \nu_t = 0\)).

opt_iid_KR = set.options(type = "KR")
dcs_iid_KR = dcs(y_iid, opt_iid_KR)

# print results
dcs_iid_KR
#> dcs 
#> -------------------------------------- 
#> DCS with automatic bandwidth selection
#> -------------------------------------- 
#>  Selected Bandwidths:
#>      h_x: 0.18855 
#>      h_t: 0.19259 
#>  Variance Factor:
#>      c_f: 0.99379 
#> --------------------------------------
# print options used for DCS procedure
dcs_iid_KR$dcs_options
#> dcs_options 
#> --------------------------------------- 
#> options for DCS  rows    cols 
#> --------------------------------------- 
#> type: kernel regression 
#> kernels used:        MW_220  MW_220
#> derivative:      0   0
#> variance model:  iid
#> ---------------------------------------

# plot regression surface
# surface.dcs(dcs_iid_KR, plot_choice = 2)

The summary of the "dcs"-object provides some more detailed information:

summary(dcs_iid_KR)
#> summary_dcs 
#> ------------------------------------------ 
#> DCS with automatic bandwidth selection:
#> ------------------------------------------ 
#> Results of kernel regression:
#> Estimated Bandwidths:     h_x:    0.1886 
#>           h_t:    0.1926 
#> Variance Factor:      c_f:    0.9938 
#> Iterations:           4 
#> Time used (seconds):      0.02497 
#> ------------------------------------------ 
#> Variance Estimation:      iid.
#> ------------------------------------------ 
#> See used parameter with "$dcs_options".
#> ------------------------------------------

Local polynomial regression with iid. errors

This is the default method, so specifying options is not necessary. Note that local polynomial regression requires the bandwidth to cover at least as many observations as the polynomial order plus one. For small bandwidths or too few observation points in one dimension, local polynomial regression might fail (“Bandwidth h must be larger for local polynomial regression.”). It is suggested to use kernel regression in this case.

dcs_LP_iid = dcs(y_iid)
dcs_LP_iid
#> dcs 
#> -------------------------------------- 
#> DCS with automatic bandwidth selection
#> -------------------------------------- 
#>  Selected Bandwidths:
#>      h_x: 0.17265 
#>      h_t: 0.19012 
#>  Variance Factor:
#>      c_f: 0.99364 
#> --------------------------------------
summary(dcs_LP_iid)
#> summary_dcs 
#> ------------------------------------------ 
#> DCS with automatic bandwidth selection:
#> ------------------------------------------ 
#> Results of local polynomial regression:
#> Estimated Bandwidths:     h_x:    0.1727 
#>           h_t:    0.1901 
#> Variance Factor:      c_f:    0.9936 
#> Iterations:           9 
#> Time used (seconds):      0.08381 
#> ------------------------------------------ 
#> Variance Estimation:      iid.
#> ------------------------------------------ 
#> See used parameter with "$dcs_options".
#> ------------------------------------------

# plot regression surface
# surface.dcs(dcs_LP_iid, plot_choice = 2)

3.2 Application of the DCS with QARMA errors

A matrix containing innovations following a QARMA\(((p_x, p_t), (q_x, q_t))\) process can be obtained by the qarma.sim function. We use the following QARMA\(((1, 1), (1, 1))\)-process as example: \[ AR = \begin{pmatrix} 1 & 0.4 \\ -0.3 & 0.2 \end{pmatrix} \ \text{ and } \ MA = \begin{pmatrix} 1 & 0.2 \\ 0.2 & -0.5 \end{pmatrix}, \quad \sigma^2 = 0.25. \]

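# note: matrix() fills column by column, so ar_mat[1, 2] = -0.3 and ar_mat[2, 1] = 0.4
ar_mat = matrix(c(1, 0.4, -0.3, 0.2), nrow = 2, ncol = 2)
ma_mat = matrix(c(1, 0.2, 0.2, -0.5), nrow = 2, ncol = 2)
sigma  = 0.5  # standard deviation, i.e. sigma^2 = 0.25
model_list = list(ar = ar_mat, ma = ma_mat, sigma = sigma)
sim_qarma = qarma.sim(n_x = 101, n_t = 101, model = model_list)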

# QARMA observations
y_qarma = y.norm1 + sim_qarma$Y
# surface.dcs(y_qarma)

Estimation of a QARMA process of a given order works with the qarma.est function (note that the simulated matrix can be accessed via $Y):

estim_qarma = qarma.est(sim_qarma$Y, 
                        model_order = list(ar = c(1, 1), ma = c(1, 1)))

estim_qarma$model
#> $ar
#>           lag 0      lag 1
#> lag 0 1.0000000 -0.3169031
#> lag 1 0.4187407  0.1661860
#> 
#> $ma
#>           lag 0      lag 1
#> lag 0 1.0000000  0.1480133
#> lag 1 0.2161155 -0.5244848
#> 
#> $sigma
#> [1] 0.4964687

Local polynomial regression with specified QARMA order

We use the dcs command with the default QARMA\(((1, 1), (1, 1))\) model (correctly specified) and with a QARMA\(((1, 1), (0, 0))\) model:

# QARMA((1, 1), (1, 1))
opt_qarma_1 = set.options(var_est = "qarma")
dcs_qarma_1 = dcs(y_qarma, opt_qarma_1)
dcs_qarma_1
#> dcs 
#> -------------------------------------- 
#> DCS with automatic bandwidth selection
#> -------------------------------------- 
#>  Selected Bandwidths:
#>      h_x: 0.1223 
#>      h_t: 0.14775 
#>  Variance Factor:
#>      c_f: 0.10348 
#> --------------------------------------
dcs_qarma_1$var_model
#> $ar
#>           lag 0      lag 1
#> lag 0 1.0000000 -0.3163055
#> lag 1 0.4163091  0.1681798
#> 
#> $ma
#>           lag 0      lag 1
#> lag 0 1.0000000  0.1441958
#> lag 1 0.2091644 -0.5297148
#> 
#> $sigma
#> [1] 0.4953101
#> 
#> $stnry
#> [1] TRUE

# QARMA((1, 1), (0, 0))
order_list = list(ar = c(1, 1), ma = c(0, 0))
dcs_qarma_2 = dcs(y_qarma, opt_qarma_1, model_order = order_list)
dcs_qarma_2
#> dcs 
#> -------------------------------------- 
#> DCS with automatic bandwidth selection
#> -------------------------------------- 
#>  Selected Bandwidths:
#>      h_x: 0.13316 
#>      h_t: 0.1605 
#>  Variance Factor:
#>      c_f: 0.15773 
#> --------------------------------------
dcs_qarma_2$var_model
#> $ar
#>           lag 0      lag 1
#> lag 0 1.0000000 -0.2834914
#> lag 1 0.2742504  0.4218404
#> 
#> $ma
#>       lag 0
#> lag 0     1
#> 
#> $sigma
#> [1] 0.561018
#> 
#> $stnry
#> [1] TRUE

3.3 Modelling errors with long memory

This package includes a bandwidth selection algorithm for the case that the errors \(\varepsilon(x, t)\) follow a process with long memory. This process is modeled as a spatial FARIMA (SFARIMA). Estimation under long-memory errors is currently in an experimental state.

opt_lm = set.options(var_est = "lm")
#> Estimation under long-memory errors (SFARIMA) is currently in experimental state.
dcs_lm = dcs(y_iid, opt_lm)

dcs_lm
#> dcs 
#> -------------------------------------- 
#> DCS with automatic bandwidth selection
#> -------------------------------------- 
#>  Selected Bandwidths:
#>      h_x: 0.38821 
#>      h_t: 0.41033 
#>  Variance Factor:
#>      c_f: 0.35761 
#> --------------------------------------
dcs_lm$var_model
#> $d_vec
#> [1] 0.06086290 0.03019642
#> 
#> $ar
#>           [,1]        [,2]
#> [1,] 1.0000000 -0.15841363
#> [2,] 0.3438966 -0.05447791
#> 
#> $ma
#>            [,1]        [,2]
#> [1,]  1.0000000  0.14754294
#> [2,] -0.4116035 -0.06072919
#> 
#> $sigma
#> [1] 1.001689
#> 
#> $stnry
#> [1] TRUE

3.4 Estimation of derivatives

Local polynomial estimation is suitable for estimation of derivatives of a function or a surface. While estimation of derivatives works as well under dependent errors, the example uses the iid. model from 3.1. Derivatives can be computed for any derivative vector drv, if the values are non-negative. Note that the order of the polynomials for the \(\nu\)th derivative is chosen to be \(p_i = \nu_i + 1, i = x, t\). As bandwidths increase with the order of the derivatives, the bandwidth might be large for higher derivative orders.
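The relation \(p_i = \nu_i + 1\) can be illustrated in one dimension. The following base R sketch fits a local polynomial of order \(p = \nu + 1\) around a point \(x_0\) and reads the derivative off the fitted coefficients. It is not the estimator implemented in dcs: the function lp_deriv is hypothetical, it uses Epanechnikov-type weights as an assumption and applies no boundary modification.

```r
# One-dimensional sketch (not the package's implementation): local polynomial
# estimate of the nu-th derivative at x0 with polynomial order p = nu + 1.
lp_deriv <- function(x, y, x0, h, nu = 0) {
  p <- nu + 1
  u <- (x - x0) / h
  w <- pmax(1 - u^2, 0)               # Epanechnikov-type weights (assumption)
  keep <- w > 0
  stopifnot(sum(keep) >= p + 1)       # the window must cover at least p + 1 points
  D <- outer(x[keep] - x0, 0:p, "^")  # columns: 1, (x - x0), ..., (x - x0)^p
  beta <- solve(crossprod(D, w[keep] * D), crossprod(D, w[keep] * y[keep]))
  factorial(nu) * beta[nu + 1]        # nu-th derivative of the local fit at x0
}

x <- seq(0, 1, length.out = 101)
y <- x^2                              # noise-free test function
lp_deriv(x, y, x0 = 0.5, h = 0.2, nu = 1)   # first derivative of x^2 at 0.5
lp_deriv(x, y, x0 = 0.5, h = 0.2, nu = 2)   # second derivative of x^2
```

Because the test function is a polynomial of degree 2, local fits of order 2 and 3 reproduce it exactly, so the two calls recover the true derivatives up to numerical error. The check on the number of points in the window mirrors the bandwidth requirement mentioned in section 3.1.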

The estimator for \(m^{(1, 0)}(x, t)\) is given by

opt_drv_1 = set.options(drv = c(1, 0))
dcs_drv_1 = dcs(y_iid, opt_drv_1)

dcs_drv_1
#> dcs 
#> -------------------------------------- 
#> DCS with automatic bandwidth selection
#> -------------------------------------- 
#>  Selected Bandwidths:
#>      h_x: 0.18591 
#>      h_t: 0.25229 
#>  Variance Factor:
#>      c_f: 0.99131 
#> --------------------------------------
# surface.dcs(dcs_drv_1, plot_choice = 2)

The estimator for \(m^{(1, 2)}(x, t)\) is

opt_drv_2 = set.options(drv = c(1, 2))
dcs_drv_2 = dcs(y_iid, opt_drv_2)

dcs_drv_2
#> dcs 
#> -------------------------------------- 
#> DCS with automatic bandwidth selection
#> -------------------------------------- 
#>  Selected Bandwidths:
#>      h_x: 0.32219 
#>      h_t: 0.1687 
#>  Variance Factor:
#>      c_f: 0.97036 
#> --------------------------------------
# surface.dcs(dcs_drv_2, plot_choice = 2)

4 Mathematical Background

4.1 Double Conditional Smoothing

The double conditional smoothing (DCS) is a spatial smoothing technique which effectively reduces the two-dimensional estimation to two one-dimensional estimation procedures. The DCS is defined for kernel regression as well as for local polynomial regression.

Classical bivariate (and multivariate) regression has been considered e.g. by Herrmann et al. (1995) (kernel regression) and Ruppert and Wand (1994) (local polynomial regression). The DCS now provides a faster and, especially for equidistant data, more efficient smoothing scheme, which leads to reduced computation time. For the DCS procedure implemented in this package, consider a \((n_x \times n_t)\)-matrix \(\mathbf{Y}\) of non-empty observations \(y_{i,j}\) and equidistant covariates \(X\), \(T\) on \([0,1]\), where \(X\) has length \(n_x\) and \(T\) has length \(n_t\). The model is then \[ y_{i,j} = m(x_i, t_j) + \varepsilon_{i,j} \] where \(m(x, t)\) is the mean or trend function, \(x_i \in X\), \(t_j \in T\) and \(\varepsilon\) is a random error function with zero mean. The model in matrix form is \(\mathbf{Y} = \mathbf{M}_1 + \mathbf{E}\) at the observation points.

The main assumption of the DCS is that of product kernels, i.e. the weights in the respective methods are constructed by \(K(u,v) = K_1(u) K_2(v)\). A two-stage smoother can then be constructed either from the kernel weights directly (kernel regression) or by using locally weighted regression with kernels \(K_1\), \(K_2\). In both cases, the weights are called \(\mathbf{W}_x\) and \(\mathbf{W}_t\). The DCS procedure implemented in DCSmooth smooths over rows (conditioning on \(X\)) first and then over columns (conditioning on \(T\)), although switching the smoothing order gives exactly the same result. Hence, the DCS is given by the following equations: \[ \begin{aligned} \mathbf{\widehat{M}}_0[ \ , j] &= \mathbf{Y} \cdot \mathbf{W}_t[ \ , j] \\ \mathbf{\widehat{M}}_1[i, \ ] &= \mathbf{W}_x [i, \ ] \cdot \mathbf{\widehat{M}}_0 \end{aligned} \]
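The two smoothing stages can be written as two matrix products. The base R sketch below uses simple Nadaraya-Watson-type weights without boundary modification; the biweight kernel, the bandwidths, the test surface and the helper nw_weights are assumptions made for this illustration and do not reproduce the weights computed inside dcs.

```r
# Minimal sketch of the two-stage DCS with Nadaraya-Watson-type weights.
nw_weights <- function(u, h) {
  d <- outer(u, u, "-") / h
  K <- pmax(1 - d^2, 0)^2          # biweight kernel, up to a constant
  K / rowSums(K)                   # row i holds the weights for the point u[i]
}

set.seed(1)
n_x <- 40
n_t <- 60
X <- seq(0, 1, length.out = n_x)
T <- seq(0, 1, length.out = n_t)
M <- outer(X, T, function(x, t) sin(2 * pi * x) * cos(2 * pi * t))
Y <- M + matrix(rnorm(n_x * n_t, sd = 0.5), n_x, n_t)

W_x <- nw_weights(X, h = 0.15)     # n_x x n_x, weights stored in rows
W_t <- t(nw_weights(T, h = 0.15))  # n_t x n_t, weights stored in columns

M_0 <- Y %*% W_t                   # stage 1: smooth along t within each row
M_1 <- W_x %*% M_0                 # stage 2: smooth along x within each column

# Switching the order gives the same result, since matrix products are associative
all.equal(M_1, (W_x %*% Y) %*% W_t)
```

Each stage only involves one-dimensional weights, which is the source of the computational advantage over a full bivariate smoother.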

4.2 Bandwidth Selection

The bandwidth vector \(h = (h_x, h_t)\) is selected via an iterative plug-in (IPI) algorithm (Gasser et al., 1991). The IPI selects the optimal bandwidths by minimizing the mean integrated squared error (MISE) of the estimator. As the MISE includes derivatives of the regression surface \(m(x, t)\), auxiliary bandwidths for estimation of these derivatives are calculated via an inflation method. This inflation method connects the bandwidths of \(m(x, t)\) with those of a derivative \(m^{(\nu_x, \nu_t)}(x, t)\) by \[ \widetilde{h}_k = c_k \cdot h_k^{\alpha}, \quad k = x, t \] and is called the exponential inflation method (EIM). The values of \(c_k\) are chosen based on simulations, while those of \(\alpha\) depend on the derivative of interest. The IPI starts with an initial bandwidth \(h_0\) (chosen to be \(h_0 = (0.1, 0.1)\)) and calculates in each step \(s\) the auxiliary bandwidths \(\widetilde{h}_{k,s}\) from \(h_{s-1}\), and then \(h_s\) from the derivative surfaces smoothed with \(\widetilde{h}_{k,s}\). The iteration process continues until a certain threshold is reached.
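As a worked example of the EIM formula, the following base R lines compute the auxiliary bandwidths for the initial bandwidth \(h_0 = (0.1, 0.1)\). The inflation parameters \((2, 1)\) and exponents \((0.5, 0.5)\) are taken from the default options printed in section 2.1; the function name inflate_bandwidth is hypothetical, and how the package applies these values internally is not shown here.

```r
# Exponential inflation method: h_tilde_k = c_k * h_k^alpha, k = x, t.
inflate_bandwidth <- function(h, infl_par, infl_exp) {
  infl_par * h^infl_exp
}

h_0 <- c(0.1, 0.1)   # initial bandwidths of the IPI
h_aux <- inflate_bandwidth(h_0, infl_par = c(2, 1), infl_exp = c(0.5, 0.5))
h_aux                # 2 * sqrt(0.1) and 1 * sqrt(0.1)
```

Since the bandwidths are smaller than 1 and the exponent is smaller than 1, \(h_k^{\alpha}\) is larger than \(h_k\), so the auxiliary bandwidths for the derivative estimation are larger than the bandwidths they are derived from.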

4.3 Boundary Modification

Kernel regression suffers from the boundary problem, which leads to biased estimates at the boundaries of the regression surface. This problem can (partially) be solved by means of suitable boundary kernels as introduced by Müller (1991) and Müller and Wang (1994). These boundary kernels differ in their degrees of smoothness and hence lead to different estimation results at the boundaries. However, all kernels are similar to the classical kernels in the interior region of the regression.
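The boundary problem can be made visible with a small one-dimensional experiment in base R. The sketch applies a plain Nadaraya-Watson estimate with a symmetric kernel and no boundary correction to a noise-free linear trend; the function nw_estimate and the Epanechnikov-type weights are assumptions for this illustration and not part of the package.

```r
# Nadaraya-Watson estimate with a symmetric kernel and no boundary correction
nw_estimate <- function(x, y, x0, h) {
  w <- pmax(1 - ((x - x0) / h)^2, 0)   # Epanechnikov-type weights (assumption)
  sum(w * y) / sum(w)
}

x <- seq(0, 1, length.out = 101)
y <- x                                  # noise-free linear trend m(x) = x

bias_interior <- nw_estimate(x, y, x0 = 0.5, h = 0.2) - 0.5
bias_boundary <- nw_estimate(x, y, x0 = 0,   h = 0.2) - 0
c(interior = bias_interior, boundary = bias_boundary)
```

In the interior the weights are symmetric around \(x_0\) and the linear trend is reproduced up to numerical error. At \(x_0 = 0\) only observations to the right of the point receive weight, so the estimate is pulled upwards even though the data contain no noise.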

Following Feng and Schäfer (2021), a boundary modification is also defined for local polynomial regression. In the DCSmooth package, local polynomial regression always uses boundary modification weights. Kernel types available (either for kernel regression or local polynomial regression) are Müller-type, Müller-Wang-type and truncated kernels, denoted by M, MW and T. In most applications, the Müller-Wang-type kernels are the preferred weighting functions.

4.4 Spatial ARMA estimation

The spatial ARMA process used for modeling dependent errors is a QARMA process, which means that it has a top-left to bottom-right dependency (Martin, 1996). A spatial ARMA process \(\varepsilon_{i,j}\) satisfies \[ \phi(B_1, B_2)\varepsilon_{i,j} = \psi(B_1, B_2)\xi_{i,j}, \] where the lag operators are \(B_1 \varepsilon_{i,j} = \varepsilon_{i-1, j}\) and \(B_2 \varepsilon_{i,j} = \varepsilon_{i,j-1}\), \(\xi_{i,j} \underset{iid.}{\sim} \mathcal{N}(0,\sigma^2)\) and \[ \phi(z_1, z_2) = \sum_{m = 0}^{p_1} \sum_{n = 0}^{p_2} \phi_{m, n} z_1^m z_2^n, \ \psi(z_1, z_2) = \sum_{m = 0}^{q_1} \sum_{n = 0}^{q_2} \psi_{m, n} z_1^m z_2^n. \] The coefficients \(\phi_{m,n}\) and \(\psi_{m,n}\) are written in matrix form \[ \boldsymbol{\phi} = \begin{pmatrix} \phi_{0,0} & \dots & \phi_{0, p_2} \\ \vdots & \ddots & \\ \phi_{p_1, 0} & & \phi_{p_1, p_2} \end{pmatrix} \ \text{ and } \ \boldsymbol{\psi} = \begin{pmatrix} \psi_{0, 0} & \dots & \psi_{0, q_2} \\ \vdots & \ddots & \\ \psi_{q_1, 0} & & \psi_{q_1, q_2} \end{pmatrix}, \] where \(\boldsymbol{\phi}\) is the AR-part ($var_model$ar) and \(\boldsymbol{\psi}\) is the MA-part ($var_model$ma). The example from 3.2, \[ \boldsymbol{\phi} = \begin{pmatrix} 1 & 0.4 \\ -0.3 & 0.2 \end{pmatrix} \ \text{ and } \ \boldsymbol{\psi} = \begin{pmatrix} 1 & 0.2 \\ 0.2 & -0.5 \end{pmatrix}, \] then reduces to the process \[ \varepsilon_{i,j} = 0.4\varepsilon_{i,j-1} - 0.3 \varepsilon_{i-1,j} + 0.2\varepsilon_{i-1,j-1} + 0.2 \xi_{i,j-1} + 0.2 \xi_{i-1,j} -0.5 \xi_{i-1,j-1} + \xi_{i,j}. \]
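The mapping between matrix entries and lags can be checked in base R. In the matrix form above, the row index corresponds to the lag in the first dimension (\(B_1\), index \(i\)) and the column index to the lag in the second dimension (\(B_2\), index \(j\)). The dimnames used below are only labels added for this illustration.

```r
lag_names <- c("lag 0", "lag 1")
phi <- matrix(c(1,    0.4,
                -0.3, 0.2),
              nrow = 2, byrow = TRUE,
              dimnames = list(lag_names, lag_names))

phi["lag 0", "lag 1"]   # phi_{0,1}: coefficient of eps_{i, j-1}
phi["lag 1", "lag 0"]   # phi_{1,0}: coefficient of eps_{i-1, j}

# Without byrow = TRUE, matrix() fills column by column and yields the transpose
identical(unname(phi), t(matrix(c(1, 0.4, -0.3, 0.2), nrow = 2)))
```

When entering a coefficient matrix row by row as it is printed in a formula, byrow = TRUE is therefore needed; for symmetric matrices such as the MA-part of this example the fill order makes no difference.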

Appendix

A.1 Kernels available in the DCSmooth package:

\(k\)   \(\mu\)   \(\nu\)   Truncated Kernels   Müller Kernels   Müller-Wang Kernels
2       0         0         -                   -                MW_200
2       1         0         -                   -                MW_210
2       2         0         -                   -                MW_220
3       2         0         -                   -                MW_320
4       2         0         TR_420              -                MW_420
4       2         1         -                   -                MW_421
4       2         2         TR_422              -                MW_422

References

Feng, Y. and Schäfer, B. (2021) Boundary modification in local polynomial regression and associated boundary kernel functions. Preprint, Paderborn University.
Gasser, T., Kneip, A. and Köhler, W. (1991) A flexible and fast method for automatic smoothing. Journal of the American Statistical Association, 86, 643–652.
Herrmann, E., Engel, J., Gasser, T., et al. (1995) A bandwidth selector for bivariate kernel regression. Journal of the Royal Statistical Society, 57, 171–180.
Martin, R. J. (1996) Some results on unilateral ARMA lattice processes. Journal of Statistical Planning and Inference, 50, 395–411.
Müller, H.-G. (1991) Smooth optimum kernel estimators near endpoints. Biometrika, 78, 521–530.
Müller, H.-G. and Wang, J.-L. (1994) Hazard rate estimation under random censoring with varying kernels and bandwidths. Biometrics, 50, 61–76.
Ruppert, D. and Wand, M. P. (1994) Multivariate locally weighted least squares regression. The Annals of Statistics, 22, 1346–1370.
Schäfer, B. and Feng, Y. (2021) Fast computation and bandwidth selection algorithms for smoothing functional time series. Preprint, Paderborn University.