In this vignette, the theoretical considerations are skipped and
simple examples of several GCEstim
functions are given.
Consider dataGCE
(see “Generalized Maximum Entropy
framework”).
#> Loading required package: zoo
#>
#> Attaching package: 'zoo'
#> The following objects are masked from 'package:base':
#>
#> as.Date, as.Date.numeric
#> X001 X002 X003 X004
#> Min. :-0.367479 Min. :-0.34228 Min. :-0.226104 Min. :-0.232254
#> 1st Qu.:-0.124128 1st Qu.:-0.06117 1st Qu.:-0.065646 1st Qu.:-0.057236
#> Median :-0.007875 Median : 0.02085 Median :-0.013335 Median :-0.011069
#> Mean :-0.004117 Mean : 0.01087 Mean :-0.003149 Mean :-0.009777
#> 3rd Qu.: 0.112612 3rd Qu.: 0.09638 3rd Qu.: 0.074539 3rd Qu.: 0.039632
#> Max. : 0.357509 Max. : 0.26828 Max. : 0.255017 Max. : 0.168009
#> X005 y
#> Min. :-0.240180 Min. :-2.0085
#> 1st Qu.:-0.074220 1st Qu.: 0.2753
#> Median : 0.009687 Median : 1.0415
#> Mean : 0.002912 Mean : 0.9871
#> 3rd Qu.: 0.074166 3rd Qu.: 1.5678
#> Max. : 0.220218 Max. : 3.4560
Suppose we want to obtain the estimated model
\[\begin{equation} \widehat{\mathbf{y}}=\widehat{\beta_0} + \widehat{\beta_1}\mathbf{X001} + \widehat{\beta_2}\mathbf{X002} + \widehat{\beta_3}\mathbf{X003} + \widehat{\beta_4}\mathbf{X004} + \widehat{\beta_5}\mathbf{X005}. \end{equation}\]
Using the function lmgce()
and setting:
formula = y ~ X001 + X002 + X003 + X004 + X005
data = dataGCE
the desired model is computed.
res.lmgce.v01 <-
lmgce(
formula = y ~ X001 + X002 + X003 + X004 + X005,
data = dataGCE)
#> Warning in lmgce(formula = y ~ X001 + X002 + X003 + X004 + X005, data = dataGCE):
#>
#> The minimum error was found for the highest upper limit of the support. Confirm if higher values should be tested.
If we would like to see a concise summary of the result, we can simply type,
res.lmgce.v01
#>
#> Call:
#> lmgce(formula = y ~ X001 + X002 + X003 + X004 + X005, data = dataGCE)
#>
#> Coefficients:
#> (Intercept) X001 X002 X003 X004 X005
#> 1.0149 -0.2202 2.5209 3.7025 9.2585 12.3076
To obtain a more detailed evaluation of the fitted model, we can use
summary()
summary(res.lmgce.v01)
#>
#> Call:
#> lmgce(formula = y ~ X001 + X002 + X003 + X004 + X005, data = dataGCE)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -1.14311 -0.28521 0.00022 0.33284 0.94757
#>
#> Coefficients:
#> Estimate Std. Deviation z value Pr(>|t|)
#> (Intercept) 1.01494 0.02058 49.312 < 2e-16 ***
#> X001 -0.22020 0.26574 -0.829 0.40732
#> X002 2.52086 2.51212 1.003 0.31563
#> X003 3.70245 1.39718 2.650 0.00805 **
#> X004 9.25849 3.13278 2.955 0.00312 **
#> X005 12.30761 2.99930 4.103 4.07e-05 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Normalized Entropy:
#> NormEnt SupportLL SupportUL
#> (Intercept) 0.613120 -1.385564 1.385564
#> X001 0.999981 -40.044509 40.044509
#> X002 0.998628 -53.653819 53.653819
#> X003 0.997924 -64.082185 64.082185
#> X004 0.992078 -82.107203 82.107203
#> X005 0.977198 -64.503624 64.503624
#>
#> Residual standard error: 0.4193 on 94 degrees of freedom
#> Chosen Upper Limit for Standardized Supports: 6.623, Chosen Error: 1se
#> Multiple R-squared: 0.8289, Adjusted R-squared: 0.8198
#> NormEnt: 0.9298, CV-NormEnt: 0.9314 (0.001405)
#> RMSE: 0.4065, CV-RMSE: 0.4245 (0.07841)
The estimated GCE coefficients are \(\widehat{\boldsymbol{\beta}}^{GCE}=\) (1.015, -0.22, 2.521, 3.702, 9.258, 12.308).
(coef.res.lmgce.v01 <- coef(res.lmgce.v01))
#> (Intercept) X001 X002 X003 X004 X005
#> 1.0149399 -0.2201957 2.5208625 3.7024518 9.2584894 12.3076121
The prediction root mean squared error (RMSE) is \(RMSE_{\mathbf{\hat y}}^{GCE} \approx\) 0.407.
The prediction 5-fold cross-validation (CV) error is \(CV\text{-}RMSE_{\mathbf{\hat y}}^{GCE} \approx\) 0.424.
Z-score confidence intervals (CIs) are obtained with the
confint
S3 method for class lmgce
confint(res.lmgce.v01, level = 0.95)
#> 2.5% 97.5%
#> (Intercept) 0.9745995 1.0552803
#> X001 -0.7410327 0.3006412
#> X002 -2.4028096 7.4445345
#> X003 0.9640248 6.4408787
#> X004 3.1183608 15.3986181
#> X005 6.4290868 18.1861374
and can be visualized with plot.
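A sketch of a plausible call (a plot method for lmgce objects is implied by the text; the exact panel returned may differ):

plot(res.lmgce.v01)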
Color-filled dots represent the GCE estimates, while non-filled dots represent the OLS estimates (lmgce uses OLS = TRUE by default).
Bootstrap CIs are available by setting method = "percentile" or method = "basic", defining the number of replicates (boot.B = 1000) and the bootstrap method (three methods are available; see help("confint.lmgce")):
res.lmgce.v01.confint <-
confint(
res.lmgce.v01,
level = 0.95,
method = "percentile",
boot.B = 1000,
boot.method = "residuals"
)
res.lmgce.v01.confint
#> 2.5% 97.5%
#> (Intercept) 0.9466904 1.0613189
#> X001 -0.4420402 0.7366755
#> X002 -3.2157042 0.2189214
#> X003 0.2355981 2.5550294
#> X004 1.8239425 6.3393070
#> X005 5.2718289 9.3762228
or
res.lmgce.v02 <-
update(res.lmgce.v01,
boot.B = 1000,
boot.method = "residuals")
#> Warning in lmgce(formula = y ~ X001 + X002 + X003 + X004 + X005, data = dataGCE, :
#>
#> The minimum error was found for the highest upper limit of the support. Confirm if higher values should be tested.
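The intervals printed below imply an intermediate assignment along these lines (an assumption; the arguments mirror the earlier confint() call):

res.lmgce.v02.confint <-
  confint(res.lmgce.v02, level = 0.95, method = "percentile")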
res.lmgce.v02.confint
#> 2.5% 97.5%
#> (Intercept) 0.9441818 1.0773997
#> X001 -0.7361146 0.5095636
#> X002 -2.1082789 4.1808036
#> X003 0.8410829 4.7016152
#> X004 3.4094424 11.1024973
#> X005 6.6052740 13.9711447
From the previous summary we can see that the standardized upper limit for all support spaces was 6.623, which corresponds to the support spaces whose cross-validation RMSE (CV-RMSE) is not greater than the minimum CV-RMSE plus one standard error (1se). During estimation, standardized limits are converted back to the original scale:
res.lmgce.v01$support.matrix #original scale
#> SupportLL SupportUL
#> (Intercept) -1.385564 1.385564
#> X001 -40.044509 40.044509
#> X002 -53.653819 53.653819
#> X003 -64.082185 64.082185
#> X004 -82.107203 82.107203
#> X005 -64.503624 64.503624
lmgce uses by default support.signal = NULL and support.signal.vector = NULL, meaning that it will evaluate the errormeasure = "RMSE" with cross-validation (cv = TRUE, cv.nfolds = 5) on a set of support.signal.vector.n = 20 support spaces logarithmically spaced between support.signal.vector.min = 0.3 and support.signal.vector.max = 20. It will then choose the minimum, the one standard error (1se), or the elbow error (see help("lmgce")).
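For reference, a sketch of the first call with those defaults spelled out (support.signal.vector.max is the presumed name of the upper-limit argument; see help("lmgce") for the authoritative list):

res.lmgce.v01.explicit <-
  lmgce(
    formula = y ~ X001 + X002 + X003 + X004 + X005,
    data = dataGCE,
    errormeasure = "RMSE",
    errormeasure.which = "1se",
    cv = TRUE,
    cv.nfolds = 5,
    support.signal = NULL,
    support.signal.vector = NULL,
    support.signal.vector.n = 20,
    support.signal.vector.min = 0.3,
    support.signal.vector.max = 20)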
Red dots represent the CV-error and green dots the CV-Normalized entropy. Whiskers have the length of two standard errors for each of the 20 support spaces. The dotted horizontal line is the OLS CV-error. The black vertical dotted line corresponds to the support spaces that produced the lowest CV-error. The black vertical dashed line corresponds to the support spaces that produced the 1se CV-error. The red vertical dotted line corresponds to the support spaces that produced the elbow CV-error.
With plot it is possible to obtain the trace of the estimates.
The last two plots depict the final solutions. That is to say that, after choosing the support space limits based on the defined error, the number of points of the support spaces, and their probabilities (support.signal.points = c(1/5, 1/5, 1/5, 1/5, 1/5)), twosteps.n = 1 extra estimation(s) is (are) performed. This estimation uses the GCE framework even if the previous steps were performed, by default, in the GME framework. The distribution of probabilities used is the one estimated for the chosen support spaces.
res.lmgce.v01$p0
#> p_1 p_2 p_3 p_4 p_5
#> (Intercept) 0.0164583 0.04043601 0.09934627 0.2440815 0.5996780
#> X001 0.1999969 0.19999843 0.20000000 0.2000016 0.2000031
#> X002 0.1989669 0.19948212 0.19999866 0.2005165 0.2010358
#> X003 0.1856222 0.19254787 0.19973189 0.2071840 0.2149140
#> X004 0.1711658 0.18450259 0.19887859 0.2143747 0.2310783
#> X005 0.1457798 0.16891704 0.19572648 0.2267909 0.2627857
The trace of the CV-error can be obtained with plot. The pre-reestimation CV-error is depicted by the red dot and the final/reestimated CV-error corresponds to the dark red dot. The horizontal dotted line again represents the OLS CV-error. The final estimated vector of probabilities is
res.lmgce.v01$p
#> p_1 p_2 p_3 p_4 p_5
#> (Intercept) 0.01096511 0.03033771 0.08393684 0.2322322 0.6425282
#> X001 0.20220556 0.20109672 0.19999395 0.1988972 0.1978065
#> X002 0.18164405 0.19039062 0.19955835 0.2091675 0.2192394
#> X003 0.17754981 0.18812593 0.19933204 0.2112057 0.2237866
#> X004 0.15738899 0.17628610 0.19745213 0.2211595 0.2477133
#> X005 0.13074317 0.15871806 0.19267870 0.2339058 0.2839543
After looking at the results and traces we may feel the need to select a different support, let's say the minimum. That can be done using, for instance,
lmgce(y ~ X001 + X002 + X003 + X004 + X005,
data = dataGCE,
errormeasure.which = "min")
#or
update(res.lmgce.v01, errormeasure.which = "min")
but that implies a complete reestimation and can be very time-consuming. Since the results of all the evaluated support spaces are stored, choosing a different support should instead be done with changesupport:
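The summary below implies a call along these lines (the exact argument form is an assumption; see help("changesupport")):

res.lmgce.v01.min <- changesupport(res.lmgce.v01, "min")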
summary(res.lmgce.v01.min)
#>
#> Call:
#> lmgce(formula = y ~ X001 + X002 + X003 + X004 + X005, data = dataGCE)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -1.14044 -0.32087 -0.00145 0.30294 0.98079
#>
#> Coefficients:
#> Estimate Std. Deviation z value Pr(>|t|)
#> (Intercept) 1.0214 0.0205 49.819 < 2e-16 ***
#> X001 -0.5732 0.2647 -2.166 0.03035 *
#> X002 6.5726 2.5023 2.627 0.00862 **
#> X003 5.9458 1.3917 4.272 1.94e-05 ***
#> X004 14.3297 3.1206 4.592 4.39e-06 ***
#> X005 17.1517 2.9876 5.741 9.42e-09 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Normalized Entropy:
#> NormEnt SupportLL SupportUL
#> (Intercept) 0.857780 -2.190354 2.190354
#> X001 0.999986 -120.925588 120.925588
#> X002 0.998977 -162.022706 162.022706
#> X003 0.999413 -193.514073 193.514073
#> X004 0.997923 -247.945652 247.945652
#> X005 0.995174 -194.786726 194.786726
#>
#> Residual standard error: 0.4176 on 94 degrees of freedom
#> Chosen Upper Limit for Standardized Supports: 20, Chosen Error: min
#> Multiple R-squared: 0.831, Adjusted R-squared: 0.822
#> NormEnt: 0.9749, CV-NormEnt: 0.9748 (0.000247)
#> RMSE: 0.4049, CV-RMSE: 0.428 (0.07535)
Note that the GCE estimates are now "closer" to the OLS estimates. In this example the true coefficients are known, and we can also observe that the min approach gives estimates "closer" to the OLS estimates than to those coefficients, which in ill-conditioned problems is not recommended.
data.frame("Supp_1se" = coef(res.lmgce.v01),
"Supp_min" = coef(res.lmgce.v01.min),
"OLS" = coef(res.lmgce.v01$results$OLS$res),
"TRUE" = coef.dataGCE)
#> Supp_1se Supp_min OLS TRUE.
#> (Intercept) 1.0149399 1.0213867 1.0219926 1
#> X001 -0.2201957 -0.5732211 -0.5888888 0
#> X002 2.5208625 6.5726180 6.8461549 0
#> X003 3.7024518 5.9458002 6.0887435 3
#> X004 9.2584894 14.3296693 14.6678589 6
#> X005 12.3076121 17.1516725 17.4690058 9
Several generic functions can be used with lmgce class objects, for instance
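For example (a sketch; fitted() and residuals() are assumed to be available, consistent with the residual summaries shown earlier):

fitted(res.lmgce.v01)[1:5]    # first five fitted values
residuals(res.lmgce.v01)[1:5] # first five residuals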
An add-in to easily generate the R code for an lmgce analysis when support.method = "standardized", and to perform that analysis, is also available. Data can be imported from a file or called from the Environment.
The parameters of lmgce()
can be changed.
Several outputs can be visualized.
The expression for lmgce()
can be exported.
lmgce()
allows for the evaluation of several support
spaces and, given a certain criterion, selects one of them. But GCE
estimation is also sensitive to the number of points of the support and
noise spaces and to the weight that each of the spaces has in the loss
function. cv.lmgce() tests combinations of these parameters.
res.cv.lmgce <-
cv.lmgce(
y ~ X001 + X002 + X003 + X004 + X005,
data = dataGCE,
support.signal.points = c(3, 5, 7, 9, 11),
support.noise.points = c(3, 5, 7, 9, 11),
weight = c(0.1, 0.3, 0.5, 0.7, 0.9))
res.cv.lmgce
#>
#> Call:
#> cv.lmgce(formula = y ~ X001 + X002 + X003 + X004 + X005, data = dataGCE,
#> support.signal.points = c(3, 5, 7, 9, 11), support.noise.points = c(3,
#> 5, 7, 9, 11), weight = c(0.1, 0.3, 0.5, 0.7, 0.9))
#>
#>
#> Best combination:
#> 5 points for the signal support; 7 points for the noise support; a weight of 0.3 for the noise support.
#>
#> Coefficients:
#> (Intercept) X001 X002 X003 X004 X005
#> 1.0053 -0.1054 1.1877 2.9615 7.5766 10.7126
It returns the CV-errors, the convergence status of the optimization algorithm, and the computation time.
res.cv.lmgce$results[order(res.cv.lmgce$results$error.measure.cv.mean),][1:10,-6]
#> support.signal.points support.noise.points weight error.measure
#> 37 5 7 0.3 0.4079956
#> 48 7 11 0.3 0.4081282
#> 75 11 11 0.5 0.4084240
#> 69 9 9 0.5 0.4083783
#> 53 7 3 0.5 0.4083569
#> 70 11 9 0.5 0.4085954
#> 43 7 9 0.3 0.4083187
#> 64 9 7 0.5 0.4086463
#> 36 3 7 0.3 0.4083031
#> 49 9 11 0.3 0.4085277
#> error.measure.cv.mean convergence time
#> 37 0.4342384 0 2.969143
#> 48 0.4343499 0 3.386481
#> 75 0.4344170 0 4.381215
#> 69 0.4344356 0 3.828571
#> 53 0.4344939 0 2.478632
#> 70 0.4345810 0 3.994712
#> 43 0.4346196 0 3.165467
#> 64 0.4346713 0 3.713387
#> 36 0.4346806 0 3.180539
#> 49 0.4347227 0 3.316488
The model with the combination that produces the lowest CV-error is kept.
summary(res.cv.lmgce$best)
#>
#> Call:
#> cv.lmgce(formula = y ~ X001 + X002 + X003 + X004 + X005, data = dataGCE,
#> support.signal.points = c(3, 5, 7, 9, 11), support.noise.points = c(3,
#> 5, 7, 9, 11), weight = c(0.1, 0.3, 0.5, 0.7, 0.9))
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -1.13620 -0.25946 0.00574 0.34428 0.95410
#>
#> Coefficients:
#> Estimate Std. Deviation z value Pr(>|t|)
#> (Intercept) 1.00530 0.01239 81.163 < 2e-16 ***
#> X001 -0.10536 0.15992 -0.659 0.509991
#> X002 1.18769 1.51177 0.786 0.432084
#> X003 2.96152 0.84081 3.522 0.000428 ***
#> X004 7.57661 1.88527 4.019 5.85e-05 ***
#> X005 10.71264 1.80495 5.935 2.94e-09 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Normalized Entropy:
#> NormEnt SupportLL SupportUL
#> (Intercept) 0.621836 -1.385564 1.385564
#> X001 0.999996 -40.044509 40.044509
#> X002 0.999696 -53.653819 53.653819
#> X003 0.998672 -64.082185 64.082185
#> X004 0.994700 -82.107203 82.107203
#> X005 0.982759 -64.503624 64.503624
#>
#> Residual standard error: 0.4208 on 94 degrees of freedom
#> Chosen Upper Limit for Standardized Supports: 6.623, Chosen Error: 1se
#> Multiple R-squared: 0.8275, Adjusted R-squared: 0.8183
#> NormEnt: 0.9329, CV-NormEnt: 0.9351 (0.002509)
#> RMSE: 0.408, CV-RMSE: 0.4342 (0.05257)
Using plot we obtain a graphical overview of these results.
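A minimal sketch (a plot method for cv.lmgce objects is assumed):

plot(res.cv.lmgce)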
Consider the time series in the ts
object
moz_ts
.
Let:
\(CO2_t\) be the emissions of \(CO_2\) at time \(t\);
\(GDP_{t-1}\) be the gross domestic product at time \(t-1\);
\(EPC_{t-1}\) be the energy per capita at time \(t-1\);
\(EU_{t-1}\) be the energy use at time \(t-1\).
Let us obtain the estimated model
\[\begin{equation} \widehat{CO2_t}=\widehat{\beta_0} + \widehat{\beta_1}GDP_{t-1} + \widehat{\beta_2}EPC_{t-1} + \widehat{\beta_3}EU_{t-1} \end{equation}\]
Using the function tsbootgce()
and setting:
formula = CO2 ~ 1 + L(GDP, 1) + L(EPC, 1) + L(EU, 1)
data = moz_ts
the desired model is computed. formula accepts the additional functions available in the R package dynlm (such as L() for lags).
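The fit used below is obtained with the following call (the assignment to res.tsbootgce is inferred from the objects printed later; the call itself matches the Call field in the output):

res.tsbootgce <-
  tsbootgce(
    formula = CO2 ~ 1 + L(GDP, 1) + L(EPC, 1) + L(EU, 1),
    data = moz_ts)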
Note that by default tsbootgce
uses
reps = 1000
replicates as elements of an ensemble, which
retain the shape of the original time series, as well as the time
dependence structure of the autocorrelation and the partial
autocorrelation functions. The plot
function can give us
percentile confidence intervals (red, green and blue areas) and the
median distribution (dashed yellow line) of the generated time series as
well as the original data (black line).
A concise summary of the result can be obtained simply by typing,
res.tsbootgce
#>
#> Call:
#> tsbootgce(formula = CO2 ~ 1 + L(GDP, 1) + L(EPC, 1) + L(EU, 1),
#> data = moz_ts)
#>
#> Coefficients:
#> (Intercept) L(GDP, 1) L(EPC, 1) L(EU, 1)
#> tsbootgce -33.40547 0.03909 0.02290 0.26097
#> lmgce -70.38222 0.05716 0.03043 0.32684
#> lm -154.46129 0.13826 -0.02110 0.48219
By default, the tsbootgce coefficients displayed are obtained with coef.method = "mode".
coef(res.tsbootgce)
#> (Intercept) L(GDP, 1) L(EPC, 1) L(EU, 1)
#> -33.40547358 0.03908692 0.02289881 0.26097442
and the \(95\%\) highest density regions are
confint(res.tsbootgce)
#> 2.5% 97.5%
#> (Intercept) -49.98354439 -16.83952784
#> L(GDP, 1) 0.03435963 0.04463217
#> L(EPC, 1) 0.01606767 0.02749484
#> L(EU, 1) 0.21988135 0.30194119
The empirical distribution, confidence intervals and measure of central tendency of each estimate can be visualized by typing
plot(res.tsbootgce,
ci.levels = c(0.90, 0.95, 0.99),
ci.method = c("hdr" #,"basic" #,"percentile"
))[[2]]
The estimates obtained in each bootstrap repetition have an associated normalized entropy that can be used as a weight in an aggregation procedure named neagging.
lmgce objects keep the bootstrap results in object$results$bootstrap when the argument boot.B is greater than or equal to 10. When these results exist, we can call the function neagging, setting only the object parameter.
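For instance, using the earlier bootstrap fit res.lmgce.v02 (fitted with boot.B = 1000); the assignment name matches the object used below:

res.neagging.lmgce <- neagging(res.lmgce.v02)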
The trace of the in-sample error can be obtained with plot. The dotted horizontal line represents the in-sample error of the lmgce model. The neagging minimum in-sample error, represented by the vertical dashed line, was obtained by aggregating 2 models.
The trace of the estimates can also be visualized with
plot
, where the dotted horizontal lines represent the
estimates from the lmgce
model.
The estimated coefficients that produce the lowest in sample error are \(\widehat{\boldsymbol{\beta}}^{neagging}=\) (1.039, -0.159, 0.64, 2.808, 7.236, 10.193).
coef(res.neagging.lmgce)
#> (Intercept) X001 X002 X003 X004 X005
#> 1.0391383 -0.1591743 0.6400220 2.8079923 7.2358547 10.1933880
The estimated coefficients, when \(1000\) models are aggregated, are \(\widehat{\boldsymbol{\beta}}^{neagging}=\) (1.016, -0.265, 2.184, 3.321, 8.112, 10.912).
coef(res.neagging.lmgce, which = ncol(res.neagging.lmgce$matrix))
#> (Intercept) X001 X002 X003 X004 X005
#> 1.0156715 -0.2654051 2.1841604 3.3214449 8.1118787 10.9122498
Although the in-sample error is higher in the neagging approach, in some scenarios an out-of-sample improvement can be observed. Consider a data set with the same structure as dataGCE, generated by setting seed = 240863. The out-of-sample RMSE is smaller using the neagging model.
tsbootgce objects always keep the bootstrap results in object$results$bootstrap, so we can call the function neagging setting only the object parameter.
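A minimal sketch (the assignment name res.neagging.tsbootgce is illustrative):

res.neagging.tsbootgce <- neagging(res.tsbootgce)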
In this case the in-sample error is lower in the neagging approach.
This work was supported by Fundação para a Ciência e Tecnologia (FCT) through CIDMA and projects https://doi.org/10.54499/UIDB/04106/2020 and https://doi.org/10.54499/UIDP/04106/2020.