The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

GWR and Mixed GWR with spatial autocorrelation

Ghislain Geniaux

2025-02-18

Introduction

mgwrsar package estimates linear and local linear models with spatial autocorrelation of the following forms:

\[y=\beta_v(u_i,v_i) X_v+\,\epsilon_i\;\;\;\;\;\;\;\; (GWR)\]

\[y=\beta_c X_c+\beta_v(u_i,v_i) X_v+\,\epsilon_i\;\;\;\;\;\;\;\; (MGWR)\]

\[y=\lambda Wy+\beta_c X_c+\,\epsilon_i\;\;\;\;\;\;\;\; (MGWR-SAR(0,k,0))\]

\[y=\lambda Wy+\beta_v(u_i,v_i)X_v+\,\epsilon_i\;\;\;\;\;\;\;\; (MGWR-SAR(0,0,k))\]

\[y=\lambda Wy+\beta_c X_c+\beta_v(u_i,v_i)X_v+\,\epsilon_i\;\;\;\;\;\;\;\; (MGWR-SAR(0,k_c,k_v))\]

\[y=\lambda(u_i,v_i) Wy+\beta_c X_c+\,\epsilon_i\;\;\;\;\;\;\;\; (MGWR-SAR(1,k,0))\]

\[y=\lambda(u_i,v_i)Wy+\beta_v(u_i,v_i)X_v+\,\epsilon_i\;\;\;\;\;\;\;\; (MGWR-SAR(1,0,k))\]

\[y=\lambda(u_i,v_i)Wy+\beta_cX_c+\beta_v(u_i,v_i)X_v+\,\epsilon_i\;\;\;\;\;\;\;\; (MGWR-SAR(1,k_c,k_v))\]

For a deeper understanding of the estimation method, please refer to Geniaux, G. and Martinetti, D. (2017). A new method for dealing simultaneously with spatial autocorrelation and spatial heterogeneity in regression models. Regional Science and Urban Economics. (https://doi.org/10.1016/j.regsciurbeco.2017.04.001)

The estimation of the aforementioned model can be performed using the MGWRSAR wrapper function, which necessitates a dataframe and a matrix of coordinates. It’s important to note:

In addition to enabling the consideration of spatial autocorrelation in GWR/MGWR-like models, the MGWRSAR function introduces several useful techniques for estimating local regression with space and space-time coordinates:

Please note:

Estimation of GWR/MGWR with spatial dependance.

The MGWRSAR function requires a dataframe and a matrix of coordinates, and eventually a spatial weight matrix if model include spatiale dependence. The package includes a simulated data example which is used for this vignette.

library(mgwrsar)
#> Loading required package: Rcpp
#> Loading required package: sp
#> Loading required package: leaflet
#> Loading required package: Matrix
## loading data example
data(mydata)
coord=as.matrix(mydata[,c("x","y")])
head(coord)
#>             x       y
#> [1,] 872142.8 6234075
#> [2,] 872894.3 6233366
#> [3,] 878615.1 6229884
#> [4,] 869936.0 6228408
#> [5,] 866441.8 6231471
#> [6,] 866952.9 6222411

head(mydata)
#>   Y_mgwrsar_1_0_kv X0       X1        X2        X3       Beta0      Beta1
#> 1       0.44146014  1 2.396274 0.4763857 0.6854775  0.11599551 0.26922206
#> 2       0.02223952  1 1.684776 0.3303000 0.8994048  0.11533789 0.25784575
#> 3       3.41607993  1 2.369278 1.5117796 0.5808293  0.20973774 0.55723064
#> 4       0.28497090  1 4.208743 1.0916590 1.4058201 -0.08706293 0.06795084
#> 5      -0.79417336  1 3.109543 0.6487661 1.6977387 -0.08209832 0.21763086
#> 6       6.78538775  1 7.284831 0.8828887 1.5085908 -0.28330886 0.52468859
#>       Beta2      Beta3    lambda     Y_ols    Y_sar       Y_gwr      Y_mgwr
#> 1 0.6772792 -1.1106223 0.2408643 0.9890454 2.010387  0.32246490  0.39829402
#> 2 0.6751576 -1.0326101 0.2273072 0.2732833 1.539485 -0.15597959 -0.12664995
#> 3 0.5623073 -0.5412975 0.3384363 2.1155894 3.637831  2.06565538  1.79922754
#> 4 1.0796267 -0.9258913 0.3543732 1.7902102 3.201578  0.07587226 -0.02831127
#> 5 1.1019372 -1.2759568 0.3772657 0.5057989 1.460611 -0.85670748 -0.38820500
#> 6 1.5488474 -0.7649646 0.4933481 3.0167134 4.654380  3.75240014  3.39782791
#>   Y_mgwrsar_0_0_kv Y_mgwrsar_0_kc_kv Y_mgwrsar_1_kc_kv Y_mgwr_outlier X2dummy
#> 1        0.5673431         0.7137761        0.55150700      0.3982940   FALSE
#> 2        0.2031676         0.2611103        0.06388119     -0.1266500    TRUE
#> 3        3.8183989         3.1882294        2.86908705      1.7992275    TRUE
#> 4        0.3314998         0.1832260        0.14519215     -0.0188881    TRUE
#> 5       -0.7858941        -0.1114446       -0.13694311     -0.3882050   FALSE
#> 6        5.8227160         5.2859121        6.16333174      3.3978279    TRUE
#>   Y_mgwr_X2dummy        x       y
#> 1     0.07564794 872142.8 6234075
#> 2    -0.34965450 872894.3 6233366
#> 3     0.94914284 878615.1 6229884
#> 4    -1.20689547 869936.0 6228408
#> 5    -1.10310449 866441.8 6231471
#> 6     2.03036799 866952.9 6222411
## plot
ggplot(mydata,aes(x,y))+geom_point(aes(color=Beta0))+ ggtitle('Spatial Pattern of coefficient Beta0')+scale_colour_gradientn(colours = terrain.colors(10))

ggplot(mydata,aes(x,y))+geom_point(aes(color=Beta1))+ ggtitle('Spatial Pattern of coefficient Beta1')+scale_colour_gradientn(colours = terrain.colors(10))

ggplot(mydata,aes(x,y))+geom_point(aes(color=lambda))+ ggtitle('Spatial Pattern of spatial autocorrelation coefficient (lambda)')+scale_colour_gradientn(colours = terrain.colors(10))


W=kernel_matW(H=4,kernels='rectangle',coords=coord,NN=4,adaptive=TRUE,diagnull=TRUE)

Note that in this package there are two ways to express a bandwidth \(H\). It can be a distance (adaptive = F) or a number of neighbors (adaptive = T).

Estimation

Estimation of a mgwrsar(1,0,kv) model with all parameter spatially varying using an adapative gaussian kernel (based on 20 nearest neighbors):

mgwrsar_1_0_kv<-MGWRSAR(formula = 'Y_mgwrsar_1_0_kv~X1+X2+X3', data = mydata,coords=coord, fixed_vars=NULL,kernels=c('gauss'),H=20, Model = 'MGWRSAR_1_0_kv',control=list(SE=F,adaptive=T,W=W))
summary(mgwrsar_1_0_kv)
#> Call:
#> MGWRSAR(formula = "Y_mgwrsar_1_0_kv~X1+X2+X3", data = mydata, 
#>     coords = coord, fixed_vars = NULL, kernels = c("gauss"), 
#>     H = 20, Model = "MGWRSAR_1_0_kv", control = list(SE = F, 
#>         adaptive = T, W = W))
#> Model: MGWRSAR_1_0_kv 
#> Method for spatial autocorrelation: 2SLS 
#> Kernels function: gauss 
#> Kernels adaptive: YES 
#> Kernels type: GD 
#> Bandwidth: 20 
#> Computation time: 2.137 
#> Use of parallel computing: FALSE  ncore = 1 
#> Use of rough kernel: NO 
#> Use of Target Points: NO 
#> Number of data points: 1000 
#> Varying parameters: Intercept X1 X2 X3  
#>         Intercept       X1       X2       X3  lambda
#> Min.     -5.55249  0.11207 -1.20584 -1.86027 -0.2375
#> 1st Qu.  -1.47283  0.36956  0.64156 -1.24644  0.1831
#> Median   -0.75406  0.54369  1.72877 -0.99875  0.3845
#> Mean     -0.79631  0.50721  1.75803 -1.00349  0.3568
#> 3rd Qu.  -0.15439  0.63505  2.86983 -0.71468  0.5382
#> Max.      2.06110  0.95748  5.14466 -0.27205  0.8808
#> Effective degrees of freedom: 885.0928 
#> AICc: 2957.617 
#> Residual sum of squares: 868.1985 
#> RMSE: 0.9317717

Estimation of a mgwrsar(0,0,kv) model with stationnary spatial multiplier using an adapative gaussian kernel (based on 20 nearest neighbors):

mgwrsar_0_0_kv<-MGWRSAR(formula = 'Y_mgwrsar_0_kc_kv~X1+X2+X3', data = mydata,coords=coord, fixed_vars=NULL,kernels=c('gauss'),H=20, Model = 'MGWRSAR_0_0_kv',control=list(SE=F,adaptive=T,W=W))
summary(mgwrsar_0_0_kv)
#> Call:
#> MGWRSAR(formula = "Y_mgwrsar_0_kc_kv~X1+X2+X3", data = mydata, 
#>     coords = coord, fixed_vars = NULL, kernels = c("gauss"), 
#>     H = 20, Model = "MGWRSAR_0_0_kv", control = list(SE = F, 
#>         adaptive = T, W = W))
#> Model: MGWRSAR_0_0_kv 
#> Method for spatial autocorrelation: 2SLS 
#> Kernels function: gauss 
#> Kernels adaptive: YES 
#> Kernels type: GD 
#> Bandwidth: 20 
#> Computation time: 1.612 
#> Use of parallel computing: FALSE  ncore = 1 
#> Use of rough kernel: NO 
#> Use of Target Points: NO 
#> Number of data points: 1000 
#> Constant parameters: XX 
#> 0.3676871 
#> Varying parameters: Intercept X1 X2 X3 
#>         Intercept        X1        X2      X3
#> Min.    -3.763567  0.080428 -2.209824 -1.2308
#> 1st Qu. -1.219268  0.357217  0.706989 -1.0647
#> Median  -0.618269  0.536381  1.425772 -0.9997
#> Mean    -0.513883  0.499074  1.398492 -1.0014
#> 3rd Qu. -0.017472  0.627690  2.638912 -0.9437
#> Max.     2.108716  0.889871  4.369973 -0.6752
#> Effective degrees of freedom: 905.871 
#> AICc: 1161.224 
#> Residual sum of squares: 151.7064 
#> RMSE: 0.3894951

Estimation of a mgwrsar(0,kc,kv) model with stationary X2 and stationnary spatial multiplier using an adapative gaussian kernel (based on 20 nearest neighbors):

mgwrsar_0_kc_kv<-MGWRSAR(formula = 'Y_mgwrsar_0_kc_kv~X1+X2+X3', data = mydata,coords=coord, fixed_vars='X2',kernels=c('gauss'),H=20, Model = 'MGWRSAR_0_kc_kv',control=list(SE=F,adaptive=T,W=W))
summary(mgwrsar_0_kc_kv)
#> Call:
#> MGWRSAR(formula = "Y_mgwrsar_0_kc_kv~X1+X2+X3", data = mydata, 
#>     coords = coord, fixed_vars = "X2", kernels = c("gauss"), 
#>     H = 20, Model = "MGWRSAR_0_kc_kv", control = list(SE = F, 
#>         adaptive = T, W = W))
#> Model: MGWRSAR_0_kc_kv 
#> Method for spatial autocorrelation: 2SLS 
#> Kernels function: gauss 
#> Kernels adaptive: YES 
#> Kernels type: GD 
#> Bandwidth: 20 
#> Computation time: 1.122 
#> Use of parallel computing: FALSE  ncore = 1 
#> Use of rough kernel: NO 
#> Use of Target Points: NO 
#> Number of data points: 1000 
#> Constant parameters: XXX2 XXPhWy 
#> 0.870593 0.4120387 
#> Varying parameters: Intercept X1 X3 
#>         Intercept        X1      X3
#> Min.    -1.254418  0.029588 -1.2702
#> 1st Qu. -0.444425  0.366633 -1.0810
#> Median  -0.033372  0.540500 -1.0026
#> Mean     0.065257  0.498703 -1.0016
#> 3rd Qu.  0.445573  0.627798 -0.9243
#> Max.     2.099437  0.873585 -0.6744
#> Effective degrees of freedom: 914.8315 
#> AICc: 1044.642 
#> Residual sum of squares: 137.9687 
#> RMSE: 0.3714414

Estimation of a mgwrsar(1,kc,kv) model with stationary X2 using an adapative gaussian kernel (based on 20 nearest neighbors):

mgwrsar_1_kc_kv<-MGWRSAR(formula = 'Y_mgwrsar_1_kc_kv~X1+X2+X3', data = mydata,coords=coord, fixed_vars='X2',kernels=c('gauss'),H=20, Model = 'MGWRSAR_1_kc_kv',control=list(SE=F,adaptive=T,W=W))
summary(mgwrsar_1_kc_kv)
#> Call:
#> MGWRSAR(formula = "Y_mgwrsar_1_kc_kv~X1+X2+X3", data = mydata, 
#>     coords = coord, fixed_vars = "X2", kernels = c("gauss"), 
#>     H = 20, Model = "MGWRSAR_1_kc_kv", control = list(SE = F, 
#>         adaptive = T, W = W))
#> Model: MGWRSAR_1_kc_kv 
#> Method for spatial autocorrelation: 2SLS 
#> Kernels function: gauss 
#> Kernels adaptive: YES 
#> Kernels type: GD 
#> Bandwidth: 20 
#> Computation time: 2.006 
#> Use of parallel computing: FALSE  ncore = 1 
#> Use of rough kernel: NO 
#> Use of Target Points: NO 
#> Number of data points: 1000 
#> Constant parameters: XX 
#> 2.459628 
#> Varying parameters: Intercept X1 X3  
#>         Intercept        X1        X3  lambda
#> Min.    -2.785597  0.064134 -1.373242 -0.6412
#> 1st Qu. -0.783444  0.371202 -1.092831  0.0467
#> Median  -0.454995  0.534294 -1.006845  0.3538
#> Mean    -0.435127  0.501815 -1.009762  0.3553
#> 3rd Qu. -0.111683  0.633466 -0.933401  0.8027
#> Max.     2.392965  0.910200 -0.617579  0.9900
#> Effective degrees of freedom: 901.0112 
#> AICc: 4933.69 
#> Residual sum of squares: 6519.053 
#> RMSE: 2.553244

Estimation of a mgwrsar(1,kc,0) model with stationary X1 using an adapative gaussian kernel (based on 20 nearest neighbors):

mgwrsar_1_kc_0<-MGWRSAR(formula = 'Y_mgwrsar_1_kc_kv~X1+X2+X3', data = mydata,coords=coord, fixed_vars=c('Intercept','X1','X2','X3'),kernels=c('gauss'),H=20, Model = 'MGWRSAR_1_kc_0',control=list(SE=F,adaptive=T,W=W))
summary(mgwrsar_1_kc_0)
#> Call:
#> MGWRSAR(formula = "Y_mgwrsar_1_kc_kv~X1+X2+X3", data = mydata, 
#>     coords = coord, fixed_vars = c("Intercept", "X1", "X2", "X3"), 
#>     kernels = c("gauss"), H = 20, Model = "MGWRSAR_1_kc_0", control = list(SE = F, 
#>         adaptive = T, W = W))
#> Model: MGWRSAR_1_kc_0 
#> Method for spatial autocorrelation: 2SLS 
#> Kernels function: gauss 
#> Kernels adaptive: YES 
#> Kernels type: GD 
#> Bandwidth: 20 
#> Computation time: 1.279 
#> Use of parallel computing: FALSE  ncore = 1 
#> Use of rough kernel: NO 
#> Use of Target Points: NO 
#> Number of data points: 1000 
#> Constant parameters: XXIntercept XXX1 XXX2 XXX3 
#> -0.8205521 0.509378 1.69077 -1.035463 
#> Varying parameters: 
#>          lambda
#> Min.    -0.9137
#> 1st Qu.  0.2291
#> Median   0.6009
#> Mean     0.4822
#> 3rd Qu.  0.9900
#> Max.     0.9995
#> Effective degrees of freedom: 971.1444 
#> AICc: 3905.183 
#> Residual sum of squares: 2736.795 
#> RMSE: 1.654326

Plot and summary

Summary and plot of mgwrsar object using leaflet:


mgwrsar_1_0_kv<-MGWRSAR(formula = 'Y_mgwrsar_1_0_kv~X1+X2+X3', data = mydata,coords=coord,fixed_vars=NULL,kernels=c('gauss'),H=20, Model = 'MGWRSAR_1_0_kv',control=list(SE=T,adaptive=T,W=W))

summary(mgwrsar_1_0_kv)
#> Call:
#> MGWRSAR(formula = "Y_mgwrsar_1_0_kv~X1+X2+X3", data = mydata, 
#>     coords = coord, fixed_vars = NULL, kernels = c("gauss"), 
#>     H = 20, Model = "MGWRSAR_1_0_kv", control = list(SE = T, 
#>         adaptive = T, W = W))
#> Model: MGWRSAR_1_0_kv 
#> Method for spatial autocorrelation: 2SLS 
#> Kernels function: gauss 
#> Kernels adaptive: YES 
#> Kernels type: GD 
#> Bandwidth: 20 
#> Computation time: 1.362 
#> Use of parallel computing: FALSE  ncore = 1 
#> Use of rough kernel: NO 
#> Use of Target Points: NO 
#> Number of data points: 1000 
#> Varying parameters: Intercept X1 X2 X3  
#>         Intercept       X1       X2       X3  lambda
#> Min.     -5.55249  0.11207 -1.20584 -1.86027 -0.2375
#> 1st Qu.  -1.47283  0.36956  0.64156 -1.24644  0.1831
#> Median   -0.75406  0.54369  1.72877 -0.99875  0.3845
#> Mean     -0.79631  0.50721  1.75803 -1.00349  0.3568
#> 3rd Qu.  -0.15439  0.63505  2.86983 -0.71468  0.5382
#> Max.      2.06110  0.95748  5.14466 -0.27205  0.8808
#> Effective degrees of freedom: 885.0928 
#> AICc: 2957.617 
#> Residual sum of squares: 868.1985 
#> RMSE: 0.9317717
plot(mgwrsar_1_0_kv,type='B_coef',var='X2', crs=2154,radius=150,myzoom=10)
#plot(mgwrsar_1_0_kv,type='t_coef',var='X2', crs=2154,radius=150,myzoom=10)

Prediction

In this example, we utilize an in-sample of size 800 for model estimation and an out-sample of size 200 for prediction. It’s worth noting that for models with spatial autocorrelation, you must provide a global weight matrix of size 1000, ordered by in-sample and then out-sample locations.

For GWR and MGWR_1_0_kv models, where all coefficients vary in space, predictions are made using the corresponding model estimate with the out-sample locations as target points. Hence, the estimated coefficients are not utilized; only the call and the data of the estimated model are employed (method_pred = ‘TP’, default value). However, for mixed models that include some constant coefficients (such as MGWR, MGXWR_0_0_kv, MGXWR_1_kc_kv, MGXWR_1_kc_0), the estimated coefficients are extrapolated using a weighting matrix. By default, the weighting matrix for prediction is derived from the expected weights of out-sample data if they were added to in-sample data to estimate the corresponding MGWRSAR model: method_pred=‘tWtp_model’ (refer to Geniaux, 2024, for further details). Alternatively, the user can choose to extrapolate the model coefficients using a Shepard kernel with a custom number of neighbors (method_pred=‘shepard’,k_extra=12) or using the same kernel and bandwidth as the estimated model (method_pred=‘model’). In case of extrapolation, predition method without the best linear unbiased predictor (‘type = YTC’) and with the best linear unbiased predictor (‘type = BPN’) are available.

Prediction of MGWRSAR_1_0_kv model using shepard kernel with 4 neighbors and Best Linear Unbiased Predictor (BLUP) :


### Global Spatial Weight matrix W should be ordered by in_sample (S) then out_sample

# in_sample and out_sample folds
length_out=800
index_in=sample(1:1000,length_out)
index_out=(1:1000)[-index_in]

#W=kernel_matW(H=4,kernels='rectangle',coords=coord,NN=4,adaptive=TRUE,diagnull=TRUE)
W_in_out=kernel_matW(H=4,kernels='rectangle',coords=rbind(coord[index_in,],coord[index_out,]),NN=4,adaptive=TRUE,diagnull=TRUE)
W_in=W_in_out[1:length(index_in),1:length(index_in)]
W_in=mgwrsar::normW(W_in)
newdata=mydata[index_out,]
newdata_coords=coord[index_out,]


### SAR Model
model_SAR_insample<-MGWRSAR(formula = 'Y_sar~X1+X2+X3', data = mydata[index_in,],coords=coord[index_in,], fixed_vars=NULL,kernels=NULL,H=NULL, Model = 'SAR',control=list(W=W_in))
model_SAR_insample@RMSE
#> [1] 0.1303956
summary(model_SAR_insample)
#> Call:
#> MGWRSAR(formula = "Y_sar~X1+X2+X3", data = mydata[index_in, ], 
#>     coords = coord[index_in, ], fixed_vars = NULL, kernels = NULL, 
#>     H = NULL, Model = "SAR", control = list(W = W_in))
#> Model: SAR 
#> Method for spatial autocorrelation: 2SLS 
#> Kernels function:  
#> Kernels adaptive: NO 
#> Kernels type:  
#> Bandwidth: NA 
#> Computation time: 0.044 
#> Use of parallel computing: FALSE  ncore = 1 
#> Use of rough kernel:  
#> Use of Target Points: NO 
#> Number of data points: 800 
#> Constant parameters: Intercept X1 X2 X3 lambda 
#> 0.1515706 0.5040549 1.153973 -1.012715 0.3147484 
#> AICc:  
#> Residual sum of squares: 13.6024 
#> RMSE: 0.1303956

## SAR Model: predictions
## without Best Linear Unbiased Predictor
# prediction YTC
Y_pred10=predict(model_SAR_insample, newdata=newdata, newdata_coords=newdata_coords,W=W_in_out,type='YTC')
head(Y_pred10)
#> [1] 2.027912 3.731384 5.910859 2.949735 4.971490 5.277246
RMSE_YTC=sqrt(mean((mydata$Y_sar[index_out]-Y_pred10)^2))
RMSE_YTC #  0.08441952
#> [1] 0.08441952

## Using Best Linear Unbiased Predictor
Y_pred11=predict(model_SAR_insample, newdata=newdata, newdata_coords=newdata_coords,W=W_in_out,type='BPN')
head(Y_pred11)
#> [1] 2.015560 3.680499 5.923186 2.923057 4.940062 5.313581
RMSE_BPN=sqrt(mean((mydata$Y_sar[index_out]-Y_pred11)^2))
RMSE_BPN #   0.06039921
#> [1] 0.06039921


#### MGWRSAR_1_0_kv model
model_MGWRSAR_1_0_kv_insample<-MGWRSAR(formula = 'Y_mgwrsar_1_0_kv~X1+X2+X3', data = mydata[index_in,],coords=coord[index_in,], fixed_vars=NULL,kernels='gauss',H=20, Model = 'MGWRSAR_1_0_kv',control=list(W=W_in,adaptive=TRUE,isgcv=F))
model_MGWRSAR_1_0_kv_insample@RMSE
#> [1] 0.7005766
summary(model_MGWRSAR_1_0_kv_insample)
#> Call:
#> MGWRSAR(formula = "Y_mgwrsar_1_0_kv~X1+X2+X3", data = mydata[index_in, 
#>     ], coords = coord[index_in, ], fixed_vars = NULL, kernels = "gauss", 
#>     H = 20, Model = "MGWRSAR_1_0_kv", control = list(W = W_in, 
#>         adaptive = TRUE, isgcv = F))
#> Model: MGWRSAR_1_0_kv 
#> Method for spatial autocorrelation: 2SLS 
#> Kernels function: gauss 
#> Kernels adaptive: YES 
#> Kernels type: GD 
#> Bandwidth: 20 
#> Computation time: 0.971 
#> Use of parallel computing: FALSE  ncore = 1 
#> Use of rough kernel: NO 
#> Use of Target Points: NO 
#> Number of data points: 800 
#> Varying parameters: Intercept X1 X2 X3  
#>         Intercept       X1       X2       X3  lambda
#> Min.     -7.35091  0.11250 -1.03271 -1.92040 -0.2411
#> 1st Qu.  -1.74521  0.37676  1.07924 -1.28170  0.1253
#> Median   -0.91625  0.53936  2.01644 -1.01160  0.2901
#> Mean     -0.97835  0.50956  2.21153 -1.02210  0.2983
#> 3rd Qu.  -0.26127  0.64346  3.42518 -0.71382  0.4631
#> Max.      2.09208  0.93698  8.41085 -0.26197  0.8171
#> Effective degrees of freedom: 705.8388 
#> AICc: 1915.822 
#> Residual sum of squares: 392.6461 
#> RMSE: 0.7005766

#### MGWRSAR_1_0_kv model: predictions

## Using Best Linear Unbiased Predictor with method_pred='model' 
Y_pred12=predict(model_MGWRSAR_1_0_kv_insample, newdata=newdata, newdata_coords=newdata_coords,W=W_in_out,type='BPN',method_pred='model' )
head(Y_pred12)
#> [1] 0.6988705 3.7068233 5.6428853 2.6763307 6.5104905 4.6867043
RMSE_model_BPN=sqrt(mean((mydata$Y_mgwrsar_1_0_kv[index_out]-Y_pred12)^2))
RMSE_model_BPN #  0.435830
#> [1] 0.4358303

# prediction with method_pred='tWtp'
Y_pred13=predict(model_MGWRSAR_1_0_kv_insample, newdata=newdata, newdata_coords=newdata_coords,W=W_in_out,type='BPN',method_pred='tWtp_model')
head(Y_pred13)
#> [1] 0.7317686 3.6460268 5.6538075 2.6628929 6.5911347 4.8190344
RMSE_tWtp_BPN=sqrt(mean((mydata$Y_mgwrsar_1_0_kv[index_out]-Y_pred13)^2))
RMSE_tWtp_BPN #   0.4544316
#> [1] 0.4544316

## Using Best Linear Unbiased Predictor with method_pred='shepard' 
W_out=kernel_matW(H=4,kernels='rectangle',coords=coord[index_out,],NN=4,adaptive=TRUE,diagnull=TRUE)
Y_pred14=predict(model_MGWRSAR_1_0_kv_insample, newdata=newdata, newdata_coords=newdata_coords,W=W_in_out,type='BPN',method_pred='shepard',k_extra=8)
head(Y_pred14)
#> [1] 0.7715255 3.4637052 5.6434874 2.7065372 6.5409656 4.7106552
RMSE_shepard_BPN=sqrt(mean((mydata$Y_mgwrsar_1_0_kv[index_out]-Y_pred14)^2))
RMSE_shepard_BPN # 0.512511
#> [1] 0.512511

Prediction of MGWRSAR_1_kc_kv model using shepard kernel with 4 neighbors and Best Linear Unbiased Predictor (BLUP) :


### Global Spatial Weight matrix W should be ordered by in_ sample (S) then out_sample

# in_sample and out_sample folds
length_out=800
index_in=sample(1:1000,length_out)
index_out=(1:1000)[-index_in]

W=kernel_matW(H=4,kernels='rectangle',coords=rbind(coord[index_in,],coord[index_out,]),NN=4,adaptive=TRUE,diagnull=TRUE)
W_in=W[index_in,index_in]
W_in=mgwrsar::normW(W_in)

### model_MGWRSAR_1_kc_kv
model_MGWRSAR_1_kc_kv_insample<-MGWRSAR(formula = 'Y_mgwrsar_1_kc_kv~X1+X2+X3', data = mydata[index_in,],coords=coord[index_in,], fixed_vars='X3',kernels=c('gauss'),H=11, Model = 'MGWRSAR_1_kc_kv',control=list(W=W_in,adaptive=TRUE,isgcv=F))
model_MGWRSAR_1_kc_kv_insample@RMSE
#> [1] 0.3502106
summary(model_MGWRSAR_1_kc_kv_insample)
#> Call:
#> MGWRSAR(formula = "Y_mgwrsar_1_kc_kv~X1+X2+X3", data = mydata[index_in, 
#>     ], coords = coord[index_in, ], fixed_vars = "X3", kernels = c("gauss"), 
#>     H = 11, Model = "MGWRSAR_1_kc_kv", control = list(W = W_in, 
#>         adaptive = TRUE, isgcv = F))
#> Model: MGWRSAR_1_kc_kv 
#> Method for spatial autocorrelation: 2SLS 
#> Kernels function: gauss 
#> Kernels adaptive: YES 
#> Kernels type: GD 
#> Bandwidth: 11 
#> Computation time: 1.28 
#> Use of parallel computing: FALSE  ncore = 1 
#> Use of rough kernel: NO 
#> Use of Target Points: NO 
#> Number of data points: 800 
#> Constant parameters: XX 
#> -1.042863 
#> Varying parameters: Intercept X1 X2  
#>         Intercept        X1        X2  lambda
#> Min.    -10.72384   0.08488  -0.77331 -0.2800
#> 1st Qu.  -1.70402   0.35135   2.04683 -0.0410
#> Median   -0.78341   0.53193   3.01252 -0.0047
#> Mean     -0.99026   0.50042   3.42218 -0.0116
#> 3rd Qu.   0.06569   0.63682   5.18079  0.0268
#> Max.      1.62136   0.98688  12.43667  0.1318
#> Effective degrees of freedom: 666.3673 
#> AICc: 914.0958 
#> Residual sum of squares: 98.11797 
#> RMSE: 0.3502106

### model_MGWRSAR_1_kc_kv: predictions
## without Best Linear Unbiased Predictor
newdata=mydata[index_out,]
newdata_coords=coord[index_out,]
newdata$Y_mgwrsar_1_kc_kv=0

Y_pred15=predict(model_MGWRSAR_1_kc_kv_insample, newdata=newdata, newdata_coords=newdata_coords,W=W,type='YTC',method_pred='shepard')
head(Y_pred15)
#> [1] 0.5010164 3.6370096 5.8722300 2.5841532 7.9874040 4.0912315
RMSE_YTC=sqrt(mean((mydata$Y_mgwrsar_1_kc_kv[index_out]-Y_pred15)^2))
RMSE_YTC # 0.4274275
#> [1] 0.4274275

## Using Best Linear Unbiased Predictor
Y_pred16=predict(model_MGWRSAR_1_kc_kv_insample, newdata=newdata, newdata_coords=newdata_coords,W=W,type='BPN',method_pred='shepard')
head(Y_pred16)
#> [1] 0.5044271 3.8406314 5.8762162 2.5846512 7.9916212 4.0910314
RMSE_BPN=sqrt(mean((mydata$Y_mgwrsar_1_kc_kv[index_out]-Y_pred16)^2))
RMSE_BPN #  0.4416827
#> [1] 0.4416827

References

Brunsdon, C., Fotheringham, A., Charlton, M., 1996. Geographically weighted regression: a method for exploring spatial nonstationarity. Geogr. Anal. 28 (4), 281–298.

Friedman, J. H. 1984. A variable span smoother. Laboratory for Computational Statistics, Department of Statistics, Stanford University: Technical Report(5).

Geniaux, G. and Martinetti, D. 2018. A new method for dealing simultaneously with spatial autocorrelation and spatial heterogeneity in regression models. Regional Science and Urban Economics, 72, 74-85. https://doi.org/10.1016/j.regsciurbeco.2017.04.001

Geniaux, G. and Martinetti, D. 2017b. R package mgwrsar: Functions for computing (mixed) geographycally weighted regression with spatial autocorrelation,. https://CRAN.R-project.org/package=mgwrsar.

Geniaux, G. 2024. Speeding up estimation of spatially varying coefficients models. Jounal of Geographical System 26, 293–327. https://doi.org/10.1007/s10109-024-00442-3

Loader, C. 1999. Local regression and likelihood, volume 47. springer New York.

McMillen, D. P. 1996. One hundred fifty years of land values in chicago: A nonparametric approach. Journal of Urban Economics, 40(1):100 – 124.

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.