The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

Getting started with GLBFP

This vignette shows the basic workflow for pointwise and grid-based density estimation with GLBFP. It is intentionally practical. For a conceptual map of the full package, see the “Package overview and workflow map” vignette.

Installation

install.packages("remotes")
remotes::install_github("AurelienNicosiaULaval/GLBFP")

Load the package

library(GLBFP)

Estimate density at one point

The functions ASH(), LBFP(), and GLBFP() estimate the density at a single point. The data are supplied as a numeric matrix or data frame with observations in rows.

x <- matrix(rnorm(300), ncol = 1)
b <- compute_bi_optim(x, m = 1)

fit <- GLBFP(x = 0, data = x, b = b, m = 1)
fit
#> GLBFP Density Estimation:
#> Point: (0) 
#> Estimated density: 0.386735 
#> Estimated standard error: 0.0838592 
#> 95% confidence interval: 0.362643, 0.410827 
#> Bandwidths (b): 0.155145 
#> Shifts (m): 1 
#> Relative grid coordinate (u): 0.916139

Lowercase aliases are also available for interactive workflows. They call the same estimators and keep the uppercase API unchanged.

fit_alias <- glbfp(x = 0, data = x, b = b, m = 1)
identical(fit$estimation, fit_alias$estimation)
#> [1] TRUE

The returned object contains the evaluation point, the estimated density, the bandwidth, and the shift parameter. For LBFP() and GLBFP(), uncertainty components are also returned.

names(fit)
#>  [1] "x"            "estimation"   "sd"           "IC"           "b"           
#>  [6] "m"            "method"       "dimension"    "u"            "cell_index"  
#> [11] "visited"      "prefix_nodes" "prefix_order"
summary(fit)
#> Method: GLBFP 
#> Dimension: 1 
#> Point: 0 
#> Estimation: 0.3867351 
#> Standard error: 0.0838592 
#> 95% CI: 0.362643259199835, 0.410826868862035 
#> Bandwidths (b): 0.155144970240404 
#> Shifts (m): 1 
#> Relative grid coordinate (u): 0.91613931912711 
#> Visited cells: 2 
#> Prefix nodes: 2
predict(fit)
#> [1] 0.3867351

Estimate density on a grid

The *_estimate() functions evaluate the same estimator over a regular grid or a user-supplied set of points.

grid_fit <- GLBFP_estimate(data = x, b = b, m = 1, grid_size = 80)
head(cbind(grid_fit$grid, density = grid_fit$densities))
#>             V1     density
#> [1,] -2.546881 0.085941125
#> [2,] -2.481241 0.067760688
#> [3,] -2.415601 0.049580251
#> [4,] -2.349960 0.031399814
#> [5,] -2.284320 0.017352329
#> [6,] -2.218680 0.008262111
head(as.data.frame(grid_fit))
#>          V1     density          sd    IC_lower   IC_upper visited prefix_nodes
#> 1 -2.546881 0.085941125 0.067942425 0.066422031 0.10546022       2            2
#> 2 -2.481241 0.067760688 0.041194890 0.055925860 0.07959552       2            2
#> 3 -2.415601 0.049580251 0.024146032 0.042643368 0.05651713       2            2
#> 4 -2.349960 0.031399814 0.020860213 0.025406910 0.03739272       2            2
#> 5 -2.284320 0.017352329 0.016030533 0.012746937 0.02195772       1            2
#> 6 -2.218680 0.008262111 0.009668979 0.005484322 0.01103990       1            2

For one-dimensional grids, the plot method returns a ggplot2 object.

plot(grid_fit)

Using the included data

The ashua data contain daily flow and level observations for the Ashuapmushuan river. The example below uses a small grid so the vignette remains fast.

data("ashua")

river_data <- ashua[, c("flow", "level")]
b2 <- c(8, 0.4)
x0 <- c(mean(river_data$flow), mean(river_data$level))

fit2 <- GLBFP(x = x0, data = river_data, b = b2, m = c(1, 1))
fit2
#> GLBFP Density Estimation:
#> Point: (249.0230, 30.4197) 
#> Estimated density: 0.00377442 
#> Estimated standard error: 0.000316735 
#> 95% confidence interval: 0.00376918, 0.00377966 
#> Bandwidths (b): 8.0, 0.4 
#> Shifts (m): 1, 1 
#> Relative grid coordinate (u): 0.690325, 0.224216

grid_fit2 <- GLBFP_estimate(
  data = river_data,
  b = b2,
  m = c(1, 1),
  grid_size = 15
)

plot(grid_fit2, contour = TRUE)

Input expectations

The package expects finite numeric data. Missing values should be removed or imputed before estimation. Constant data require explicit non-degenerate min_vals and max_vals bounds because a density estimator for continuous data needs a positive evaluation range.

Where to go next

After this vignette:

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.