This vignette shows the basic workflow for pointwise and grid-based
density estimation with GLBFP. It is intentionally
practical. For a conceptual map of the full package, see the “Package
overview and workflow map” vignette.
The functions ASH(), LBFP(), and
GLBFP() estimate the density at a single point. The data
are supplied as a numeric matrix or data frame with observations in
rows.
x <- matrix(rnorm(300), ncol = 1)
b <- compute_bi_optim(x, m = 1)
fit <- GLBFP(x = 0, data = x, b = b, m = 1)
fit
#> GLBFP Density Estimation:
#> Point: (0)
#> Estimated density: 0.386735
#> Estimated standard error: 0.0838592
#> 95% confidence interval: 0.362643, 0.410827
#> Bandwidths (b): 0.155145
#> Shifts (m): 1
#> Relative grid coordinate (u): 0.916139Lowercase aliases are also available for interactive workflows. They call the same estimators and keep the uppercase API unchanged.
fit_alias <- glbfp(x = 0, data = x, b = b, m = 1)
identical(fit$estimation, fit_alias$estimation)
#> [1] TRUEThe returned object contains the evaluation point, the estimated
density, the bandwidth, and the shift parameter. For LBFP()
and GLBFP(), uncertainty components are also returned.
names(fit)
#> [1] "x" "estimation" "sd" "IC" "b"
#> [6] "m" "method" "dimension" "u" "cell_index"
#> [11] "visited" "prefix_nodes" "prefix_order"
summary(fit)
#> Method: GLBFP
#> Dimension: 1
#> Point: 0
#> Estimation: 0.3867351
#> Standard error: 0.0838592
#> 95% CI: 0.362643259199835, 0.410826868862035
#> Bandwidths (b): 0.155144970240404
#> Shifts (m): 1
#> Relative grid coordinate (u): 0.91613931912711
#> Visited cells: 2
#> Prefix nodes: 2
predict(fit)
#> [1] 0.3867351The *_estimate() functions evaluate the same estimator
over a regular grid or a user-supplied set of points.
grid_fit <- GLBFP_estimate(data = x, b = b, m = 1, grid_size = 80)
head(cbind(grid_fit$grid, density = grid_fit$densities))
#> V1 density
#> [1,] -2.546881 0.085941125
#> [2,] -2.481241 0.067760688
#> [3,] -2.415601 0.049580251
#> [4,] -2.349960 0.031399814
#> [5,] -2.284320 0.017352329
#> [6,] -2.218680 0.008262111
head(as.data.frame(grid_fit))
#> V1 density sd IC_lower IC_upper visited prefix_nodes
#> 1 -2.546881 0.085941125 0.067942425 0.066422031 0.10546022 2 2
#> 2 -2.481241 0.067760688 0.041194890 0.055925860 0.07959552 2 2
#> 3 -2.415601 0.049580251 0.024146032 0.042643368 0.05651713 2 2
#> 4 -2.349960 0.031399814 0.020860213 0.025406910 0.03739272 2 2
#> 5 -2.284320 0.017352329 0.016030533 0.012746937 0.02195772 1 2
#> 6 -2.218680 0.008262111 0.009668979 0.005484322 0.01103990 1 2For one-dimensional grids, the plot method returns a
ggplot2 object.
The ashua data contain daily flow and level observations
for the Ashuapmushuan river. The example below uses a small grid so the
vignette remains fast.
data("ashua")
river_data <- ashua[, c("flow", "level")]
b2 <- c(8, 0.4)
x0 <- c(mean(river_data$flow), mean(river_data$level))
fit2 <- GLBFP(x = x0, data = river_data, b = b2, m = c(1, 1))
fit2
#> GLBFP Density Estimation:
#> Point: (249.0230, 30.4197)
#> Estimated density: 0.00377442
#> Estimated standard error: 0.000316735
#> 95% confidence interval: 0.00376918, 0.00377966
#> Bandwidths (b): 8.0, 0.4
#> Shifts (m): 1, 1
#> Relative grid coordinate (u): 0.690325, 0.224216
grid_fit2 <- GLBFP_estimate(
data = river_data,
b = b2,
m = c(1, 1),
grid_size = 15
)
plot(grid_fit2, contour = TRUE)The package expects finite numeric data. Missing values should be
removed or imputed before estimation. Constant data require explicit
non-degenerate min_vals and max_vals bounds
because a density estimator for continuous data needs a positive
evaluation range.
After this vignette: