The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

Density Estimation with rbbnp

This article provides a comprehensive guide to density estimation using the biasBound_density() function.

The biasBound_density() Function

The main function for density estimation takes the following key arguments:

biasBound_density(
  X,                    # Data vector
  x = NULL,             # Evaluation points (auto if NULL)
  h = NULL,             # Bandwidth (auto-selected if NULL)
  h_method = "cv",      # "cv" or "silverman"
  alpha = 0.05,         # Confidence level
  kernel.fun = "Schennach2004",
  xi_lb = NULL,         # Frequency range lower bound
  xi_ub = NULL          # Frequency range upper bound
)

Basic Example

# Generate sample data from 2-fold uniform distribution
X <- gen_sample_data(size = 1000, dgp = "2_fold_uniform", seed = 42)

# Estimate density
fit <- biasBound_density(X, h = 0.08, kernel.fun = "Schennach2004")

# Display results
fit
#> Bias-Bounded Density Estimation
#> 
#> Call:
#> biasBound_density(X = X, h = 0.08, kernel.fun = "Schennach2004")
#> 
#> Sample size: n = 1000
#> Bandwidth:   h = 0.0800 (user-specified) 
#> Kernel:      Schennach2004 
#> 
#> Bias bound parameters:
#>   A = 6.8980, r = 2.5142
#>   bias bound b1x = 0.0334
#> 
#> Evaluation points: 100 (range: [-0.1746, 2.1780])
#> Confidence level: 95%
#> 
#> Use summary() for detailed statistics
#> Use plot() to visualize results

Understanding the Output

The result is an S3 object of class bbnp_density containing:

# Key parameters
coef(fit)
#>        A        r        h 
#> 6.898006 2.514158 0.080000

# Detailed summary
summary(fit)
#> Summary: Bias-Bounded Density Estimation
#> ============================================================
#> 
#> Call:
#> biasBound_density(X = X, h = 0.08, kernel.fun = "Schennach2004")
#> 
#> Sample Information:
#>   Sample size (n):  1000
#>   Bandwidth (h):    0.0800
#>   Kernel function:  Schennach2004
#> 
#> Bias Bound Parameters:
#>   A (amplitude):    6.8980
#>   r (decay rate):   2.5142
#>   b1x (bias bound): 0.0334
#>   Xi interval:      [2.4074, 4.5544]
#> 
#> Range Estimation:
#>   Density estimates:
#>    min Q1.25% median   mean Q3.75%    max 
#> 0.0000 0.1111 0.4273 0.4217 0.7125 0.8798 
#> 
#>   Standard errors:
#>    min   mean    max 
#> 0.0000 0.0341 0.0563

Visualizing Results

Density Plot

plot(fit)

Interpreting the plot:

Element Description
Estimate Kernel density estimate \(\hat{f}(x)\)
Bias bound Bias range: \([\hat{f} - \bar{b}, \hat{f} + \bar{b}]\)
95% CI \([\hat{f} - \bar{b} - z\hat{\sigma}, \hat{f} + \bar{b} + z\hat{\sigma}]\)

Fourier Transform Analysis

The package automatically detects smoothness via Fourier analysis:

plot(fit, type = "ft")

Interpreting the Fourier plot (the legend labels each line):

The slope \(r\) indicates smoothness: larger \(r\) = smoother function.

Bandwidth Selection

Cross-Validation

fit_cv <- biasBound_density(X, h = NULL, h_method = "cv", kernel.fun = "normal")
cat("CV bandwidth:", coef(fit_cv)["h"], "\n")
#> CV bandwidth: 0.1878156

plot(fit_cv) + ggtitle("Cross-Validation Bandwidth")

Silverman’s Rule of Thumb

fit_silv <- biasBound_density(X, h = NULL, h_method = "silverman", kernel.fun = "normal")
cat("Silverman bandwidth:", coef(fit_silv)["h"], "\n")
#> Silverman bandwidth: 0.09390781

plot(fit_silv) + ggtitle("Silverman's Rule Bandwidth")

Direct Bandwidth Selection

# Select bandwidth without full estimation
h_cv <- select_bandwidth(X, method = "cv", kernel.fun = "normal")
h_silv <- select_bandwidth(X, method = "silverman", kernel.fun = "normal")

cat("CV:", h_cv, "\nSilverman:", h_silv)
#> CV: 0.1878156 
#> Silverman: 0.09390781

Kernel Functions

Available Kernels

Kernel Order Best For
Schennach2004 Infinite High smoothness (recommended)
sinc Infinite Any smoothness level
normal 2 Moderate smoothness
epanechnikov 2 Limited smoothness

Comparing Kernels

fit_sch <- biasBound_density(X, kernel.fun = "Schennach2004")
fit_sinc <- biasBound_density(X, kernel.fun = "sinc")
fit_norm <- biasBound_density(X, kernel.fun = "normal")
fit_epan <- biasBound_density(X, kernel.fun = "epanechnikov")

grid.arrange(
  plot(fit_sch) + ggtitle("Schennach2004 (infinite order)"),
  plot(fit_sinc) + ggtitle("Sinc (infinite order)"),
  plot(fit_norm) + ggtitle("Normal (2nd order)"),
  plot(fit_epan) + ggtitle("Epanechnikov (2nd order)"),
  ncol = 2
)

Recommendation: Use "Schennach2004" for most applications as it adapts to any smoothness level.

Custom Frequency Range

You can manually specify the frequency range for Fourier analysis:

# Default (automatic)
fit_auto <- biasBound_density(X, h = 0.08)

# Custom range
fit_custom <- biasBound_density(X, h = 0.08, xi_lb = 2, xi_ub = 8)

grid.arrange(
  plot(fit_auto, type = "ft") + ggtitle("Automatic Frequency Range"),
  plot(fit_custom, type = "ft") + ggtitle("Custom Range [2, 8]"),
  ncol = 1
)

Extracting Results

# Confidence intervals as matrix
ci <- confint(fit)
head(ci, 10)
#>       lower      upper
#>  [1,]     0 0.03341978
#>  [2,]     0 0.03341978
#>  [3,]     0 0.03341978
#>  [4,]     0 0.03341978
#>  [5,]     0 0.03341978
#>  [6,]     0 0.04274476
#>  [7,]     0 0.05896604
#>  [8,]     0 0.07625691
#>  [9,]     0 0.09584398
#> [10,]     0 0.11785186

# Evaluation points
x_points <- fit$x
head(x_points)
#> [1] -0.1746134 -0.1508492 -0.1270850 -0.1033208 -0.0795566 -0.0557924

# Density estimates
f_hat <- fit$density
head(f_hat)
#> [1] 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.002945549

Comparison with Undersmoothing

The bias-bound approach yields narrower CIs than undersmoothing:

h_opt <- select_bandwidth(X, method = "cv", kernel.fun = "sinc")

# Bias-bound with optimal bandwidth
result_opt <- biasBound_density(X, h = h_opt, kernel.fun = "sinc")

# Undersmoothing (half optimal)
result_under <- biasBound_density(X, h = h_opt * 0.5, kernel.fun = "sinc")

grid.arrange(
  plot(result_opt) + ggtitle(paste0("Bias Bound (h = ", round(h_opt, 3), ")")),
  plot(result_under) + ggtitle(paste0("Undersmoothing (h = ", round(h_opt/2, 3), ")")),
  ncol = 1
)

The bias-bound approach provides valid coverage while maintaining efficient estimation.

See Also

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.