
sshist

The sshist package implements the Shimazaki-Shinomoto method for finding the optimal number of bins in histograms.

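Rule-of-thumb choices such as Sturges' rule (the default in base R's hist()), the Freedman-Diaconis rule, or the fixed 30 bins that ggplot2 falls back to pick the bin count from simple summaries of the data. This method instead minimizes an estimate of the expected L2 loss between the histogram and the unknown underlying density function. It is particularly effective for: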

Installation

# stable version from CRAN
install.packages("sshist")

You can install the development version of sshist like so:

# install.packages("devtools")
devtools::install_github("celebithil/sshist")

Example 1: Basic 1D Usage

Here is a basic example using the Old Faithful Geyser data.

library(sshist)

# Load data
data(faithful)
x_data <- faithful$waiting

# Calculate optimal binning
res <- sshist(x_data)

# Print summary
print(res)
#> Shimazaki-Shinomoto Histogram Optimization
#> ------------------------------------------
#> Optimal Bins (N): 37 
#> Bin Width (D):    1.432 
#> Cost Minimum:     -9.681

# Plot the histogram using the optimal bin edges
hist(res$data, breaks = res$edges, freq = FALSE,
     main = paste0("Optimal Hist (N = ", res$opt_n, ")"),
     col = "lightblue", border = "white", xlab = "Data")
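
To make the "Cost Minimum" figure above more concrete, here is a minimal base-R sketch of the Shimazaki-Shinomoto cost, C = (2 * mean(k) - var(k)) / D^2, where k are the bin counts and D is the bin width. The helper name ss_cost and the candidate range 2:50 are assumptions made for this illustration. This is not the package's implementation, and its results may differ from those of sshist().

```r
# Hypothetical helper (not part of sshist): cost of an equal-width
# histogram with n_bins bins spanning the range of x.
ss_cost <- function(x, n_bins) {
  edges <- seq(min(x), max(x), length.out = n_bins + 1)
  width <- edges[2] - edges[1]
  # Assign each observation to a bin; all.inside = TRUE puts max(x) in the last bin
  bin_id <- findInterval(x, edges, all.inside = TRUE)
  counts <- tabulate(bin_id, nbins = n_bins)
  k_mean <- mean(counts)
  k_var  <- mean((counts - k_mean)^2)  # biased variance: divide by N, not N - 1
  (2 * k_mean - k_var) / width^2
}

x <- faithful$waiting
candidates <- 2:50  # assumed search range for the number of bins
costs <- vapply(candidates, function(n) ss_cost(x, n), numeric(1))
best_n <- candidates[which.min(costs)]

# Rule-of-thumb bin counts from base R, for comparison
rules <- c(sturges = nclass.Sturges(x), fd = nclass.FD(x), scott = nclass.scott(x))
```

Plotting costs against candidates shows how sharply the minimum is defined; a flat curve suggests that several bin counts fit the data about equally well.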

Example 2: Integration with ggplot2

sshist calculates the optimal parameters, which you can easily pass to ggplot2.

library(ggplot2)

# Create a data frame
df <- data.frame(waiting = x_data)

ggplot(df, aes(x = waiting)) +
  geom_histogram(breaks = res$edges, fill = "#69b3a2", color = "white", alpha = 0.8) +
  geom_rug(alpha = 0.1) +
  ggtitle(paste0("Shimazaki-Shinomoto Optimization (N = ", res$opt_n, ")")) +
  theme_minimal()

Example 3: 2D Histogram Optimization

For bivariate data, sshist_2d finds the optimal binning for both X and Y axes simultaneously.

# Get bimodal 2D data
y_data <- faithful$eruptions

# Optimize
res2d <- sshist_2d(x_data, y_data)

# Print summary
print(res2d)
#> Shimazaki-Shinomoto 2D Histogram Optimization
#> ---------------------------------------------
#> Optimal Bins X:   9 
#> Optimal Bins Y:   20 
#> Bin Width X:      5.889 
#> Bin Width Y:      0.175 
#> Cost Minimum:     -5.717
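
The same idea extends to two dimensions: the cost is computed from the counts in each rectangular cell and divided by the squared cell area, and the search runs over pairs of bin counts. The sketch below illustrates this with an exhaustive grid search in base R. The helper name ss_cost_2d and the 2:30 search range are assumptions, and the package may search differently or return different optima.

```r
# Hypothetical helper (not part of sshist): 2D Shimazaki-Shinomoto cost
# for an nx-by-ny grid of equal-size cells covering the data range.
ss_cost_2d <- function(x, y, nx, ny) {
  ex <- seq(min(x), max(x), length.out = nx + 1)
  ey <- seq(min(y), max(y), length.out = ny + 1)
  ix <- findInterval(x, ex, all.inside = TRUE)
  iy <- findInterval(y, ey, all.inside = TRUE)
  # Count points per cell, keeping empty cells via explicit factor levels
  counts <- table(factor(ix, levels = seq_len(nx)),
                  factor(iy, levels = seq_len(ny)))
  k <- as.vector(counts)
  k_mean <- mean(k)
  k_var  <- mean((k - k_mean)^2)  # biased variance over all cells
  area   <- (ex[2] - ex[1]) * (ey[2] - ey[1])
  (2 * k_mean - k_var) / area^2
}

# Evaluate every (nx, ny) pair in an assumed search range
grid <- expand.grid(nx = 2:30, ny = 2:30)
grid$cost <- mapply(
  function(nx, ny) ss_cost_2d(faithful$waiting, faithful$eruptions, nx, ny),
  grid$nx, grid$ny
)
best <- grid[which.min(grid$cost), ]
```

Because the two axes are optimized jointly, the best pair need not match the bin counts obtained by optimizing each variable on its own.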

Example 4: 2D Optimization with ggplot2

You can easily use the optimized bin counts from sshist_2d in ggplot2 by passing them to the bins argument in geom_bin2d.

# Recompute the 'res2d' object from Example 3 (optimal bins for the
# Old Faithful data) so that this block runs on its own

res2d <- sshist_2d(faithful$waiting, faithful$eruptions)

ggplot(faithful, aes(waiting, eruptions)) +
  geom_bin2d(bins = c(res2d$opt_nx, res2d$opt_ny)) +
  scale_fill_distiller(palette = "Spectral") +
  labs(
    title = "Optimal 2D Binning (Old Faithful)",
    subtitle = paste0("Shimazaki-Shinomoto Method: ", 
                      res2d$opt_nx, " x ", res2d$opt_ny, " bins"),
    x = "Waiting Time (min)",
    y = "Eruption Duration (min)"
  ) +
  theme(axis.text = element_text(size = 12),
        title = element_text(size = 12,face="bold"),
        panel.border = element_rect(linewidth = 2, color = "black", fill = NA))

