The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

h5lite h5lite logo

cran conda covr

h5lite is the pain-free way to work with HDF5 files in R.

It is designed for data scientists who want to read/write objects and move on, and for package developers who need a reliable, dependency-free storage backend.

Why h5lite?

If you’ve struggled with complex HDF5 bindings in the past, h5lite offers a fresh approach:

  1. It Just Works: No need to understand HDF5 dataspaces, hyperslabs, or property lists. h5lite maps R objects (numeric, character, factor, data.frame, and more) directly to their HDF5 equivalents.

  2. Zero System Dependencies: h5lite bundles the HDF5 library (via hdf5lib). Users do not need to install HDF5 system libraries manually.

  3. Smart Defaults, Full Control: It automatically selects the most efficient data types (e.g., saving space by storing small integers as int8), but gives you granular control when you need to conform to a strict spec.

Installation

Install the released version from CRAN:

install.packages("h5lite")

Or the development version from GitHub:

# install.packages("pak")
pak::pak("cmmr/h5lite")

Quick Start

The API consists primarily of two functions: h5_write() and h5_read().

library(h5lite)
file <- tempfile(fileext = ".h5")

# 1. Write simple objects
h5_write(1:10, file, "my_vector")
h5_write(I(42), file, "my_vector", attr = "my_id")
h5_write(matrix(rnorm(9), 3, 3), file, "my_matrix")

# 2. Write a list (creates a group hierarchy)
config <- list(version = 1.0, params = list(a = 1, b = 2))
h5_write(config, file, "simulation_config")

# 3. Read it back
my_vec <- h5_read(file, "my_vector")

# 4. Inspect the file
h5_ls(file)
#> [1] "my_vector"                  "my_matrix"
#> [3] "simulation_config"          "simulation_config/version"
#> [5] "simulation_config/params"   "simulation_config/params/a"
#> [7] "simulation_config/params/b"

h5_str(file)
#> /
#> ├── my_vector <uint8 × 10>
#> │   └── @my_id <uint8 scalar>
#> ├── my_matrix <float64 × 3 × 3>
#> └── simulation_config/
#>     ├── version <uint8 × 1>
#>     └── params/
#>         ├── a <uint8 × 1>
#>         └── b <uint8 × 1>

Power User Features

The as Argument: Precise Control

Need to conform to a specific file specification? The as argument allows you to override automatic behavior and explicitly define on-disk types.

# Force specific numeric types
h5_write(1:10, file, "dataset_a", as = "int32")
h5_write(rnorm(10), file, "dataset_b", as = "float32")

# Control string lengths (e.g., fixed-length ASCII for compatibility)
h5_write(c("A", "B"), file, "fixed_strs", as = "ascii[10]")

h5_str(file)
#> ...
#> ├── dataset_a <int32 × 10>
#> ├── dataset_b <float32 × 10>
#> └── fixed_strs <ascii[10] × 2>

When writing Data Frames, you can map types for specific columns using a named vector.

df <- data.frame(
  id    = 1:5, 
  score = c(1.1, 2.2, 3.3, 4.4, 5.5),
  note  = c("a", "b", "c", "d", "e")
)

# Store 'id' as 16-bit integer, 'score' as 32-bit float, and coerce 'note' to ascii
h5_write(df, file, "experiment_data", 
         as = c(id = "uint16", score = "float32", note = "ascii"))

h5_str(file)
#> ...
#> └── experiment_data <compound[3] × 5>
#>     ├── $id <uint16>
#>     ├── $score <float32>
#>     └── $note <ascii>

Advanced Compression & Chunking

h5lite natively bundles an extensive suite of state-of-the-art compression filters (including Blosc2, Zstandard, LZ4, and lossy ZFP). For simple use cases, you can pass a string configuration to the compress argument. For precise control over the pipeline—including chunk sizing, bitshuffling, and Scale-Offset scaling—use the h5_compression() function.

# Simple setup: High-performance Blosc2 with Zstandard
h5_write(rnorm(1000), file, "data_blosc", compress = "blosc2-zstd")

# Advanced setup: LZ4 compression, optimal integer packing, and custom chunk sizing
cmp <- h5_compression("lz4-9", int_packing = TRUE, chunk_size = 512 * 1024)
h5_write(1:1000, file, "data_custom", compress = cmp)

Efficient Partial Reading

For large datasets that exceed system RAM, h5lite provides partial reading via start and count parameters. It automatically targets the most logical dimension (e.g., rows in a matrix or elements in a vector).

# Read a 100-row slice starting from row 500
subset <- h5_read(file, "large_matrix", start = 500, count = 100)

Comparison

Feature h5lite rhdf5 / hdf5r
Philosophy “Opinionated” & Simple Comprehensive Wrapper
API Style Native R (read/write) Low-level (Files, Dataspaces, Memspaces)
HDF5 Installation Bundled (Zero-config) System Requirement (Manual install often required)
Data Typing Automatic (safe defaults) Manual (user specified)
Partial I/O Supported (Simplified) Supported (Manual hyperslabs)
Learning Curve Low (Minutes) High (Days)

Use rhdf5 or hdf5r if you need to:

Use h5lite if you want to:

Documentation

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.