This vignette covers common Zarr array operations: persistent storage, compression, resizing, filters, and advanced indexing.
Create an array on disk, close the session, and reopen it later.
path <- file.path(tempdir(), "example.zarr")
# Create a persistent array backed by a DirectoryStore
z <- zarr_open_array(
store = path, mode = "w",
shape = c(5, 10), chunks = c(5, 5), dtype = "<f4"
)
# Write data
z$set_item("...", array(1:50, dim = c(5, 10)))
#> NULL
z$get_shape()
#> [1] 5 10
Reopen the same path in read mode:
z2 <- zarr_open_array(store = path, mode = "r")
z2$get_shape()
#> [1] 5 10
z2$get_item("...")$data
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,] 1 6 11 16 21 26 31 36 41 46
#> [2,] 2 7 12 17 22 27 32 37 42 47
#> [3,] 3 8 13 18 23 28 33 38 43 48
#> [4,] 4 9 14 19 24 29 34 39 44 49
#> [5,] 5 10 15 20 25 30 35 40 45 50
For quick save/load of an existing array:
save_path <- file.path(tempdir(), "saved.zarr")
# Save an R array directly
zarr_save_array(save_path, zarr_create_array(
data = volcano, shape = dim(volcano), dtype = "<f8"
))
#> <ZarrArray> /
#> Shape : (87, 61)
#> Chunks : (87, 61)
#> Data type : <f8
#> Fill value : 0
#> Order : F
#> Read-only : FALSE
#> Compressor : ZstdCodec
#> Store type : DirectoryStore
#> Zarr format : 2
# Reopen
z3 <- zarr_open_array(save_path, mode = "r")
all.equal(z3$as.array(), volcano)
#> [1] TRUE
By default, pizzarr uses Zstandard compression. You can choose a different compressor when creating an array.
Gzip compression is interoperable with zarr-python and other
implementations, but is slower than Zstandard because R lacks an
in-memory gzip API. For best write performance, prefer
ZstdCodec.
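A minimal sketch of choosing a non-default compressor. It assumes that pizzarr exports a `GzipCodec` class whose constructor takes a `level` argument, and that the `compressor` parameter of `zarr_create()` accepts a codec instance as well as the string `"default"` used elsewhere in this vignette. Neither detail is shown in the text above, so check the package reference before relying on it.

```r
library(pizzarr)

# Assumption: GzipCodec$new() builds a gzip codec and accepts `level`.
gzip_codec <- GzipCodec$new(level = 6)

# Assumption: `compressor` accepts a codec instance.
z_gz <- zarr_create(
  shape = c(5, 10), chunks = c(5, 5),
  dtype = "<i4", fill_value = 0L,
  compressor = gzip_codec
)
z_gz$set_item("...", array(1:50, dim = c(5, 10)))

# Read the data back; it should match what was written.
roundtrip <- z_gz$get_item("...")$data
```

Switching the codec only changes how chunks are stored; reads and writes use the same calls as with the default compressor.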
Arrays can be resized after creation. Data in the overlapping region is preserved; new regions are filled with the fill value.
z <- zarr_create(
shape = c(5, 10), chunks = c(5, 5),
dtype = "<i4", fill_value = 0L,
compressor = "default"
)
z$set_item("...", array(1:50, dim = c(5, 10)))
#> NULL
z$get_shape()
#> [1] 5 10
# Grow the array
z$resize(10, 20)
z$get_shape()
#> [1] 10 20
# Original data is preserved in the top-left corner
z[1:5, 1:10]$data
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,] 1 6 11 16 21 26 31 36 41 46
#> [2,] 2 7 12 17 22 27 32 37 42 47
#> [3,] 3 8 13 18 23 28 33 38 43 48
#> [4,] 4 9 14 19 24 29 34 39 44 49
#> [5,] 5 10 15 20 25 30 35 40 45 50
# New region is filled with fill_value
z[6:10, 1:5]$data
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 0 0 0 0 0
#> [2,] 0 0 0 0 0
#> [3,] 0 0 0 0 0
#> [4,] 0 0 0 0 0
#> [5,] 0 0 0 0 0
Shrinking removes chunks that fall outside the new shape:
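As a sketch of shrinking, the block below builds a fresh 10 x 20 array and resizes it to 5 x 10, using only calls that appear earlier in this vignette (`zarr_create()`, `set_item()`, `resize()`, `get_shape()`, and the `[` operator). The variable names are illustrative. The new shape is chosen to fall on chunk boundaries so that whole chunks are dropped.

```r
library(pizzarr)

# Start with a 10 x 20 array split into 5 x 5 chunks.
original <- array(1:200, dim = c(10, 20))
z_small <- zarr_create(
  shape = c(10, 20), chunks = c(5, 5),
  dtype = "<i4", fill_value = 0L
)
z_small$set_item("...", original)

# Shrink to the top-left 5 x 10 region.
z_small$resize(5, 10)
new_shape <- z_small$get_shape()

# Data inside the new shape should be unchanged.
kept <- z_small[1:5, 1:10]$data
```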
Use append() to grow an array along an axis, adding new
data at the end. This is equivalent to zarr-python’s
z.append(data, axis=0), but uses R’s 1-based axis indexing
(axis 1 = first dimension).
z <- zarr_create(
shape = c(3, 4), chunks = c(3, 4),
dtype = "<i4", fill_value = 0L
)
z$set_item("...", array(1:12, dim = c(3, 4)))
#> NULL
z$as.array()
#> [,1] [,2] [,3] [,4]
#> [1,] 1 4 7 10
#> [2,] 2 5 8 11
#> [3,] 3 6 9 12
Append new rows (axis 1, the default):
new_rows <- array(13:20, dim = c(2, 4))
z$append(new_rows)
#> NULL
z$get_shape()
#> [1] 5 4
z$as.array()
#> [,1] [,2] [,3] [,4]
#> [1,] 1 4 7 10
#> [2,] 2 5 8 11
#> [3,] 3 6 9 12
#> [4,] 13 15 17 19
#> [5,] 14 16 18 20
Append new columns (axis 2):
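A minimal sketch of appending along the second dimension. It assumes `append()` takes an `axis` argument, as the comparison with zarr-python's `z.append(data, axis=0)` suggests. The array is rebuilt here so that the block runs on its own, and the variable names are illustrative.

```r
library(pizzarr)

# Rebuild a 3 x 4 array holding 1..12.
z_cols <- zarr_create(
  shape = c(3, 4), chunks = c(3, 4),
  dtype = "<i4", fill_value = 0L
)
z_cols$set_item("...", array(1:12, dim = c(3, 4)))

# Two new columns; the row count must match the existing array.
new_cols <- array(13:18, dim = c(3, 2))

# Assumption: `axis = 2` selects the second (column) dimension.
z_cols$append(new_cols, axis = 2)

result <- z_cols$as.array()
```

The appended data must agree with the existing array on every dimension except the one being extended.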
Filters transform chunk data before compression. They are codec
instances passed as a list to the filters parameter. A
common use case is variable-length UTF-8 string arrays, which require
VLenUtf8Codec as a filter.
words <- c("alpha", "bravo", "charlie", "delta")
z_str <- zarr_create_array(
data = array(words, dim = length(words)),
shape = length(words), dtype = "|O",
object_codec = VLenUtf8Codec$new()
)
z_str$get_item("...")$data
#> [1] "alpha" "bravo" "charlie" "delta"
z_str$get_filters()
#> [[1]]
#> <VLenUtf8Codec>
#> Inherits from: <Codec>
#> Public:
#> clone: function (deep = FALSE)
#> decode: function (buf, zarr_arr)
#> encode: function (buf, zarr_arr)
#> get_config: function ()
Beyond basic slicing with slice() or [,
pizzarr supports orthogonal indexing for independent selection along
each dimension.
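To make "independent selection along each dimension" concrete, the sketch below uses a plain base R matrix rather than a Zarr array, so it does not depend on any pizzarr call. Orthogonal selection returns every combination of the chosen rows and columns, whereas coordinate-style selection pairs the indices up element by element.

```r
# A plain 5 x 10 matrix with the same values as the examples above.
m <- matrix(1:50, nrow = 5)

rows <- c(1, 3)
cols <- c(2, 4)

# Orthogonal selection: every chosen row crossed with every chosen column.
# The result is a 2 x 2 block.
orthogonal <- m[rows, cols]

# Coordinate selection: indices are paired, giving (1, 2) and (3, 4) only.
# The result is a vector of length 2.
pointwise <- m[cbind(rows, cols)]
```

Here `orthogonal` holds the four values at rows 1 and 3 of columns 2 and 4, while `pointwise` holds only the two values at positions (1, 2) and (3, 4).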
The bracket operator uses orthogonal indexing internally:
Select specific rows and columns independently. Note that
get_orthogonal_selection uses zero-based indices (like
zarr-python), while the [ operator uses R’s one-based
indexing:
Select dimensions using logical vectors:
The $get_oindex() accessor provides the same orthogonal
indexing:
Select every other row, every third column using seq()
in bracket notation:
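A sketch of the strided selection described above, assuming the `[` operator accepts `seq(from, to, by = ...)` for each dimension. The result is compared with the same selection on an ordinary R array instead of relying on a printed output.

```r
library(pizzarr)

# Reference data kept as a plain R array for comparison.
m <- array(1:50, dim = c(5, 10))

z_step <- zarr_create(
  shape = c(5, 10), chunks = c(5, 5),
  dtype = "<i4", fill_value = 0L
)
z_step$set_item("...", m)

# Assumption: seq() with `by` is accepted inside `[`.
# Rows 1, 3, 5 and columns 1, 4, 7, 10.
picked <- z_step[seq(1, 5, by = 2), seq(1, 10, by = 3)]$data

# The same selection on the plain R array.
expected <- m[seq(1, 5, by = 2), seq(1, 10, by = 3)]
```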
"..." selects all remaining dimensions; ":"
selects all along one dimension. These work with
get_item():