The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.
quak provides convenient utilities for using DuckDB with
datasets stored in Azure Data Lake Storage Gen2 (abfss://).
It opens connections configured for Azure-backed Delta Lake data,
registers Azure credentials as DuckDB secrets, and works with both SQL
via DBI and lazy table queries via dplyr and dbplyr.
quak is not yet on CRAN. Install the development version
from GitHub:
# install.packages("remotes")
remotes::install_github("pedrobtz/quak")Start by opening an Azure-ready DuckDB connection. Then register one Azure secret on that connection.
library(quak)
conn <- az_conn()Use a credential chain when you want DuckDB to resolve credentials itself, for example from the Azure CLI:
library(quak)
conn <- az_conn()
az_set_chain_secret(conn, chain = "cli")Use an access token when another package obtains the token for you:
library(quak)
az_creds <- azr::DefaultCredential$new(
scope = az_default_scope()
)
conn <- az_conn()
az_set_token_secret(conn, token = az_creds$get_token()$access_token)Use a service principal when you want to pass the application credentials directly:
library(quak)
conn <- az_conn()
az_set_sp_secret(
conn,
tenant_id = Sys.getenv("AZURE_TENANT_ID"),
client_id = Sys.getenv("AZURE_CLIENT_ID"),
client_secret = Sys.getenv("AZURE_CLIENT_SECRET")
)All secret helpers also accept
account = "storageaccount" to scope the secret to one
storage account.
Use load_delta() when you want to register a Delta table
in DuckDB and query it with SQL.
library(quak)
conn <- az_conn()
az_set_chain_secret(conn, chain = "cli")
load_delta(
conn = conn,
url = "abfss://container@account/path/sales",
name = "sales",
method = "view"
)
DBI::dbGetQuery(
conn,
"SELECT COUNT(*) AS n, AVG(amount) AS avg_amount
FROM sales
WHERE amount > 100"
)
DBI::dbDisconnect(conn, shutdown = TRUE)Use tbl_delta() when you want to work with a Delta table
through dplyr/dbplyr.
library(quak)
conn <- az_conn()
az_set_chain_secret(conn, chain = "cli")
sales <- tbl_delta(conn, "abfss://container@account/path/sales")
sales |>
dplyr::filter(amount > 100) |>
dplyr::summarise(avg_amount = mean(amount, na.rm = TRUE)) |>
dplyr::collect()
DBI::dbDisconnect(conn, shutdown = TRUE)These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.