hydrocan normalises data from multiple Canadian hydrometric networks into one consistent output schema. The mechanism that makes this possible is the adapter: a small object that binds a data source name to a description and a set of fetch functions.
This vignette explains:
An adapter is created with new_hydrocan_adapter():
new_hydrocan_adapter(
  name,
  description,
  list_stations_fn,
  fetch_flows_fn = NULL,
  fetch_daily_flows_fn = NULL,
  fetch_levels_fn = NULL,
  fetch_daily_levels_fn = NULL,
  list_stations_meta_fn = NULL,
  license = NULL,
  license_url = NULL,
  terms_url = NULL
)

| Argument | Type | Contract |
|---|---|---|
| name | single character | Unique identifier; becomes the provider_name column in all output and the registry key |
| description | single character | Human-readable description of the source and its limitations; shown by hc_list_sources() |
| list_stations_fn | function() | No arguments; returns a character vector of station IDs this adapter can serve |
| fetch_flows_fn | function(station_id, start_date, end_date) or NULL | Returns a tibble matching the realtime schema; NULL if sub-daily flow data is not available |
| fetch_daily_flows_fn | function(station_id, start_date, end_date) or NULL | Returns a tibble matching the daily schema; NULL if daily flow data is not available |
| fetch_levels_fn | function(station_id, start_date, end_date) or NULL | Returns a tibble matching the realtime schema with parameter = "water_level"; NULL if sub-daily level data is not available |
| fetch_daily_levels_fn | function(station_id, start_date, end_date) or NULL | Returns a tibble matching the daily schema with parameter = "water_level"; NULL if daily level data is not available |
| list_stations_meta_fn | function() or NULL | No arguments; returns a tibble matching the stations schema; NULL if station metadata is not available |
| license | single character or NULL | Optional license name (e.g. "CC-BY 4.0"); exposed by hc_list_sources() |
| license_url | single character or NULL | Optional URL to the license text |
| terms_url | single character or NULL | Optional URL to the data provider’s terms of use |

At least one fetch function must be non-NULL.
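
The contract above can be illustrated with a small stand-in constructor. This is a minimal sketch in base R, not the package's implementation: sketch_adapter() and its supported field are hypothetical names, and the real new_hydrocan_adapter() may validate its arguments differently.

```r
# Hypothetical stand-in for the constructor's argument checks; the real
# new_hydrocan_adapter() may validate differently.
sketch_adapter <- function(name, description, list_stations_fn,
                           fetch_flows_fn = NULL,
                           fetch_daily_flows_fn = NULL,
                           fetch_levels_fn = NULL,
                           fetch_daily_levels_fn = NULL) {
  stopifnot(
    is.character(name), length(name) == 1L,
    is.character(description), length(description) == 1L,
    is.function(list_stations_fn)
  )

  # list() keeps NULL entries, so every capability has a slot.
  fetchers <- list(
    flows = fetch_flows_fn,
    daily_flows = fetch_daily_flows_fn,
    levels = fetch_levels_fn,
    daily_levels = fetch_daily_levels_fn
  )
  supported <- !vapply(fetchers, is.null, logical(1))

  if (!any(supported)) {
    stop("At least one fetch function must be non-NULL.", call. = FALSE)
  }

  structure(
    list(
      name = name,
      description = description,
      list_stations_fn = list_stations_fn,
      fetchers = fetchers,
      supported = supported
    ),
    class = "sketch_adapter"
  )
}

# An adapter with one fetch function is accepted.
ok <- sketch_adapter(
  "demo", "Demo source.", function() "D001",
  fetch_flows_fn = function(station_id, start_date, end_date) NULL
)

# An adapter with no fetch functions is rejected.
bad <- tryCatch(
  sketch_adapter("demo", "Demo source.", function() "D001"),
  error = function(e) conditionMessage(e)
)
```

A logical vector like supported is one plausible way to derive the per-data-type columns that hc_list_sources() reports, although the source does not say how those columns are computed.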
Realtime schema (fetch_flows_fn / fetch_levels_fn)

| Column | Type | Notes |
|---|---|---|
| station_id | chr | As provided by the caller |
| timestamp | POSIXct UTC | Sub-daily observations |
| value | dbl | |
| parameter | chr | "water_discharge" or "water_level" |
| unit | chr | Canonical form after normalization (e.g. "m3/s", "m") |
| provider_name | chr | Must equal the adapter name |
| quality_code | chr | Raw provider quality code; NA if unavailable |
| qf_desc | chr | Provider description of the quality code; NA if unavailable |

Daily schema (fetch_daily_flows_fn / fetch_daily_levels_fn)

Same as the realtime schema above, but with date (Date) in place of timestamp (POSIXct).
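
While developing an adapter, it can help to check a fetch function's output against these two schemas by hand. The sketch below uses base R data frames instead of tibbles so that it runs without extra packages. check_schema(), realtime_spec, and daily_spec are hypothetical names for illustration, not part of hydrocan; the package's own validator may check more or differently.

```r
# Hypothetical checker; column names and types are taken from the schema
# tables in this vignette.
realtime_spec <- list(
  station_id    = is.character,
  timestamp     = function(x) inherits(x, "POSIXct"),
  value         = is.double,
  parameter     = is.character,
  unit          = is.character,
  provider_name = is.character,
  quality_code  = is.character,
  qf_desc       = is.character
)

# The daily schema swaps `timestamp` (POSIXct) for `date` (Date).
daily_spec <- realtime_spec
names(daily_spec)[names(daily_spec) == "timestamp"] <- "date"
daily_spec$date <- function(x) inherits(x, "Date")

check_schema <- function(df, spec, adapter_name) {
  missing_cols <- setdiff(names(spec), names(df))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "), call. = FALSE)
  }
  for (col in names(spec)) {
    if (!spec[[col]](df[[col]])) {
      stop("Column '", col, "' has the wrong type.", call. = FALSE)
    }
  }
  if (!all(df$provider_name == adapter_name)) {
    stop("provider_name must equal the adapter name.", call. = FALSE)
  }
  invisible(TRUE)
}

obs <- data.frame(
  station_id = "MP001",
  timestamp = as.POSIXct("2024-01-01 00:00:00", tz = "UTC"),
  value = 12.5,
  parameter = "water_discharge",
  unit = "m3/s",
  provider_name = "myprov",
  quality_code = NA_character_,
  qf_desc = NA_character_,
  stringsAsFactors = FALSE
)

realtime_ok <- check_schema(obs, realtime_spec, "myprov")

# The same rows fail the daily schema because there is no `date` column.
daily_msg <- tryCatch(
  check_schema(obs, daily_spec, "myprov"),
  error = function(e) conditionMessage(e)
)
```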
Stations schema (list_stations_meta_fn)

| Column | Type | Notes |
|---|---|---|
| station_id | chr | |
| station_name | chr | |
| provider_name | chr | Must equal the adapter name |
| longitude | dbl | |
| latitude | dbl | |
| elevation_m | dbl | NA if unavailable |
| period_start | Date | NA if unavailable |
| period_end | Date | NA if station is still active |
| notes | list | Adapter-specific metadata; NULL per row if unused |

When you call hc_read_flows(), the router:
1. Calls list_stations_fn() on every registered adapter.
2. Matches each requested station to the adapter that serves it, unless you pass source = explicitly. Station IDs must be unambiguous across the registry.
3. Wraps each fetch in tryCatch so a failure for one station does not abort the whole request.
4. Combines the results with dplyr::bind_rows().

Passing source = "adaptername" restricts the router to that adapter, but it still calls list_stations_fn() for that adapter and checks that the requested station is present before fetching data.
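
The routing behaviour described above can be sketched in a few lines of base R. Everything here is an assumption made for illustration: make_adapter(), registry, and route_flows() are hypothetical names, the data frames carry only three columns, and do.call(rbind, ...) stands in for dplyr::bind_rows(). The real router may report ambiguous or unknown stations differently.

```r
# Two hypothetical in-memory adapters; names and data are illustrative only.
make_adapter <- function(name, stations, fail = character()) {
  list(
    name = name,
    list_stations_fn = function() stations,
    fetch_flows_fn = function(station_id, start_date, end_date) {
      if (station_id %in% fail) stop("upstream error for ", station_id)
      data.frame(
        station_id = station_id,
        value = 1,
        provider_name = name,
        stringsAsFactors = FALSE
      )
    }
  )
}

registry <- list(
  alpha = make_adapter("alpha", c("A1", "A2"), fail = "A2"),
  beta  = make_adapter("beta", "B1")
)

route_flows <- function(station_id, start_date, end_date, source = NULL) {
  # Restrict to one adapter when `source` is given.
  adapters <- if (is.null(source)) registry else registry[source]

  # Ask each candidate adapter which stations it serves.
  served <- lapply(adapters, function(a) a$list_stations_fn())

  results <- lapply(station_id, function(id) {
    owner <- names(served)[vapply(served, function(s) id %in% s, logical(1))]
    if (length(owner) != 1L) {
      stop("Station ", id, " matched ", length(owner), " adapters.", call. = FALSE)
    }
    # A failing station yields a warning and NULL instead of aborting.
    tryCatch(
      adapters[[owner]]$fetch_flows_fn(id, start_date, end_date),
      error = function(e) {
        warning("Skipping ", id, ": ", conditionMessage(e), call. = FALSE)
        NULL
      }
    )
  })

  # Drop failed stations, then stack the rest.
  do.call(rbind, Filter(Negate(is.null), results))
}

# "A2" fails upstream, so only "A1" and "B1" come back.
flows <- suppressWarnings(
  route_flows(c("A1", "A2", "B1"), "2024-01-01", "2024-01-02")
)
```

Resolving the owner before fetching is what makes unambiguous station IDs a requirement: a station claimed by two adapters has no single place to send the request.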
hc_list_sources() returns a tibble of all registered
adapters with their descriptions and a logical column per data type
indicating what each adapter supports. hc_read_stations()
queries all adapters for station metadata, skipping those that do not
implement list_stations_meta_fn.
The hydroquebec adapter wraps the Hydro-Quebec open data portal, which provides flow measurements at Hydro-Quebec reservoir facilities via an Opendatasoft REST API. No authentication is required.
Key characteristics:
- Station IDs look like "3-230".
- Flow data only (parameter = "water_discharge"); no water level.
- The approval column is NA for all records (the source does not publish approval status); quality_flag carries the source’s point type field.

Station listing and data access:
library(hydrocan)

# Sub-daily (hourly) flows
flows <- hc_read_flows(
  station_id = "3-230",
  start_date = Sys.Date() - 5,
  end_date = Sys.Date(),
  source = "hydroquebec"
)

# Source-native daily flows
daily <- hc_read_daily_flows(
  station_id = "3-230",
  start_date = Sys.Date() - 5,
  end_date = Sys.Date(),
  source = "hydroquebec"
)

The adapter pages through the API (100 records per request) and
filters the returned records to the requested date range in R, because
the API stores split_date as a text field rather than a
datetime field.
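
The page-then-filter pattern can be sketched without touching the network. In the example below, fetch_page() is a hypothetical stand-in for one API request, all_records is fabricated demonstration data, and split_date holds plain dates for simplicity. Only the page size of 100 and the text-typed split_date field come from the description above; the adapter's actual code in R/hydroquebec.R is not shown here and may differ.

```r
# Fabricated demonstration data: 250 records with split_date stored as text.
all_records <- data.frame(
  split_date = format(as.Date("2024-01-01") + 0:249),
  value = seq_len(250) * 1.0,
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for one API request: returns up to `limit` records
# starting after `offset`.
fetch_page <- function(offset, limit = 100L) {
  idx <- seq_len(nrow(all_records))
  all_records[idx > offset & idx <= offset + limit, , drop = FALSE]
}

fetch_all_filtered <- function(start_date, end_date, page_size = 100L) {
  pages <- list()
  offset <- 0L
  repeat {
    page <- fetch_page(offset, page_size)
    if (nrow(page) == 0L) break
    pages[[length(pages) + 1L]] <- page
    if (nrow(page) < page_size) break # a short page is the last page
    offset <- offset + page_size
  }
  if (length(pages) == 0L) return(data.frame())

  records <- do.call(rbind, pages)

  # split_date arrives as text, so parse it before comparing.
  parsed <- as.Date(records$split_date)
  keep <- parsed >= as.Date(start_date) & parsed <= as.Date(end_date)
  records[keep, , drop = FALSE]
}

recent <- fetch_all_filtered("2024-04-01", "2024-04-10")
nrow(recent) # 10
```

Filtering after download costs extra requests, but it avoids relying on server-side date comparisons against a field that is stored as text.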
Source code: R/hydroquebec.R.
Registered via:
hydrocan_adapter_hydroquebec <- function() {
  new_hydrocan_adapter(
    "hydroquebec",
    paste(
      "Hydro-Quebec open data (Opendatasoft platform).",
      "Flow data only; no water level.",
      "Rolling window of approximately 10 days - historical data is not available."
    ),
    .hq_list_stations,
    fetch_flows_fn = .hq_fetch_flows,
    fetch_daily_flows_fn = .hq_fetch_daily_flows,
    list_stations_meta_fn = .hq_list_stations_meta
  )
}

Adapters are registered at load time in
R/hydrocan-package.R. Use hc_list_sources() to
see all currently registered sources and which data types each
supports.
Suppose you want to add a hypothetical provincial network called “MyProv” that exposes a JSON API. The steps are:
Create R/myprov.R:
.MYPROV_URL <- "https://data.myprov.ca/api/hydro"

# Return the station IDs this adapter can serve.
.myprov_list_stations <- function() {
  resp <- httr2::request(.MYPROV_URL) |>
    httr2::req_url_query(endpoint = "stations", format = "json") |>
    httr2::req_perform() |>
    httr2::resp_body_json(simplifyVector = TRUE)

  resp$station_id # character vector
}

# Fetch sub-daily flows for one station and map them onto the realtime schema.
.myprov_fetch_flows <- function(station_id, start_date, end_date) {
  resp <- httr2::request(.MYPROV_URL) |>
    httr2::req_url_query(
      endpoint = "timeseries",
      station = station_id,
      from = format(start_date),
      to = format(end_date),
      format = "json"
    ) |>
    httr2::req_perform() |>
    httr2::resp_body_json(simplifyVector = TRUE)

  tibble::tibble(
    station_id = station_id,
    timestamp = as.POSIXct(resp$timestamp, tz = "UTC"),
    value = as.numeric(resp$discharge_cms),
    parameter = "water_discharge",
    unit = "m3/s",
    provider_name = "myprov", # must equal the adapter name
    quality_code = as.character(resp$quality_code),
    qf_desc = NA_character_
  )
}

# Bind the name, description, and functions into an adapter.
hydrocan_adapter_myprov <- function() {
  new_hydrocan_adapter(
    "myprov",
    "MyProv provincial hydrometric network. Sub-daily flows only.",
    .myprov_list_stations,
    fetch_flows_fn = .myprov_fetch_flows
  )
}

If your source also provides daily data, levels, or station metadata,
supply the corresponding optional function arguments. Only the
capabilities you implement will be advertised by
hc_list_sources().
Some sources do not expose a station-listing endpoint. In those
cases, bundle a character vector of known station IDs directly in the
package and return it from list_stations_fn:
.MYPROV_STATIONS <- c("MP001", "MP002", "MP003")
.myprov_list_stations <- function() .MYPROV_STATIONS

The tradeoff is that the list must be maintained manually as the
network changes. The router only requires that
list_stations_fn() return a character vector; how that
vector is produced is left entirely to the adapter.
Add one line to the .onLoad block in
R/hydrocan-package.R:
Tests for adapters are written against a mock adapter rather than
hitting the live network. This keeps the test suite fast and fully
offline. The pattern, established in
tests/testthat/helper-mocks.R, is:
1. Define a list_stations_fn that returns a hardcoded character vector.
2. Build the adapter with new_hydrocan_adapter().
3. Register it with local_register_adapter(), which restores the prior registry state on exit.

.myprov_stations <- c("MP001", "MP002")

.myprov_mock_fetch_flows <- function(station_id, start_date, end_date) {
  dates <- seq(as.Date(start_date), as.Date(end_date), by = "day")
  tibble::tibble(
    station_id = station_id,
    timestamp = as.POSIXct(dates, tz = "UTC"),
    value = seq_along(dates) * 1.0,
    parameter = "water_discharge",
    unit = "m3/s",
    provider_name = "myprov",
    quality_code = NA_character_,
    qf_desc = NA_character_
  )
}

mock_myprov_adapter <- new_hydrocan_adapter(
  "myprov",
  "Mock MyProv adapter for offline testing.",
  function() .myprov_stations,
  fetch_flows_fn = .myprov_mock_fetch_flows
)

test_that("myprov adapter returns correct schema", {
  local_register_adapter(mock_myprov_adapter)
  result <- hc_read_flows(
    station_id = "MP001",
    start_date = "2024-01-01",
    end_date = "2024-01-03",
    source = "myprov"
  )
  expect_s3_class(result, "hydrocan_realtime")
  expect_equal(nrow(result), 3L)
})

local_register_adapter() and
local_clear_registry() are defined in
tests/testthat/helper-mocks.R and are available to all test
files automatically.
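
The idea of restoring the registry on exit can be shown with a scoped helper in base R. This is a sketch under assumptions, not the code in helper-mocks.R: .registry and with_registered() are hypothetical names, the registry is assumed to be an environment keyed by adapter name, and the helper takes the code to run as an argument instead of hooking into the caller's frame the way a local_*() function would.

```r
# Hypothetical registry: an environment keyed by adapter name.
.registry <- new.env(parent = emptyenv())

with_registered <- function(adapter, code) {
  # Snapshot the registry, then restore it when this function exits,
  # whether `code` succeeds or throws.
  snapshot <- as.list(.registry, all.names = TRUE)
  on.exit({
    rm(list = ls(.registry, all.names = TRUE), envir = .registry)
    for (nm in names(snapshot)) {
      assign(nm, snapshot[[nm]], envir = .registry)
    }
  }, add = TRUE)

  assign(adapter$name, adapter, envir = .registry)
  force(code) # `code` is evaluated lazily, so it sees the registered adapter
}

mock <- list(name = "myprov")

# Inside the call the adapter is registered; afterwards it is gone again.
inside <- with_registered(
  mock,
  exists("myprov", envir = .registry, inherits = FALSE)
)
after <- exists("myprov", envir = .registry, inherits = FALSE)
```

Because the restore runs in on.exit(), a failing expectation inside one test cannot leak a mock adapter into the next.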
validate_hydrocan_schema() is called automatically after
every data-fetching API call (hc_read_flows(),
hc_read_daily_flows(), hc_read_levels(),
hc_read_daily_levels()). It will stop with a clear message
if:
It also normalises the unit column: common variants such
as "m³/s", "cms", or "m^3/s" are
all mapped to the canonical "m3/s". Unrecognised unit
strings pass through unchanged with a warning, identifying the raw
string so it can be added to the mapping table in
R/schema.R.
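
The unit mapping can be pictured as a lookup with a warning for misses. The sketch below is an assumption about shape only: normalise_unit(), unit_variants, and unit_canonical are hypothetical names, and the actual mapping table in R/schema.R may contain other entries.

```r
# Hypothetical mapping table; the real one lives in R/schema.R and may differ.
# "m\u00b3/s" is the superscript-three spelling of cubic metres per second.
unit_variants  <- c("m3/s", "m\u00b3/s", "cms", "m^3/s", "m")
unit_canonical <- c("m3/s", "m3/s", "m3/s", "m3/s", "m")

normalise_unit <- function(unit) {
  idx <- match(unit, unit_variants)

  # Unrecognised strings are reported by name and left unchanged.
  unknown <- unique(unit[is.na(idx) & !is.na(unit)])
  if (length(unknown) > 0) {
    warning(
      "Unrecognised unit(s): ", paste(unknown, collapse = ", "),
      call. = FALSE
    )
  }

  out <- unit
  out[!is.na(idx)] <- unit_canonical[idx[!is.na(idx)]]
  out
}

mapped <- normalise_unit(c("cms", "m^3/s", "m"))

# An unknown unit passes through, with a warning naming the raw string.
passed <- suppressWarnings(normalise_unit("ft3/s"))
```

Naming the raw string in the warning is what makes the table easy to extend: the message tells you exactly which entry to add.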