Working with multiple APIs


One of pixieweb’s strengths is its ability to connect to any PX-Web instance with the same interface. This vignette shows how to compare data across national statistics agencies.

The honest truth about cross-country comparison: The pixieweb functions work identically across APIs, but the data is not harmonised. Table IDs, variable names, and code systems differ between countries. The workflow is: find a comparable table in each country (the hard part), then use identical pixieweb code to fetch and combine the results.

Available APIs

library(pixieweb)

px_api_catalogue()

pixieweb ships with a catalogue of known PX-Web instances. You can also connect to any PX-Web API by providing a full URL.

Connecting to multiple agencies

scb <- px_api("scb", lang = "en")      # Sweden (v2)
ssb <- px_api("ssb", lang = "en")      # Norway (v2)
statfi <- px_api("statfi", lang = "en") # Finland (v1)

Each API object stores the base URL, language, API version, and configuration (cell limits, rate limits):

scb
ssb

API version differences

PX-Web has two API versions:

- v1: Legacy, POST-only data queries, no search endpoint. Table discovery requires walking a folder hierarchy.
- v2: Modern, GET+POST data queries, full-text search, codelists endpoint, saved queries.

pixieweb handles both versions transparently. The user-facing functions have the same signatures — only the internal request building differs.

Some selection helpers are v2-only: px_bottom(), px_from(), px_to(), and px_range() will raise an informative error if used against a v1 API.
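
When working with several agencies at once, it can help to check up front which connections are v2 before reaching for the v2-only helpers. The following is only a sketch: the objects are mock lists standing in for px_api() results, and the assumption that api$version holds the string "v2" is not confirmed here, so inspect a real API object to see the actual format.

```r
# Mock stand-ins for px_api() objects. The version values mirror the
# comments in the connection example; the "v1"/"v2" string format is an assumption.
apis <- list(
  scb    = list(version = "v2"),
  ssb    = list(version = "v2"),
  statfi = list(version = "v1")
)

# Flag the connections where v2-only selection helpers should be usable.
is_v2 <- vapply(apis, function(api) identical(api$version, "v2"), logical(1))

v2_apis <- names(apis)[is_v2]
v1_apis <- names(apis)[!is_v2]
```

With real objects, the same check could decide whether to use px_range() or to fall back to a helper that works on both versions, such as px_top().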

Cross-country comparison example

Suppose you want to compare population data across Sweden and Norway. The table IDs and variable codes will differ, but the workflow is identical:

library(dplyr)
library(purrr)

# Find population tables in each country
scb_tables <- get_tables(scb, query = "population")
ssb_tables <- get_tables(ssb, query = "population")

# Explore a table from each
scb_tables |> table_describe(max_n = 3)
ssb_tables |> table_describe(max_n = 3)

Note that table IDs are completely different between countries, and variable names may also differ (“Region” in SCB vs other names elsewhere). Always run variable_describe() on each table before building your query:

# Build a query with prepare_query() for quick exploration;
# the data itself is fetched later by get_data()
scb_q <- prepare_query(scb, "TAB638",
  Region = "00",       # "Riket" (whole country)
  Tid = px_top(5),
  ContentsCode = "BE0101N1" # Population
)

# Norwegian table IDs are different — explore to find the right one
ssb_vars <- get_variables(ssb, "05803")
ssb_vars |> variable_describe()

Combining results

Since get_data() returns standard tibbles with a table_id column, you can bind results from different APIs:

results <- list(
  sweden = get_data(scb, query = scb_q),
  norway = get_data(ssb, "05803",
    ContentsCode = "Personer",
    Tid = px_top(5)
  )
)

# .id = "country" adds a column tracking which list element each row
# came from — essential for traceability after binding
bind_rows(results, .id = "country")
# NOTE: column names may differ between countries. If so, you may need
# to rename() before bind_rows() to align them.
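
The rename step mentioned in the note above might look like the following. This is a minimal sketch using placeholder tibbles rather than real get_data() output: the column names `Tid` and `year` and all values are invented for illustration, so check the actual column names of your own results first.

```r
library(dplyr)

# Placeholder tibbles standing in for two get_data() results.
# Column names and values are invented for illustration only.
sweden <- tibble(table_id = "TAB638", Tid = c("2022", "2023"), value = c(10, 11))
norway <- tibble(table_id = "05803", year = c("2022", "2023"), value = c(5, 6))

# Rename the mismatched column so both tibbles share one schema.
norway_aligned <- rename(norway, Tid = year)

# Bind the aligned tibbles; .id records the list element name per row.
combined <- bind_rows(
  list(sweden = sweden, norway = norway_aligned),
  .id = "country"
)
```

Without the rename, bind_rows() would still succeed but would keep both `Tid` and `year` as separate columns padded with NA, which is easy to miss.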

Tips for cross-agency work

- Language matters. Labels are often language-dependent, whereas codes and table IDs are language-independent. lang = "en" gives the most consistent labels across countries.
- Table structure varies. Swedish tables may have “Region” while Finnish tables have “Alue”. Run get_variables() |> variable_describe() on each table before writing queries.
- API limits differ. SCB allows ~100 000 cells per request; other agencies may allow fewer. Check api$config$max_cells for the limit. prepare_query() respects it automatically.
- v1 vs v2. Not all agencies have migrated to v2. The v2-only selection helpers (px_bottom(), px_from(), px_to(), px_range()) raise an informative error if used against a v1 API. Check api$version and the catalogue’s versions column.
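
To get a feel for the cell limit, note that the size of a request is the product of the number of values selected for each variable. The sketch below estimates that size by hand. The API object is a mock list reproducing only the api$config$max_cells field mentioned above, and the selection sizes are invented numbers, not taken from any real table.

```r
# Mock API object; a real one comes from px_api(). Only the max_cells
# field named in the tips above is reproduced, with an assumed value.
api <- list(config = list(max_cells = 100000))

# Invented selection sizes: how many values are picked per variable.
n_selected <- c(Region = 290, Tid = 200, ContentsCode = 2)

# Cells in the result = product of the per-variable selection sizes.
cells <- prod(n_selected)

# TRUE when the request fits in a single call.
fits <- cells <= api$config$max_cells

# Lower bound on the number of requests needed if the query is split.
n_requests <- ceiling(cells / api$config$max_cells)
```

Here the estimate exceeds the assumed limit, so the query would need at least two requests. Narrowing one variable, for example with px_top(), is often the simplest way to bring it back under the limit.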

Next steps

- Data model & advanced features: vignette("introduction-to-pixieweb") covers codelists, wide output, and query composition.
- Quick refresher: vignette("a-quickstart") for the single-API basics.
