The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

Quick start guide to pixieweb

pixieweb makes it easy to download open statistical data from PX-Web APIs — the platform used by Statistics Sweden (SCB), Statistics Norway (SSB), Statistics Finland, and many others. This vignette walks you from zero to a tidy tibble in five steps.

Step 1: Connect to an API

library(pixieweb)

scb <- px_api("scb", lang = "en")
scb

px_api() accepts a short alias ("scb", "ssb", "statfi") or a full URL. Use px_api_catalogue() to list known instances.

Step 2: Find a table

PX-Web organises data into tables. Each table holds a data cube with one or more dimensions (called variables). Use get_tables() to search:

tables <- get_tables(scb, query = "population")
tables

The result is a tibble. You can narrow it further on the client side with table_search(), and inspect tables with table_describe():

tables |>
  table_search("municipal") |>
  table_describe(max_n = 3, format = "md")

table_describe() now shows the subject path, time period range, and data source alongside the title — making it much easier to pick the right table.

Step 3: Explore variables

Once you have a table ID, inspect what variables (dimensions) it has:

vars <- get_variables(scb, "TAB683")
vars |> variable_describe()

Each variable has a set of available values (codes). Look at a specific variable’s values:

vars |> variable_values("Region")

Step 4: Fetch data

Now you know which variables the table has and what values are available. Pass your selections to get_data():

pop <- get_data(scb, "TAB638",
  Region = c("0180", "1480"),
  ContentsCode = "*",
  Tid = px_top(5)
)
pop

Selection helpers like px_top(), px_from(), and px_range() let you select values without knowing exact codes. Use them when you want “the latest N periods” or “everything from 2020 onward” rather than typing out specific year codes.

Optional shortcut: prepare_query()

You can skip this section if you prefer the direct approach above. prepare_query() inspects the table and fills in sensible defaults — handy when you don’t want to specify every variable:

q <- prepare_query(scb, "TAB638", Region = c("0180", "1480"))

It prints a summary of what was chosen and why. When you’re happy, pass the query to get_data():

pop <- get_data(scb, query = q)

Set maximize_selection = TRUE to automatically include as many variables as the API’s cell limit allows:

q <- prepare_query(scb, "TAB638",
  Region = c("0180"),
  maximize_selection = TRUE
)

Step 5: Work with the result

The result is a standard tibble. Use your favourite tidyverse tools:

library(ggplot2)

pop |>
  ggplot(aes(x = Tid, y = value, colour = Region_text)) +
  # One line per region
  geom_line(aes(group = Region_text)) +
  # Separate panel for each measure (Population, Deaths, etc.)
  facet_wrap(~ ContentsCode_text, scales = "free_y") +
  # Rotate x-axis labels to avoid overlap
  theme(axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1)) +
  labs(
    title = "Population over time",
    caption = px_cite(pop)  # Auto-generated data citation
  )

Notice the _text suffix: get_data() returns both raw code columns (Region = "0180") and human-readable label columns (Region_text = "Stockholm"). Use _text columns for display and plotting; use the raw codes for filtering and joining.

Other useful helpers:

Next steps

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.