
diseasystore: Google Health COVID-19 Open Data

library(diseasystore)

The Google COVID-19 data repository is a comprehensive open repository of COVID-19 data.

This vignette shows how to use (some of) this data through the diseasystore package.

First, it is a good idea to copy the relevant Google COVID-19 data files locally and store that location as an option for the package. DiseasystoreGoogleCovid19 uses only the age-stratified metrics for COVID-19, so only a subset of the repository needs to be downloaded.

# First we set the path we want to use as an option
options(
  "diseasystore.DiseasystoreGoogleCovid19.source_conn" =
    file.path("local", "path")
)

# Ensure folder exists
source_conn <- diseasyoption("source_conn", "DiseasystoreGoogleCovid19")
if (!dir.exists(source_conn)) {
  dir.create(source_conn, recursive = TRUE, showWarnings = FALSE)
}

# Define the Google files to download
google_files <- c("by-age.csv", "demographics.csv", "index.csv", "weather.csv")

# Download each file unless a local copy already exists
purrr::walk(google_files, ~ {
  url <- paste0(diseasyoption("remote_conn", "DiseasystoreGoogleCovid19"), .)

  destfile <- file.path(
    diseasyoption("source_conn", "DiseasystoreGoogleCovid19"),
    .
  )

  if (!file.exists(destfile)) {
    download.file(url, destfile)
  }
})
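Before moving on, it can be useful to confirm that every expected file is present in the source folder. The following is a minimal base-R sketch; `missing_files()` is a hypothetical helper (not part of diseasystore), and a temporary folder stands in for the configured `source_conn` location.

```r
# Hypothetical helper: return the expected files that are absent from `dir`
missing_files <- function(dir, files) {
  files[!file.exists(file.path(dir, files))]
}

# A temporary folder stands in for the configured source_conn location
demo_dir <- file.path(tempdir(), "google_covid_demo")
dir.create(demo_dir, recursive = TRUE, showWarnings = FALSE)

# Pretend that only two of the four files were downloaded
file.create(file.path(demo_dir, c("by-age.csv", "index.csv")))

google_files <- c("by-age.csv", "demographics.csv", "index.csv", "weather.csv")
still_missing <- missing_files(demo_dir, google_files)
print(still_missing)
```

In practice, `demo_dir` would be replaced by the value returned from `diseasyoption("source_conn", "DiseasystoreGoogleCovid19")`.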

Each diseasystore requires a database to store its features in. The connection to this database should be configured before use and can be stored in the package's options.

# We define target_conn as a function that opens a DBI connection to the DB
target_conn <- \() DBI::dbConnect(RSQLite::SQLite())
options(
  "diseasystore.DiseasystoreGoogleCovid19.target_conn" = target_conn
)
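Calling `RSQLite::SQLite()` without a database name connects to a temporary database that is discarded when the connection closes, so features computed in one session are not available in the next. If persistence is wanted, one option is a file-backed SQLite database. This is a sketch under that assumption; `db_path` is a hypothetical location, and the small `demo` table only illustrates that data survives a reconnect.

```r
library(DBI)

# Hypothetical location for the database file; adjust to your setup
db_path <- file.path(tempdir(), "diseasystore_demo.sqlite")

# Same pattern as above, but every call connects to the same file
target_conn <- \() DBI::dbConnect(RSQLite::SQLite(), dbname = db_path)
options(
  "diseasystore.DiseasystoreGoogleCovid19.target_conn" = target_conn
)

# Write through one connection ...
conn <- target_conn()
DBI::dbWriteTable(conn, "demo", data.frame(x = 1:3), overwrite = TRUE)
DBI::dbDisconnect(conn)

# ... and read the table back through a fresh one
conn <- target_conn()
n_rows <- nrow(DBI::dbReadTable(conn, "demo"))
DBI::dbDisconnect(conn)
```

For real use, `db_path` should point outside `tempdir()`, since R removes that folder at the end of the session.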

Once the files are downloaded and the target DB is configured, we can initialize the diseasystore that uses the Google COVID-19 data.

ds <- DiseasystoreGoogleCovid19$new()

With the diseasystore configured, we can query the feature store directly.

# We can see all the available features in the feature store
ds$available_features
#>  [1] "n_population"    "age_group"       "country_id"      "country"        
#>  [5] "region_id"       "region"          "subregion_id"    "subregion"      
#>  [9] "n_hospital"      "n_deaths"        "n_positive"      "n_icu"          
#> [13] "n_ventilator"    "min_temperature" "max_temperature"
# And then retrieve a feature from the feature store
ds$get_feature(feature = "n_hospital",
               start_date = as.Date("2020-01-01"),
               end_date = as.Date("2020-06-01"))
#> # Source:   table<`dbplyr_BZD6Hfrgej`> [?? x 5]
#> # Database: sqlite 3.45.2 []
#>   key_location key_age_bin n_hospital valid_from valid_until
#>   <chr>        <chr>            <dbl>      <dbl>       <dbl>
#> 1 AR           0                   NA      18262       18263
#> 2 AR           0                   NA      18263       18264
#> 3 AR           0                   NA      18264       18265
#> 4 AR           0                   NA      18265       18266
#> 5 AR           0                   NA      18266       18267
#> # ℹ more rows
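In the output above, valid_from and valid_until are printed as numbers rather than dates, because SQLite has no dedicated date type. Assuming the values count days since 1970-01-01 (18262 then corresponds to 2020-01-01, the requested start_date), they can be converted after the data has been brought into R. The sketch below uses a small hand-built data frame that copies the first rows of the output, rather than a live query.

```r
# Rows copied from the output above; valid_from / valid_until are day counts
feature <- data.frame(
  key_location = "AR",
  key_age_bin  = "0",
  n_hospital   = NA_real_,
  valid_from   = c(18262, 18263, 18264),
  valid_until  = c(18263, 18264, 18265)
)

# Interpret the numbers as days since 1970-01-01 (the epoch of R's Date class)
feature$valid_from  <- as.Date(feature$valid_from,  origin = "1970-01-01")
feature$valid_until <- as.Date(feature$valid_until, origin = "1970-01-01")

print(feature)
```

With a real query, the same conversion could be applied after `dplyr::collect()` has pulled the lazy table into a local data frame.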
