You can install CDECRetrieve the usual way.
The goal of CDECRetrieve is to create a workflow for R users working with CDEC data. We believe that a well-defined workflow is easier to automate and less prone to error (or at least makes errors easier to catch). To do this, we create “services” out of the different endpoints available through the CDEC site. Many of the ideas behind the package came from using dataRetrieval from USGS and the NOAA CDO API.
We start by exploring locations of interest. The CDEC site provides a web form with many options, the CDEC station search.
The package exposes this functionality through cdec_stations(). Although it does not (currently) map all options in the web form, it covers the most used ones, namely station id, nearby city, river basin, hydro area and county. At least one of these parameters must be supplied, and a combination of them can be supplied to refine the search.
library(CDECRetrieve)
cdec_stations(station_id = "kwk") # return metadata for KWK
#> # A tibble: 1 x 9
#> station_id name river_basin county longitude latitude elevation
#> <chr> <chr> <chr> <chr> <dbl> <dbl> <chr>
#> 1 kwk sacramento r… sacramento… shasta -122. 40.6 596  
#> # ... with 2 more variables: operator <chr>, state <chr>
# show all CDEC stations near San Francisco
cdec_stations(nearby_city = "san francisco")
#> # A tibble: 3 x 9
#> station_id name river_basin county longitude latitude elevation
#> <chr> <chr> <chr> <chr> <dbl> <dbl> <chr>
#> 1 sfn san franci… sf bay san fra… -122. 37.8 150  
#> 2 cx2 daily x2 c… sf bay san fra… -122. 37.8 0  
#> 3 ggt golden gate sf bay san fra… -122. 37.8 0  
#> # ... with 2 more variables: operator <chr>, state <chr>
# show all locations in the sf bay river basin
cdec_stations(river_basin = "sf bay")
#> # A tibble: 25 x 9
#> station_id name river_basin county longitude latitude elevation
#> <chr> <chr> <chr> <chr> <dbl> <dbl> <chr>
#> 1 mrh marsh cree… sf bay contra… -122. 37.9 740  
#> 2 sfn san franci… sf bay san fr… -122. 37.8 150  
#> 3 lsm los medonos sf bay contra… -122. 38.0 130  
#> 4 dvb danville l… sf bay contra… -122. 37.8 364  
#> 5 snn san andreas sf bay san ma… -122. 37.6 456  
#> 6 umn mt. umunhu… sf bay santa … -122. 37.2 3090 &nb…
#> 7 rhl richmond c… sf bay contra… -122. 37.9 55  
#> 8 mas main ave d… sf bay santa … -1000. 100.0 9999
#> 9 cx2 daily x2 c… sf bay san fr… -122. 37.8 0  
#> 10 ccp concord pa… sf bay contra… -122. 38.0 558  
#> # ... with 15 more rows, and 2 more variables: operator <chr>, state <chr>
# show all stations in Tehama county
cdec_stations(county = "tehama")
#> # A tibble: 45 x 9
#> station_id name river_basin county longitude latitude elevation
#> <chr> <chr> <chr> <chr> <dbl> <dbl> <chr>
#> 1 teh sacramento … sacramento… tehama -122. 40.0 213  
#> 2 crg corning air… sacramento… tehama -122. 39.9 294  
#> 3 sbb sacramento … sacto vly … tehama -122. 40.3 186  
#> 4 btw nf battle c… sacramento… tehama -122. 40.4 1200 &nb…
#> 5 lgs log spring sacto vly … tehama -123. 39.8 5100 &nb…
#> 6 atp anthony peak stony cr tehama -123. 39.8 6200 &nb…
#> 7 bnd sacramento … sacramento… tehama -122. 40.3 286  
#> 8 dvr davis ranch cottonwood… tehama -122. 40.4 550  
#> 9 sh1 sheet iron … sacramento… tehama -123. 39.5 6500 &nb…
#> 10 bas south fork … sacramento… tehama -122. 40.4 997  
#> # ... with 35 more rows, and 2 more variables: operator <chr>, state <chr>
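As noted earlier, search parameters can be combined to narrow the results. A minimal sketch, assuming that supplying several arguments returns only the stations matching all of them. The argument names are the ones used above; the county and basin pairing is just an illustration.

```r
library(CDECRetrieve)

# Assumption: the filters are applied together, so this should return only
# Tehama county stations that CDEC lists under the "stony cr" river basin.
cdec_stations(county = "tehama", river_basin = "stony cr")
```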
Since we are simply exploring for locations of interest, it may be useful to map them for visual inspection. CDECRetrieve provides a simple function to do exactly this: map_stations().
The same can be done with leaflet functions.
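A rough sketch of the leaflet route, assuming the station table has the name, station_id, longitude and latitude columns printed above. map_with_leaflet() is a hypothetical helper written for this example, not a CDECRetrieve function. It also drops rows with out-of-range coordinates, such as the -1000 / 100.0 entry in the sf bay listing.

```r
library(leaflet)

# Hypothetical helper (not part of CDECRetrieve): draw stations on a leaflet map.
# Expects the columns shown above: name, station_id, longitude, latitude.
map_with_leaflet <- function(stations) {
  # Keep only rows whose coordinates are present and physically possible.
  valid <- !is.na(stations$longitude) & !is.na(stations$latitude) &
    abs(stations$longitude) <= 180 & abs(stations$latitude) <= 90
  stations <- stations[valid, , drop = FALSE]

  leaflet(stations) %>%
    addTiles() %>%
    addCircleMarkers(lng = ~longitude, lat = ~latitude,
                     label = ~name, popup = ~station_id)
}

# Example (requires network access to CDEC):
# map_with_leaflet(CDECRetrieve::cdec_stations(river_basin = "sf bay"))
```

Writing the map by hand like this mainly helps when you want to add your own layers or styling on top of the stations.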
After exploring stations in a desired location, we can start focusing on the datasets available at those locations.
station <- "sha"
cdec_datasets(station)
#> # A tibble: 21 x 6
#> sensor_number sensor_name sensor_units duration start end
#> <int> <chr> <chr> <chr> <date> <date>
#> 1 2 precipitatio… inches daily 2003-10-01 2018-05-15
#> 2 2 precipitatio… inches monthly 1953-10-01 2018-05-15
#> 3 6 reservoir el… feet daily 1985-01-01 2018-05-15
#> 4 6 reservoir el… feet hourly 1993-12-09 2018-05-15
#> 5 8 full natural… cfs daily 1987-05-31 2018-05-15
#> 6 15 reservoir st… af daily 1985-01-01 2018-05-15
#> 7 15 reservoir st… af hourly 1994-06-24 2018-05-15
#> 8 15 reservoir st… af monthly 1953-10-01 2018-05-15
#> 9 22 reservoir st… af daily 1993-10-03 2018-05-15
#> 10 23 reservoir ou… cfs daily 1987-01-05 2018-05-15
#> # ... with 11 more rows
Since all of these functions return a tidy data frame, we can use dplyr to filter, mutate and explore. Here we look for datasets at Shasta that report storage.
library(magrittr)
cdec_datasets("sha") %>%
  dplyr::filter(grepl("storage", sensor_name))
#> # A tibble: 5 x 6
#> sensor_number sensor_name sensor_units duration start end
#> <int> <chr> <chr> <chr> <date> <date>
#> 1 15 reservoir sto… af daily 1985-01-01 2018-05-15
#> 2 15 reservoir sto… af hourly 1994-06-24 2018-05-15
#> 3 15 reservoir sto… af monthly 1953-10-01 2018-05-15
#> 4 22 reservoir sto… af daily 1993-10-03 2018-05-15
#> 5 94 reservoir top… af daily 2000-10-24 2018-05-15
Take note of the sensor number and duration; these will be needed for querying data in the next section.
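The duration column holds labels such as "daily", while the query function takes a short duration code. A small sketch of that translation, assuming the single-letter codes "h", "d" and "m" for hourly, daily and monthly. Only "d" appears in this document, and duration_to_code() is a hypothetical helper, not a package function.

```r
# Hypothetical lookup from the duration labels printed by cdec_datasets()
# to duration codes. Only "d" is confirmed by this document; "h" and "m"
# are assumed by analogy.
duration_to_code <- function(duration) {
  codes <- c(hourly = "h", daily = "d", monthly = "m")
  code <- unname(codes[duration])
  if (anyNA(code)) {
    stop("no known duration code for: ",
         paste(unique(duration[is.na(code)]), collapse = ", "))
  }
  code
}

duration_to_code("daily")
#> [1] "d"
```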
Now that we have a location, a parameter of interest and a duration, we can start to query for actual data.
sha_storage_daily <- cdec_query(station = "sha", sensor_num = "15",
                                dur_code = "d", start_date = "2018-01-01",
                                end_date = Sys.Date())
sha_storage_daily
#> # A tibble: 135 x 5
#> agency_cd location_id datetime parameter_cd parameter_value
#> <chr> <chr> <dttm> <chr> <dbl>
#> 1 CDEC SHA 2018-01-01 00:00:00 15 3203249.
#> 2 CDEC SHA 2018-01-02 00:00:00 15 3202064.
#> 3 CDEC SHA 2018-01-03 00:00:00 15 3203723.
#> 4 CDEC SHA 2018-01-04 00:00:00 15 3206566.
#> 5 CDEC SHA 2018-01-05 00:00:00 15 3210358.
#> 6 CDEC SHA 2018-01-06 00:00:00 15 3215097.
#> 7 CDEC SHA 2018-01-07 00:00:00 15 3217003.
#> 8 CDEC SHA 2018-01-08 00:00:00 15 3229391.
#> 9 CDEC SHA 2018-01-09 00:00:00 15 3237014.
#> 10 CDEC SHA 2018-01-10 00:00:00 15 3242032.
#> # ... with 125 more rows
Once again, the data is in a tidy form.
We can plot it with ggplot2.
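One possible way to draw such a plot, assuming the datetime and parameter_value columns shown in the query result above. plot_cdec_series() is a hypothetical helper for illustration, not something the package provides.

```r
library(ggplot2)

# Hypothetical helper: line plot of a cdec_query() result, using the
# datetime and parameter_value columns shown above.
plot_cdec_series <- function(data, y_label = "value") {
  ggplot(data, aes(x = datetime, y = parameter_value)) +
    geom_line() +
    labs(x = NULL, y = y_label)
}

# Example, assuming sha_storage_daily from the query above:
# plot_cdec_series(sha_storage_daily, y_label = "Storage (af)")
```

Because every query returns the same columns, the same helper should work for other stations and sensors by changing only the axis label.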