
bluebike: A Data Package for Bluebike users

Ziyue Yang and Tianshu Zhang

2022-05-04

# needed packages in vignette
library(bluebike)
library(dplyr)
library(leaflet)

Summary

This package provides Boston Blue Bikes trip history data obtained from the Blue Bikes System Data website. Users can import all monthly trip history data from 2020 to 2022 as a cleaned data set that is ready for analysis.

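The package also includes a sample data set of 1,000 trips sampled from February 2022 (trip_history_sample) and a full data set with information about all available stations (station_data). The functions demonstrated in this vignette are import_month_data, trip_distance, station_distance, and station_radius.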

The package is intended to help Blue Bikes operations analyze trip data and improve the shared bike service based on how riders actually use it. It is also an easy-to-use tool for anyone interested in analyzing and visualizing Blue Bikes trip data.

Basic Usage

library(bluebike)
library(dplyr)

Retrieve data online

import_month_data lets users retrieve one month of trip data from the Blue Bikes System Data website, given a year and a month.

jan2015 <- import_month_data(2015, 1)
#> Rows: 7840 Columns: 15
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr  (4): start station name, end station name, usertype, birth year
#> dbl  (9): tripduration, start station id, start station latitude, start stat...
#> dttm (2): starttime, stoptime
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
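The column specification printed above shows raw headers with spaces (for example, start station name), while the sample data used later in this vignette has snake_case names such as start_station_name. How the package performs that cleaning is not shown here. The following is only a minimal base R sketch of one way to do it; clean_names_sketch and the toy data frame raw are hypothetical names introduced for illustration, and the values are made up.

```r
# Hypothetical helper (not part of bluebike): convert raw headers to snake_case.
clean_names_sketch <- function(df) {
  nm <- tolower(trimws(names(df)))
  # Replace each run of non-alphanumeric characters with a single underscore
  nm <- gsub("[^a-z0-9]+", "_", nm)
  names(df) <- nm
  df
}

# Toy data frame using three of the raw headers printed above (made-up values)
raw <- data.frame(
  tripduration = c(542, 438),
  `start station name` = c("Station A", "Station B"),
  `end station name` = c("Station B", "Station A"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

cleaned <- clean_names_sketch(raw)
names(cleaned)
```

After this step the columns can be referenced without backticks, which is what the dplyr pipelines in the next section rely on.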

Data Wrangling

# Count the trips that start at each station in the sample data
stations <- trip_history_sample %>% 
  group_by(start_station_name) %>% 
  summarize(trips_from = n())
head(stations)
#> # A tibble: 6 × 2
#>   start_station_name                        trips_from
#>   <chr>                                          <int>
#> 1 175 N Harvard St                                   8
#> 2 191 Beacon St                                      3
#> 3 30 Dane St                                         7
#> 4 359 Broadway - Broadway at Fayette Street          4
#> 5 606 American Legion Hwy at Canterbury St           1
#> 6 699 Mt Auburn St                                   5

# Sample 1000 trips from January 2015 and compute the distance of each trip
jan_distance <- jan2015 %>% 
  sample_n(1000) %>% 
  trip_distance()
mean_jan_distance <- mean(jan_distance$distance)

mean_jan_distance
#> 3215.401 [m]

# Find the 5 stations closest to the point at longitude -71.13, latitude 42.36
top_5_station <- station_distance(-71.13, 42.36) %>%
  head(5)

top_5_station
#>         distance station_ID                                   station_name
#> 210 124.9942 [m]     A32040                                  Honan Library
#> 3   427.6489 [m]     A32019                               175 N Harvard St
#> 221 606.1752 [m]     A32011 Innovation Lab - 125 Western Ave at Batten Way
#> 74  660.5163 [m]     A32005               Brighton Mills - 370 Western Ave
#> 380 954.2026 [m]     A32001    Union Square - Brighton Ave at Cambridge St
#>               station_position docks
#> 210 POINT (-71.12852 42.36027)    15
#> 3    POINT (-71.12916 42.3638)    18
#> 221  POINT (-71.1246 42.36371)    19
#> 74  POINT (-71.13776 42.36155)    15
#> 380 POINT (-71.13731 42.35333)    19
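The distances above are reported in meters from the query point to each station. As a rough cross-check, a great-circle (haversine) distance can be computed in base R from the station_position coordinates printed in the output. This is a sketch that assumes a spherical Earth with a radius of 6,371,000 m; haversine_m is a hypothetical helper, and the package may well use a different geodesic method, so small differences from the values above are expected.

```r
# Hypothetical helper (not part of bluebike): haversine distance in meters,
# assuming a spherical Earth of radius 6,371,000 m.
haversine_m <- function(lon1, lat1, lon2, lat2, radius = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Coordinates copied from the station_position column printed above
nearby <- data.frame(
  station_name = c("Union Square - Brighton Ave at Cambridge St",
                   "Honan Library",
                   "175 N Harvard St"),
  longitude = c(-71.13731, -71.12852, -71.12916),
  latitude  = c(42.35333, 42.36027, 42.3638),
  stringsAsFactors = FALSE
)

# Distance from the query point (-71.13, 42.36), then sort nearest first
nearby$distance_m <- haversine_m(-71.13, 42.36,
                                 nearby$longitude, nearby$latitude)
nearby <- nearby[order(nearby$distance_m), ]
nearby
```

With these inputs the sketch ranks Honan Library first, at roughly 125 m, in line with the output above.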

Data Visualization via Leaflet

library(leaflet)

# Plot every station as a small blue circle on the default OpenStreetMap tiles
BostonMap <- leaflet(data = station_data) %>% 
  addTiles() %>% 
  addCircleMarkers(lng = station_data$longitude, 
                   lat = station_data$latitude, 
                   radius = 0.1, 
                   color = "blue")

BostonMap

# Stations within radius r = 500 of the point at longitude -71.13, latitude 42.36
station_500 <- station_radius(-71.13, 42.36, r = 500)

station_500
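The output of station_radius is not shown here. Assuming r is given in meters, like the distances returned by station_distance, the result should correspond to the stations whose distance from the point is at most 500. A minimal sketch of that filter, using the five distances printed in the station_distance output earlier in this vignette (the data frame nearest is a hypothetical stand-in, not a package object):

```r
# Stand-in for the station_distance() output shown earlier (distances in meters)
nearest <- data.frame(
  station_name = c("Honan Library",
                   "175 N Harvard St",
                   "Innovation Lab - 125 Western Ave at Batten Way",
                   "Brighton Mills - 370 Western Ave",
                   "Union Square - Brighton Ave at Cambridge St"),
  distance_m = c(124.9942, 427.6489, 606.1752, 660.5163, 954.2026),
  stringsAsFactors = FALSE
)

r <- 500  # assumed to be in meters
within_r <- nearest[nearest$distance_m <= r, ]
within_r$station_name
```

Under that assumption, only Honan Library and 175 N Harvard St fall within the radius; the third-nearest station is already about 606 m away.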

Contributors
