GTFS Tables

2019-03-04

GTFS Table Relationships

Figure: GTFS relationship diagram. Source: Wikimedia, user -stk.

Additional tables calculated by tidytransit

In addition to the tables described above, tidytransit attempts to calculate the following tables when one uses read_gtfs():

Simple Features

Frequencies/Headways

tidytransit prints a message regarding these tables on reading any GTFS file.

Example GTFS Table Joins

Route Frequencies to Routes

For example, we can join the standard routes table, which holds the ‘route_short_name’ variable, to routes_frequencies.
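As a minimal sketch of this kind of join, the snippet below uses small made-up data frames rather than a real feed. The table name routes_frequency and the median_headways column are assumptions for illustration; check the names in your own gtfs object before adapting it.

```r
library(dplyr)

# Toy stand-in for the standard GTFS routes table (values are made up).
routes <- tibble(
  route_id         = c("A", "C", "E"),
  route_short_name = c("A", "C", "E")
)

# Toy stand-in for a calculated route frequency table.
# Column names are assumptions, not necessarily tidytransit's output.
routes_frequency <- tibble(
  route_id        = c("A", "C"),
  median_headways = c(8, 10)  # minutes
)

# Attach the human-readable short name to each calculated frequency.
routes_with_names <- routes_frequency %>%
  inner_join(routes, by = "route_id") %>%
  select(route_id, route_short_name, median_headways)

print(routes_with_names)
```

An inner join keeps only routes that have a calculated frequency; a left join from routes would keep every route and leave the headway missing where none was calculated.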

You can do the same with ‘simple features tables’.

For example, under the hood, plot(gtfs_obj) is doing this:
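The general idea can be sketched as follows. This is not tidytransit’s actual implementation, only an illustration with made-up geometries; the names routes_sf, routes_frequency, and median_headways are assumptions.

```r
library(sf)
library(dplyr)

# Toy route geometries (coordinates are made up), standing in for a
# simple features routes table.
routes_sf <- st_sf(
  route_id = c("A", "C"),
  geometry = st_sfc(
    st_linestring(rbind(c(-73.99, 40.70), c(-73.95, 40.80))),
    st_linestring(rbind(c(-74.00, 40.71), c(-73.96, 40.82))),
    crs = 4326
  )
)

# Toy frequency table keyed by route_id (column names are assumptions).
routes_frequency <- tibble(
  route_id        = c("A", "C"),
  median_headways = c(8, 10)  # minutes
)

# Join the frequencies onto the geometries; the result is still an sf object.
routes_sf_frequencies <- routes_sf %>%
  inner_join(routes_frequency, by = "route_id")

# Draw the routes, coloured by headway.
plot(routes_sf_frequencies["median_headways"])
```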

Headways at Stops for a Route

A more complex example of cross-table joins is to pull the stops and their headways for a given route.

This simple question is a great way to begin to understand a lot about the GTFS data model.

First, we’ll need to find a ‘service_id’. A service_id identifies a set of dates on which service runs, so it tells us which trips, and therefore which stops, a route serves on a given day of the week and time of year.

When calculating frequencies, tidytransit tries to guess which service_id is representative of a standard weekday by walking through a set of steps. Below we’ll just do some of this manually.
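One plausible heuristic for such a guess, shown here only as a sketch and not as the steps tidytransit actually takes, is to pick the service_id that runs Monday through Friday and has the most trips. The data below are made up.

```r
library(dplyr)

# Toy calendar table using the standard GTFS day-of-week columns.
calendar <- tibble(
  service_id = c("WKD", "SAT", "SUN"),
  monday     = c(1, 0, 0),
  tuesday    = c(1, 0, 0),
  wednesday  = c(1, 0, 0),
  thursday   = c(1, 0, 0),
  friday     = c(1, 0, 0),
  saturday   = c(0, 1, 0),
  sunday     = c(0, 0, 1)
)

# Toy trips table: each trip belongs to exactly one service_id.
trips <- tibble(
  trip_id    = paste0("t", 1:6),
  service_id = c("WKD", "WKD", "WKD", "SAT", "SAT", "SUN")
)

# Services that run on every weekday.
weekday_services <- calendar %>%
  filter(monday == 1, tuesday == 1, wednesday == 1,
         thursday == 1, friday == 1) %>%
  pull(service_id)

# Among those, take the service with the most trips (assumed heuristic).
representative_service <- trips %>%
  filter(service_id %in% weekday_services) %>%
  count(service_id, name = "n_trips") %>%
  arrange(desc(n_trips)) %>%
  slice(1) %>%
  pull(service_id)

print(representative_service)
```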

First, let’s look at the calendar.

Then we’ll pull a service_id for the C train on Mondays.
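A minimal sketch of this step with made-up tables: the calendar tells us which services run on Mondays, and trips links those services to routes. The service and trip identifiers are invented for illustration.

```r
library(dplyr)

# Toy tables with the standard GTFS key columns (values are made up).
calendar <- tibble(
  service_id = c("WKD", "SAT"),
  monday     = c(1, 0)
)
routes <- tibble(
  route_id         = c("A", "C"),
  route_short_name = c("A", "C")
)
trips <- tibble(
  trip_id    = c("t1", "t2", "t3"),
  route_id   = c("C", "C", "A"),
  service_id = c("WKD", "SAT", "WKD")
)

# Services that run on Mondays.
monday_services <- calendar %>%
  filter(monday == 1) %>%
  pull(service_id)

# service_ids used by C train trips on those Monday services.
c_train_service_ids <- trips %>%
  inner_join(routes, by = "route_id") %>%
  filter(route_short_name == "C", service_id %in% monday_services) %>%
  distinct(service_id) %>%
  pull(service_id)

print(c_train_service_ids)
```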

Now we’ll filter down through the data model to just stops for that route and service_ids.
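The chain of joins runs from trips to stop_times to stops. The sketch below walks it with made-up tables; the route, service, and stop identifiers are invented for illustration.

```r
library(dplyr)

# Toy tables with the standard GTFS key columns (values are made up).
trips <- tibble(
  trip_id    = c("t1", "t2", "t3"),
  route_id   = c("C", "C", "A"),
  service_id = c("WKD", "SAT", "WKD")
)
stop_times <- tibble(
  trip_id       = c("t1", "t1", "t2", "t3"),
  stop_id       = c("s1", "s2", "s1", "s3"),
  stop_sequence = c(1, 2, 1, 1)
)
stops <- tibble(
  stop_id   = c("s1", "s2", "s3"),
  stop_name = c("First St", "Second St", "Third St")
)

selected_service_ids <- c("WKD")

# trips -> stop_times -> stops, keeping each stop only once.
route_stops <- trips %>%
  filter(route_id == "C", service_id %in% selected_service_ids) %>%
  inner_join(stop_times, by = "trip_id") %>%
  distinct(stop_id) %>%
  inner_join(stops, by = "stop_id")

print(route_stops)
```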

Before we plot them, let’s pull the frequency calculations from the calculated table onto their geometries.
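As a sketch of this join, the snippet below builds point geometries from stop coordinates and attaches a headway to each. The stops_frequency table and its headway column are assumptions for illustration, and the coordinates and values are made up.

```r
library(sf)
library(dplyr)

# Toy stops table with the standard GTFS coordinate columns.
stops <- tibble(
  stop_id  = c("s1", "s2", "s3"),
  stop_lon = c(-73.99, -73.97, -73.95),
  stop_lat = c(40.70, 40.75, 40.80)
)

# Turn the coordinates into point geometries (WGS84).
stops_sf <- st_as_sf(stops, coords = c("stop_lon", "stop_lat"), crs = 4326)

# Toy calculated frequency table (column names are assumptions).
stops_frequency <- tibble(
  stop_id = c("s1", "s2", "s3"),
  headway = c(10, 12, 240)  # minutes
)

# Attach headways to the geometries and plot the stops coloured by headway.
stops_with_headways <- stops_sf %>%
  inner_join(stops_frequency, by = "stop_id")

plot(stops_with_headways["headway"])
```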

Due to the way that schedules

We may, and in fact probably will, see some surprising outliers for headway calculations in this plot.

Calculating headways at stops is tricky for a number of reasons.

One of the main reasons is that GTFS wasn’t meant for this kind of analytical work, so the headway calculations in this package aren’t robust against all of the edge cases of every last service and stop that might be listed in a GTFS feed.

However, I have found that the methods here do a reasonable job of describing transit service headways on routes and stops, provided you keep in mind that GTFS data can be messy for analytical work.

One quick solution to the outlier stops in the plot above is to throw out stops with headways above some unreasonable threshold. For example, we can filter out stops with headways above 100 minutes.
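A minimal sketch of that filter, using a made-up table with a headway column in minutes (the column name is an assumption):

```r
library(dplyr)

# Toy stop headways in minutes; the 240-minute value plays the outlier.
stop_headways <- tibble(
  stop_id = c("s1", "s2", "s3"),
  headway = c(10, 12, 240)
)

max_headway <- 100  # minutes; choose a threshold that suits your analysis

# Keep only stops whose headway is at or below the threshold.
stop_headways_trimmed <- stop_headways %>%
  filter(headway <= max_headway)

print(stop_headways_trimmed)
```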

Of course, which solution works for you will depend on what you’re trying to accomplish.