The goal of bodsr is to provide an easy interface between the Bus Open Data Service (BODS) API and R. The BODS dataset provides fares, timetable and vehicle location information about bus services in England. Further details are available in the BODS API documentation.

Installation

You can install the development version of bodsr from GitHub with:

install.packages("devtools")

devtools::install_github("department-for-transport-public/bodsr")

Usage

bodsr has a range of functions designed to make it easy for you to interrogate the BODS API and receive the results as R data objects.

To begin, you will need to create a BODS account and obtain your BODS access token. You can pass this token to individual bodsr functions, or save it as an environment variable called BODS_KEY, which the functions check automatically.
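As a minimal sketch of the environment-variable route, the token can be set for the current session with base R. The value "your-token-here" is a placeholder, and the .Renviron step is a general R convention rather than something specific to bodsr.

```r
# Set the token for the current R session only.
# "your-token-here" is a placeholder; note this overwrites any BODS_KEY
# already set in the session.
Sys.setenv(BODS_KEY = "your-token-here")

# Confirm that a non-empty value is now visible to R.
key_is_set <- nzchar(Sys.getenv("BODS_KEY"))
key_is_set

# To persist the token across sessions, add a line such as
#   BODS_KEY=your-token-here
# to your ~/.Renviron file and restart R.
```

Keeping the token in an environment variable rather than in a script avoids committing it to version control by accident.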

Fare and timetable metadata

The BODS API initially returns metadata about the fare and timetable data it holds. You can use this metadata to see what data is available and to find the links for downloading full data sets.

The functions get_timetable_metadata() and get_fares_metadata() return records for timetable and fare metadata respectively. You can filter the records on a number of variables, including:

For fares:

  • National Operator Codes
  • Status
  • Bounding box

For timetables:

  • National Operator Codes
  • Status
  • BODS compliance
  • Modified date
  • Admin area
  • Search terms

Check individual function documentation and BODS API help for further details on these variables.
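The following is a rough sketch of a metadata query, assuming both functions can be called with their default arguments once BODS_KEY is set. The filter argument names are not spelled out here, so the sketch lists them with formals() instead of guessing; the can_query guard and the tryCatch() wrappers are illustrative additions so the script still runs without the package, a token, or a network connection.

```r
# Only attempt a live query when bodsr is installed and a token is available.
can_query <- requireNamespace("bodsr", quietly = TRUE) &&
  nzchar(Sys.getenv("BODS_KEY"))

timetable_meta <- NULL
fares_meta <- NULL

if (can_query) {
  # List the arguments each function accepts before relying on any name.
  print(names(formals(bodsr::get_timetable_metadata)))
  print(names(formals(bodsr::get_fares_metadata)))

  # Default calls (assumption: the token is read from BODS_KEY).
  # tryCatch() returns NULL instead of stopping if the request fails.
  timetable_meta <- tryCatch(
    bodsr::get_timetable_metadata(),
    error = function(e) NULL
  )
  fares_meta <- tryCatch(
    bodsr::get_fares_metadata(),
    error = function(e) NULL
  )
}

# Inspect the top-level structure of the result (NULL if nothing was queried).
str(timetable_meta, max.level = 1)
```

Once the argument names are known from formals() or the help pages, the same calls can be narrowed with the filters listed above.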

Location data

Granular vehicle-level location data can be extracted from the API in two different formats (see the BODS documentation for more detail on each format):

  • get_location_gtfs(): returns location data in GTFS-RT format
  • get_location_xml(): returns location data in SIRI-VM XML format

As with fare and timetable data, location data can be filtered on a range of parameters, including location bounding box, provider, line and vehicle reference.
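A hedged sketch of a location query follows. It assumes get_location_gtfs() can be called with default arguments and a token in BODS_KEY; the exact names and formats of the filter arguments (for example, how a bounding box is specified) are not given here, so they are listed rather than guessed. The guard and error handling are illustrative.

```r
# Only attempt a live query when bodsr is installed and a token is available.
can_query <- requireNamespace("bodsr", quietly = TRUE) &&
  nzchar(Sys.getenv("BODS_KEY"))

locations <- NULL

if (can_query) {
  # Show which filter arguments are available for the GTFS-RT function.
  print(names(formals(bodsr::get_location_gtfs)))

  # Unfiltered call (assumption: all filters are optional).
  # tryCatch() returns NULL instead of stopping if the request fails.
  locations <- tryCatch(
    bodsr::get_location_gtfs(),
    error = function(e) NULL
  )
}

# Preview the first few records if anything came back.
if (!is.null(locations)) {
  print(utils::head(locations))
}
```

An unfiltered query may return a large number of vehicle records, so in practice it is usually worth applying one of the filters described above.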

Timetable data

Once timetable metadata has been returned, it can be passed to the get_timetable_data() function, which parses the xml/zip files specified and returns the timetable data as a list of dataframes: one dataframe per parsed file, with one bus line per row.

Please note that, due to the size of the data files involved, queries using this function can be slow to run and can use a large amount of RAM.
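One way to keep run time and memory manageable is to parse only a few metadata rows at a time. The sketch below assumes the metadata comes back as a dataframe and that get_timetable_data() takes it as its first argument; the two-row limit, the meta_small name and the guard are illustrative choices.

```r
# Only attempt a live query when bodsr is installed and a token is available.
can_query <- requireNamespace("bodsr", quietly = TRUE) &&
  nzchar(Sys.getenv("BODS_KEY"))

timetable_data <- NULL

if (can_query) {
  meta <- tryCatch(
    bodsr::get_timetable_metadata(),
    error = function(e) NULL
  )

  if (is.data.frame(meta) && nrow(meta) > 0) {
    # Parse only the first two metadata rows to limit download and RAM use.
    meta_small <- utils::head(meta, 2)

    timetable_data <- tryCatch(
      bodsr::get_timetable_data(meta_small),
      error = function(e) NULL
    )
  }
}

# Number of list elements returned (0 if nothing was parsed).
length(timetable_data)
```

Working through the metadata in small batches like this makes it easier to see how long each file takes before committing to a full run.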

Install

install.packages('bodsr')

  • Monthly Downloads: 191
  • Version: 0.1.0
  • License: MIT + file LICENSE
  • Maintainer: Francesca Bryden
  • Last Published: February 11th, 2023

Functions in bodsr (0.1.0)

  • extract_line_level_data: Open data from a single-line metadata table, whether it is in zip or xml format
  • not_null: Join together a value and an associated API string if the value is not NULL
  • xml_file_counter: Count the number of xml files included within a provided metadata dataframe, whether the provided file links are xml or zip
  • not_null_date: Join together a date value and an associated API string if the value is not NULL
  • open_all_xml: Open every xml file within a zip object and extract data of interest from it using a given function
  • poss_xml: Try to read an xml file using read_xml; where this fails, quietly return a NULL value
  • get_location_gtfs: Return GTFS-RT location data from the 'BODS' API
  • get_location_xml: Return XML vehicle location data from the 'BODS' API
  • noc_lookup: Lookup between operator names and national operator codes ('NOC')
  • get_timetable_data: Extract line-level timetable data from all rows of the provided metadata table
  • line_level_xml: Pull a table of relevant values from specified nodes in the xml
  • get_fares_metadata: Return fares metadata from the 'BODS' API
  • find_node_value: Search an xml file for a specific named node and return the value(s) stored in it
  • get_timetable_metadata: Return timetable metadata from the 'BODS' API
  • count_nodes: Search an xml file for a specific named node and count the number of instances