rfars

The goal of rfars is to facilitate transportation safety analysis by simplifying the process of extracting data from official crash databases. The National Highway Traffic Safety Administration collects and publishes a census of fatal crashes in the Fatality Analysis Reporting System and a sample of fatal and non-fatal crashes in the Crash Report Sampling System (an evolution of the General Estimates System). The Fatality and Injury Reporting System Tool allows users to query these databases and can produce simple tables and graphs. This suffices for simple analysis, but often leaves researchers wanting more. Digging any deeper, however, involves a time-consuming process of downloading annual ZIP files and attempting to stitch them together, after first combing through immense data dictionaries to determine the required variables and table names.

rfars allows users to download the last 10 years of FARS and GES/CRSS data with just one line of code. The result is a full, rich dataset ready for mapping, modeling, and other downstream analysis. Codebooks with variable definitions and value labels support an informed analysis of the data (see vignette("Searchable Codebooks", package = "rfars") for more information). Helper functions are also provided to produce common counts and comparisons.

Installation

You can install the latest version of rfars from GitHub with:

# install.packages("devtools")
devtools::install_github("s87jackson/rfars")

or the CRAN stable release with:

install.packages("rfars")

Then load rfars and some helpful packages:

library(rfars)
library(dplyr)

Getting and Using Data

get_fars() and get_gescrss() are the primary functions of the rfars package. They download and process data files directly from NHTSA’s FTP site, load prepared data stored on your local machine, or (as of Version 2.0) pull prepared data from Zenodo. The data files hosted on Zenodo are stable, have DOIs, and replicate the data that would be produced by get_fars() and get_gescrss(), but load in a fraction of the time.

Both functions take the parameters years and states (FARS) or regions (GES/CRSS). Because the source data files are organized by year, years determines how many file sets are downloaded or loaded, while states/regions filters the resulting dataset. Downloading and processing these files can take several minutes. Before downloading, rfars tells you that it is about to download files and asks for your permission. To skip this prompt, set proceed = TRUE. You can use the dir and cache parameters to save an RDS file to your local machine: dir specifies the directory, and cache names the file (be sure to include the .rds file extension).
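As a rough illustration of the parameters just described, the sketch below collects them in a list and passes them to get_fars(). The parameter names come from the text above; the specific values (the year range, the use of a full state name for states, the temporary directory, and the file name) are assumptions chosen for illustration, and the download is switched off by default because it can take several minutes.

```r
# Arguments named in the text above; every value here is illustrative.
fars_args <- list(
  years   = 2021:2023,                # assumed range; one file set per year
  states  = "Virginia",               # assumption: a full state name is accepted
  dir     = tempdir(),                # directory for the saved RDS file
  cache   = "fars_va_2021_2023.rds",  # file name, including the .rds extension
  proceed = TRUE                      # skip the permission prompt
)

# Set to TRUE to actually fetch the data (requires rfars and a connection).
run_download <- FALSE

if (run_download && requireNamespace("rfars", quietly = TRUE)) {
  myFARS_va <- do.call(rfars::get_fars, fars_args)
}
```

Keeping the arguments in a list makes it easy to reuse the same settings, for example to call get_gescrss() with regions in place of states.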

Executing the code below will download the prepared FARS and GES/CRSS databases for 2014-2023.

myFARS <- get_fars(proceed = TRUE)
myCRSS <- get_gescrss(proceed = TRUE)

get_fars() and get_gescrss() each return a list of six dataframes: flat, multi_acc, multi_veh, multi_per, events, and codebook.
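One quick way to get oriented in a list of dataframes like this is to tabulate the size of each element. The helper below, table_sizes(), is hypothetical (it is not part of rfars), and it is demonstrated on a small stand-in list that only borrows the six element names; the real tables hold many more rows and columns.

```r
# Hypothetical helper (not part of rfars): rows and columns per list element.
table_sizes <- function(x) {
  data.frame(
    table = names(x),
    rows  = vapply(x, nrow, integer(1)),
    cols  = vapply(x, ncol, integer(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Stand-in list using the six element names listed above (made-up contents).
element_names <- c("flat", "multi_acc", "multi_veh",
                   "multi_per", "events", "codebook")
mock_fars <- setNames(
  lapply(element_names, function(nm) {
    data.frame(id = c("x", "y"), stringsAsFactors = FALSE)
  }),
  element_names
)

sizes <- table_sizes(mock_fars)
print(sizes)

# Individual tables are reached with the usual list syntax.
flat <- mock_fars$flat
```

With real data, the same helper could be called as table_sizes(myFARS).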

The tables below show records for randomly selected crashes to illustrate the content and structure of the data. The tables are transposed for readability.

Each row in the flat dataframe corresponds to a person involved in a crash. As there may be multiple people and/or vehicles involved in one crash, some variable values are repeated within a crash or vehicle. Each crash is uniquely identified by id, which is a combination of year and st_case. Note that st_case is not unique across years: for example, st_case 510001 will appear in each year. The id variable is intended to avoid this issue. The GES/CRSS data include a weight variable that indicates how many crashes each row represents.
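The toy example below illustrates both points using made-up records rather than real FARS or CRSS data. The exact way rfars builds id is not shown in this document, so pasting year and st_case together is an assumption, as are the person column and the choice to keep one row per crash before summing weights (the weight is assumed to repeat on every person row of a crash).

```r
# Toy person-level records; all values are made up.
flat_toy <- data.frame(
  year    = c(2022, 2022, 2023, 2023, 2023),
  st_case = c(510001, 510001, 510001, 510002, 510002),
  person  = c(1, 2, 1, 1, 2)  # hypothetical person counter
)

# st_case alone merges the 2022 and 2023 crashes that share number 510001.
n_by_st_case <- length(unique(flat_toy$st_case))  # 2

# Assumed composite key: year and st_case pasted together.
flat_toy$id <- paste0(flat_toy$year, flat_toy$st_case)
n_crashes <- length(unique(flat_toy$id))          # 3

# Toy weighted sample: two person rows for crash "a", one for crash "b".
crss_toy <- data.frame(
  id     = c("a", "a", "b"),
  weight = c(120.5, 120.5, 45),
  stringsAsFactors = FALSE
)

# Keep one row per crash so a crash's weight is not counted once per person.
crash_level <- crss_toy[!duplicated(crss_toy$id), ]
estimated_crashes <- sum(crash_level$weight)      # 165.5
```

Summing the weights over all person rows would instead give 286, which overstates the estimate because crash "a" would be counted twice.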

Version: 2.0.2

License: CC0

Maintainer: Steve Jackson

Last Published: October 22nd, 2025

Functions in rfars (2.0.2)

parse_sas_format: (Internal) Parse formats.sas instead of using a .sas7bcat file
large_trucks: (Internal) Find crashes involving large trucks
motorcycle: (Internal) Find crashes involving motorcycles
validate_states: (Internal) Validate user-provided list of states
make_id: (Internal) Generate an ID variable
use_imp: (Internal) use_imp
pedalcyclist: (Internal) Find crashes involving pedalcyclists
drugs: (Internal) Find crashes involving drugs
pedestrian: (Internal) Find crashes involving pedestrians
pedbike: (Internal) Find crashes involving pedestrians or bicyclists
use_gescrss: (Internal) Use GESCRSS data files
use_fars: (Internal) Use FARS data files
prep_gescrss: Prepare downloaded GES/CRSS files for use
prep_fars: Prepare downloaded FARS files for use
read_basic_sas: (Internal) Takes care of basic SAS file reading
make_all_numeric: (Internal) Make id and year numeric
%>%: Pipe operator
police_pursuit: (Internal) Find crashes involving police pursuits
speeding: (Internal) Find crashes involving speeding
rollover: (Internal) Find crashes involving rollovers
road_depart: (Internal) Find crashes involving road departures
alcohol: (Internal) Find crashes involving alcohol
distracted_driver: (Internal) Find crashes involving distracted drivers
download_gescrss: (Internal) Download GES/CRSS data files
appendRDS: (Internal) Append RDS files
counts: Generate counts
download_fars: (Internal) Download FARS data files
check_internet_connection: (Internal) Check internet connection
bicyclist: (Internal) Find crashes involving bicyclists
annual_counts: Annual Crash Counts by Risk Factors
compare_counts: Compare counts
driver_age: (Internal) Find crashes involving drivers of a given age
get_sas_attrs: (Internal) Check SAS attributes
get_fars: Get FARS data
get_gescrss: Get GES/CRSS data
hit_and_run: (Internal) Find hit and run crashes
import_multi: (Internal) Import the multi_ files
gescrss_codebook: GESCRSS Codebook
fars_codebook: FARS Codebook
geo_relations: Synonym table for various geographical scales