bisg (version 0.1.0)

compute_p_r_cond_s_g: Computes the probability a person is of a specific racial group, conditioned on surname and geolocation.

Description

This is a utility function for performing Bayesian Improved Surname Geocoding (BISG). It operates on a voter file and on counts obtained from the Census Bureau via eiCompare's helper function.
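For orientation, the textbook BISG update combines a surname-based probability with a geography-based one using Bayes' rule. The base-R sketch below illustrates that update for a single voter. It is not this package's implementation, and every number and variable name in it (p_r_given_s, geo_unit_counts, total_counts) is made up for illustration.

```r
# Toy inputs; all numbers are made up for illustration.
race_cols <- c("whi", "bla", "his", "asi", "oth")

# P(race | surname) for one hypothetical surname.
p_r_given_s <- c(whi = 0.05, bla = 0.01, his = 0.90, asi = 0.02, oth = 0.02)

# Counts by group in the voter's geographic unit, and across all units.
geo_unit_counts <- c(whi = 600, bla = 100, his = 200, asi = 50, oth = 50)
total_counts <- c(whi = 60000, bla = 12000, his = 18000, asi = 6000, oth = 4000)

# P(geography | race): share of each group's total that lives in this unit.
p_g_given_r <- geo_unit_counts / total_counts

# Bayes' rule: P(race | surname, geography) is proportional to
# P(race | surname) * P(geography | race); normalize so the values sum to 1.
unnormalized <- p_r_given_s[race_cols] * p_g_given_r[race_cols]
p_r_given_s_g <- unnormalized / sum(unnormalized)

print(round(p_r_given_s_g, 3))
```

The result is one probability per racial group for that voter, which matches the shape of a single row in the returned tibble.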

Usage

compute_p_r_cond_s_g(
  voter_file,
  geo_counts,
  surname_col,
  geo_col,
  surname_counts = NULL,
  race_cols = c("whi", "bla", "his", "asi", "oth"),
  geo_col_counts = "fips",
  surname_col_counts = "surname"
)

Value

A tibble with rows denoting voters and columns denoting the probability that each voter is of a particular racial group.

Arguments

voter_file

A tibble with one voter per row and a column containing each voter's surname.

geo_counts

A tibble of counts, broken down by constituent group, with one row per geographic unit.

surname_col

A string denoting which column contains the voter surname.

geo_col

A string denoting which column contains the geographic unit ID.

surname_counts

A data frame denoting the frequency with which surnames correspond to different races/ethnicities. If NULL, the Census surname list is used with categories and merging functions from wru. The data frame should contain one column with surnames (specified with the surname_col_counts parameter) and one column for each race/ethnicity group (specified with the race_cols parameter).
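As a rough illustration of that layout, the sketch below builds a small custom surname table whose column names match the defaults of race_cols and surname_col_counts. The surnames and values are invented, and the uppercase spelling is an assumption. This page does not say whether raw counts or proportions are expected, so treat the numbers purely as placeholders.

```r
# Hypothetical surname table; surnames and values are made up for illustration.
surname_counts <- data.frame(
  surname = c("GARCIA", "SMITH", "NGUYEN"),  # uppercase is an assumption
  whi = c(50, 700, 10),
  bla = c(5, 230, 2),
  his = c(920, 25, 5),
  asi = c(15, 5, 960),
  oth = c(10, 40, 23),
  stringsAsFactors = FALSE
)

# Default column names taken from the Usage section.
race_cols <- c("whi", "bla", "his", "asi", "oth")
surname_col_counts <- "surname"

# Check the layout: one surname column plus one column per race/ethnicity group.
stopifnot(all(c(surname_col_counts, race_cols) %in% names(surname_counts)))
```

A table shaped like this would be passed as surname_counts, with race_cols and surname_col_counts adjusted if the columns are named differently.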

race_cols

A character vector naming the columns that contain the racial groups.

geo_col_counts

A string denoting the column in the geo_counts tibble that refers to the geographic unit.

surname_col_counts

A string denoting the column in the surname_counts data frame that contains the surnames.