compute_p_r_cond_s_g: Computes the probability a person is of a specific racial group, conditioned on surname and geolocation.

Description

This is a utility function for performing Bayesian Improved Surname Geocoding (BISG). It operates on a voter file and on race/ethnicity counts per geographic unit, obtained from the Census Bureau via eiCompare's helper function.

Usage

compute_p_r_cond_s_g(
  voter_file,
  geo_counts,
  surname_col,
  geo_col,
  surname_counts = NULL,
  race_cols = c("whi", "bla", "his", "asi", "oth"),
  geo_col_counts = "fips",
  surname_col_counts = "surname"
)

Value

A tibble with rows denoting voters and columns denoting the probability that each voter is of a particular racial group.

Arguments

voter_file: A tibble containing a list of voters (by row), with a column denoting each voter's surname and a column denoting their geographic unit.
geo_counts: A tibble containing counts for each racial group (columns) per geographic unit (rows).
surname_col: A string denoting which column of voter_file contains the voter surname.
geo_col: A string denoting which column of voter_file contains the geographic unit ID.
surname_counts: A dataframe denoting the frequency with which surnames correspond to different races/ethnicities. If NULL, the Census surname list is used, with categories and merging functions from the wru package. The dataframe should contain one column with surnames (specified with the surname_col_counts parameter) and one column for each race/ethnicity group (specified with the race_cols parameter).
race_cols: A character vector naming the columns that contain the racial groups.
geo_col_counts: A string denoting the column in the geo_counts tibble that refers to the geographic unit.
surname_col_counts: A string denoting the column in the surname_counts tibble that contains the surnames.
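
Examples

The quantity this function returns can be illustrated with the standard BISG update, which assumes surname and geography are independent given race: P(r | s, g) is proportional to P(r | s) * P(g | r), normalized over the racial groups. The following is a minimal base R sketch of that calculation, not eiCompare's actual implementation. All data values and the names voters, last_name, block, and surname_probs are hypothetical; only the race column names and the "fips" and "surname" column names are taken from the defaults above.

```r
# Toy inputs (hypothetical numbers, for illustration only).
race_cols <- c("whi", "bla", "his", "asi", "oth")

# Counts of each racial group per geographic unit (rows = units).
geo_counts <- data.frame(
  fips = c("A", "B"),
  whi = c(800, 200),
  bla = c(100, 500),
  his = c(50, 200),
  asi = c(30, 50),
  oth = c(20, 50)
)

# P(r | s): share of each racial group among holders of a surname.
surname_probs <- data.frame(
  surname = c("SMITH", "GARCIA"),
  whi = c(0.70, 0.05),
  bla = c(0.23, 0.01),
  his = c(0.02, 0.92),
  asi = c(0.01, 0.01),
  oth = c(0.04, 0.01)
)

# Voter file: one row per voter, with a surname and a geographic unit.
voters <- data.frame(
  voter_id = 1:3,
  last_name = c("SMITH", "GARCIA", "SMITH"),
  block = c("A", "B", "B")
)

# P(g | r): each group's count in a unit divided by that group's total.
geo_mat <- as.matrix(geo_counts[, race_cols])
p_g_cond_r <- sweep(geo_mat, 2, colSums(geo_mat), "/")
rownames(p_g_cond_r) <- as.character(geo_counts$fips)

p_r_cond_s <- as.matrix(surname_probs[, race_cols])
rownames(p_r_cond_s) <- as.character(surname_probs$surname)

# Look up each voter's surname row and unit row, multiply elementwise,
# then normalize so each voter's probabilities sum to 1.
unnormalized <- p_r_cond_s[as.character(voters$last_name), , drop = FALSE] *
  p_g_cond_r[as.character(voters$block), , drop = FALSE]
p_r_cond_s_g <- unnormalized / rowSums(unnormalized)
rownames(p_r_cond_s_g) <- NULL

# One row per voter, one probability column per racial group.
bisg <- cbind(voters, as.data.frame(p_r_cond_s_g))
print(bisg)
```

The result has the same shape as described under Value: rows are voters and the race columns hold probabilities. In the sketch, geography shifts the surname-only estimate: the two SMITH voters receive different probabilities because they live in units with different group compositions.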