fpr_parity: False Positive Rate parity

Description

This function computes the False Positive Rate (FPR) parity metric.

Usage

fpr_parity(
  data,
  outcome,
  group,
  probs = NULL,
  preds = NULL,
  preds_levels = NULL,
  outcome_base = NULL,
  cutoff = 0.5,
  base = NULL,
  group_breaks = NULL
)

Arguments

data

The dataframe that contains the necessary columns.

outcome

The column name of the actual outcomes.

group

The column name of the sensitive group to examine.

probs

The column name or vector of the predicted probabilities (numeric, between 0 and 1). If not defined, the preds argument must be defined.

preds

The column name or vector of the predicted binary outcome (0 or 1). If not defined, the probs argument must be defined.

preds_levels

The desired levels of the predicted binary outcome. If not defined, levels of the outcome variable are used.

outcome_base

Base level for the target variable used to compute fairness metrics. Default is the first level of the outcome variable.

cutoff

Cutoff to generate predicted outcomes from predicted probabilities. Default set to 0.5.

base

Base level for sensitive group comparison.

group_breaks

If group is continuous (e.g., age): either a numeric vector of two or more unique cut points, or a single number >= 2 giving the number of intervals into which the group feature is to be cut.
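
The two forms accepted by group_breaks mirror the breaks argument of base R's cut(). Whether fpr_parity calls cut() internally is an assumption here; the sketch below only illustrates the two ways a continuous feature can be binned. The age vector is hypothetical toy data.

```r
# Toy ages; hypothetical values for illustration only
age <- c(19, 23, 31, 38, 45, 52, 67)

# Form 1: a single number >= 2 gives the number of equal-width intervals
age_bins_n <- cut(age, breaks = 3)

# Form 2: a numeric vector of cut points gives explicit interval boundaries
age_bins_pts <- cut(age, breaks = c(18, 30, 50, 70))

# Count how many observations fall into each explicit interval
table(age_bins_pts)
```

Explicit cut points are usually easier to interpret in a fairness report, since the resulting subgroups (for example, 18 to 30) do not shift when the data change.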

Value

Metric

Raw false positive rates for all groups and metrics standardized against the base group (false positive rate parity metric). Values lower than that of the reference group mean lower false positive error rates in the selected subgroups.

Metric_plot

Bar plot of the False Positive Rate metric.

Probability_plot

Density plot of predicted probabilities per subgroup. Only plotted if probabilities are defined.

Details

This function computes the False Positive Rate (FPR) parity metric as described by Chouldechova (2017). The false positive rate is calculated by dividing the number of false positives by the number of all actual negatives (irrespective of predicted values), that is, FP / (FP + TN). In the returned named vector, the reference group is assigned 1, while all other groups are assigned values according to whether their false positive rates are lower or higher than that of the reference group. Lower false positive rates are reflected in numbers lower than 1 in the returned named vector, so numbers lower than 1 mean BETTER prediction for the subgroup.
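
The calculation described above can be reproduced by hand on a small example. This is a minimal sketch, not the package's implementation: the toy data frame, the choice of "yes" as the positive class, the `>=` thresholding rule, and standardization by dividing by the reference group's rate are all assumptions made for illustration.

```r
# Hypothetical toy data: actual outcome, predicted probability, sensitive group
toy <- data.frame(
  outcome = c("no", "no", "no", "no", "yes", "no", "no", "no", "no", "yes"),
  prob    = c(0.2, 0.6, 0.3, 0.7, 0.9, 0.1, 0.8, 0.2, 0.3, 0.6),
  group   = rep(c("A", "B"), each = 5)
)

# Turn probabilities into predicted classes (assumed rule: prob >= cutoff)
cutoff <- 0.5
toy$pred <- ifelse(toy$prob >= cutoff, "yes", "no")

# FPR per group: false positives divided by all actual negatives
fpr_by_group <- sapply(split(toy, toy$group), function(d) {
  negatives <- d$outcome == "no"
  sum(d$pred == "yes" & negatives) / sum(negatives)
})

# Parity: each group's FPR relative to the reference group "A" (assumed ratio)
fpr_parity_manual <- fpr_by_group / fpr_by_group[["A"]]

print(fpr_by_group)
print(fpr_parity_manual)
```

In this toy data, group A has 2 false positives among 4 actual negatives (FPR 0.5) and group B has 1 among 4 (FPR 0.25), so under the assumed ratio the reference group A gets 1 and group B gets 0.5, a value below 1 indicating a lower false positive rate than the reference.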

Examples

# NOT RUN {
library(fairness)
data(compas)

# Parity from predicted probabilities, thresholded at 0.4
fpr_parity(data = compas, outcome = 'Two_yr_Recidivism', group = 'ethnicity',
           probs = 'probability', preds = NULL, preds_levels = c('no', 'yes'),
           cutoff = 0.4, base = 'Caucasian')

# Parity from precomputed binary predictions
fpr_parity(data = compas, outcome = 'Two_yr_Recidivism', group = 'ethnicity',
           probs = NULL, preds = 'predicted', preds_levels = c('no', 'yes'),
           cutoff = 0.5, base = 'Hispanic')

# }
