
casimir (version 0.3.3)

compute_propensity_scores: Compute inverse propensity scores

Description

Compute inverse propensity scores based on a label distribution. Propensity scores for extreme multi-label learning were proposed in Jain, H., Prabhu, Y., & Varma, M. (2016). Extreme Multi-label Loss Functions for Recommendation, Tagging, Ranking and Other Missing Label Applications. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 13-17-Aug, 935–944. doi:10.1145/2939672.2939756.
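
For orientation, the propensity model in Jain et al. (2016) can be written as below. Here N is the total number of documents, N_l is the number of documents carrying label l, and A and B are the model parameters. Mapping N to n_docs, N_l to label_freq, and A, B to the arguments a, b is an assumption made for illustration; this page does not state the exact formula the package uses.

```latex
p_l = \frac{1}{1 + C \, e^{-A \log(N_l + B)}},
\qquad
C = (\log N - 1)\,(B + 1)^{A},
\qquad
\frac{1}{p_l} = 1 + C \,(N_l + B)^{-A}
```

Under this model, rare labels have a low propensity and therefore a large inverse propensity score.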

Usage

compute_propensity_scores(label_distribution, a = 0.55, b = 1.5)

Value

A data.frame with columns "label_id", "label_weight".

Arguments

label_distribution

Expects a data.frame with columns "label_id", "label_freq", "n_docs". label_freq is the number of occurrences of a label in the gold standard. n_docs is the total number of documents in the gold standard.

a

A numeric parameter for the propensity score calculation, defaults to 0.55.

b

A numeric parameter for the propensity score calculation, defaults to 1.5.
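
To make the roles of a and b more concrete, here is a minimal base-R sketch of the inverse propensity formula from Jain et al. (2016). The helper name sketch_inverse_propensity and the toy data are hypothetical and not part of casimir, and the sketch assumes the package follows the paper's formula, which this page does not confirm.

```r
# Hypothetical helper, NOT part of casimir: inverse propensity weights
# following Jain et al. (2016). Assumes the package uses the same formula.
sketch_inverse_propensity <- function(label_distribution, a = 0.55, b = 1.5) {
  stopifnot(
    is.data.frame(label_distribution),
    all(c("label_id", "label_freq", "n_docs") %in% names(label_distribution))
  )

  n_docs <- label_distribution$n_docs
  label_freq <- label_distribution$label_freq

  # C = (log N - 1) * (B + 1)^A
  c_const <- (log(n_docs) - 1) * (b + 1)^a

  # 1 / p_l = 1 + C * (N_l + B)^(-A)
  data.frame(
    label_id = label_distribution$label_id,
    label_weight = 1 + c_const * (label_freq + b)^(-a),
    stringsAsFactors = FALSE
  )
}

# Toy input (made up for illustration): three labels in 1000 documents.
toy <- data.frame(
  label_id = c("A", "B", "C"),
  label_freq = c(1, 50, 900),
  n_docs = 1000
)

res <- sketch_inverse_propensity(toy)
print(res)
```

In this sketch the weight decreases as label_freq grows, so infrequent labels are weighted more heavily than frequent ones.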

Examples


library(tidyverse)
library(casimir)

label_distribution <- dnb_label_distribution

compute_propensity_scores(label_distribution)
