This function calculates the effect size of a query.
calculate_es(x, ...)
effect size
an S3 object returned from a query, either by the function query() or by an underlying function such as mac()
additional parameters for the effect size functions
r
for weat: a boolean denoting whether to convert the effect size to a biserial correlation coefficient.
standardize
for weat: a boolean denoting whether to correct the difference by the standard deviation. The standardized version can be interpreted the same way as Cohen's d.
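What standardize changes can be sketched in base R. This is a minimal illustration, not the package's implementation: the vectors s_diff and t_diff are hypothetical per-word association scores invented for the example, and dividing by the standard deviation over all target words follows the definition in Caliskan et al. (2017), which is assumed here rather than confirmed by this page.

```r
# Hypothetical association scores (invented for illustration): for each
# target word, mean cosine similarity with A minus that with B.
s_diff <- c(0.12, 0.08, 0.15, 0.10)    # words in target set S
t_diff <- c(-0.05, 0.01, -0.02, 0.03)  # words in target set T

# Unstandardized effect size: raw difference between the two group means.
raw_es <- mean(s_diff) - mean(t_diff)

# Standardized effect size: the same difference divided by the standard
# deviation of the scores over all target words (assumed definition).
std_es <- raw_es / sd(c(s_diff, t_diff))

print(c(raw = raw_es, standardized = std_es))
```

The standardized value is unit-free, which is what allows it to be read on the same scale as Cohen's d; the raw difference depends on the scale of the similarity scores.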
The following methods are supported.
mac
Mean cosine distance. The value is meaningful only in comparison (e.g. before and after debiasing); a lower value indicates greater association between the target words and the attribute words.
rnd
Sum of all relative norm distances. It equals zero when there is no bias.
rnsb
Kullback-Leibler divergence of the predicted negative probabilities, P, from the uniform distribution. A lower value indicates less bias.
ect
Spearman correlation coefficient of an Embedding Coherence Test. The value ranges from -1 to +1, and a larger value indicates less bias.
weat
The standardized effect size (default) can be interpreted the same way as Cohen's d.
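The quantities described for mac, rnd, rnsb and ect can be sketched in base R on toy data. This is a hedged illustration under stated assumptions, not the package's code: the embedding matrix is random, the word labels (t1, a1, b1, ...) and helper names (cos_sim, euclid, the *_like variables) are hypothetical, and the vector P stands in for the classifier output that rnsb would normally produce.

```r
set.seed(42)

# Toy 5-dimensional "embeddings" (random numbers, purely illustrative).
words <- c("t1", "t2", "t3", "a1", "a2", "b1", "b2")
emb <- matrix(rnorm(length(words) * 5), nrow = length(words),
              dimnames = list(words, NULL))

S <- emb[c("t1", "t2", "t3"), , drop = FALSE]  # target words
A <- emb[c("a1", "a2"), , drop = FALSE]        # attribute set A
B <- emb[c("b1", "b2"), , drop = FALSE]        # attribute set B

cos_sim <- function(u, v) sum(u * v) / sqrt(sum(u^2) * sum(v^2))
euclid  <- function(u, v) sqrt(sum((u - v)^2))

# mac-like: mean cosine distance (1 - cosine similarity) between every
# target word and every word in A; lower means closer association.
mac_like <- mean(apply(S, 1, function(s) {
  mean(apply(A, 1, function(a) 1 - cos_sim(s, a)))
}))

# rnd-like: sum over target words of the difference between the distance
# to the mean A vector and the distance to the mean B vector.
a_bar <- colMeans(A)
b_bar <- colMeans(B)
rnd_like <- sum(apply(S, 1, function(s) euclid(s, a_bar) - euclid(s, b_bar)))

# rnsb-like: Kullback-Leibler divergence of P from the uniform
# distribution; P is a hypothetical vector of normalized probabilities.
kl_from_uniform <- function(p) sum(p * log(p * length(p)))
P <- c(0.5, 0.3, 0.2)
rnsb_like <- kl_from_uniform(P)

# ect-like: Spearman correlation between the similarities of the target
# words to the mean A vector and to the mean B vector.
sim_a <- apply(S, 1, cos_sim, v = a_bar)
sim_b <- apply(S, 1, cos_sim, v = b_bar)
ect_like <- cor(sim_a, sim_b, method = "spearman")

print(c(mac = mac_like, rnd = rnd_like, rnsb = rnsb_like, ect = ect_like))
```

Two properties stated above can be checked directly on the sketch: kl_from_uniform() returns zero for a uniform vector, and the rnd-like sum is zero when the two attribute means coincide.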
Caliskan, A., Bryson, J. J., & Narayanan, A. (2017). Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334), 183-186. doi:10.1126/science.aal4230
Dev, S., & Phillips, J. (2019, April). Attenuating bias in word vectors. In The 22nd International Conference on Artificial Intelligence and Statistics (pp. 879-887). PMLR.
Garg, N., Schiebinger, L., Jurafsky, D., & Zou, J. (2018). Word embeddings quantify 100 years of gender and ethnic stereotypes. Proceedings of the National Academy of Sciences, 115(16), E3635-E3644. doi:10.1073/pnas.1720347115
Manzini, T., Lim, Y. C., Tsvetkov, Y., & Black, A. W. (2019). Black is to criminal as caucasian is to police: Detecting and removing multiclass bias in word embeddings. arXiv preprint arXiv:1904.04047.
Sweeney, C., & Najafian, M. (2019, July). A transparent framework for evaluating unintended demographic bias in word embeddings. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp. 1662-1667).
weat_es(), mac_es(), rnd_es(), rnsb_es(), ect_es()