sweater (version 0.1.8)

semaxis: Characterise word semantics using the SemAxis framework

Description

This function calculates the semantic axis and the word scores using the SemAxis framework proposed in An et al. (2018). If possible, please use query() instead.
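As a rough illustration of the computation, the sketch below follows An et al. (2018): the axis is the difference between the centroids of the two attribute sets, and each target word is scored by its cosine similarity to that axis. It is a minimal sketch, not the package's internal code. The helper `semaxis_sketch()` and the toy matrix `toy_w` are hypothetical, and the axis direction (A minus B) is an assumption.

```r
# Toy 2-dimensional embeddings with words as row names (made-up values).
toy_w <- rbind(
  he       = c( 1.0, 0.0),
  she      = c(-1.0, 0.0),
  engineer = c( 0.5, 0.5),
  nurse    = c(-0.5, 0.5)
)

# Hypothetical helper, not part of sweater.
semaxis_sketch <- function(w, S_words, A_words, B_words) {
  # Axis vector: centroid of A minus centroid of B (assumed direction).
  V <- colMeans(w[A_words, , drop = FALSE]) -
    colMeans(w[B_words, , drop = FALSE])
  # Score: cosine similarity between each target word vector and the axis.
  S_mat <- w[S_words, , drop = FALSE]
  P <- as.vector(S_mat %*% V) / (sqrt(rowSums(S_mat^2)) * sqrt(sum(V^2)))
  names(P) <- S_words
  list(P = P, V = V)
}

res <- semaxis_sketch(toy_w, c("engineer", "nurse"), "he", "she")
res$P  # engineer is about 0.707, nurse about -0.707
```

Under this assumed direction, a positive score means the word sits closer to the A pole and a negative score means it sits closer to the B pole.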

Usage

semaxis(w, S_words, A_words, B_words, l = 0, verbose = FALSE)

Value

A list with class "semaxis" containing the following components:

  • $P for each word in S_words, the score according to SemAxis

  • $V the semantic axis vector

  • $S_words the input S_words

  • $A_words the input A_words

  • $B_words the input B_words

Arguments

w

a numeric matrix of word embeddings, e.g. from read_word2vec()

S_words

a character vector of target words. In a study of gender stereotypes, it can include occupations such as programmer, engineer, and scientist.

A_words

a character vector of the first set of attribute words. In a study of gender stereotypes, it can include words such as man, male, he, his.

B_words

a character vector of the second set of attribute words. In a study of gender stereotypes, it can include words such as woman, female, she, her.

l

an integer indicating the number of words used to augment each word in A and B, selected by cosine similarity; see An et al. (2018). Defaults to 0 (no augmentation).

verbose

logical, whether to display information
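To make the l argument more concrete, the sketch below shows one plausible reading of the augmentation step described in An et al. (2018): each attribute word is extended with its l nearest neighbours in the embedding matrix, ranked by cosine similarity. The helper `augment_sketch()` and the toy matrix are hypothetical, and the package's actual implementation may differ in detail.

```r
# Toy embeddings with words as row names (made-up values).
toy_w <- rbind(
  he   = c( 1.0, 0.0),
  him  = c( 0.9, 0.1),
  she  = c(-1.0, 0.0),
  her  = c(-0.9, 0.1),
  math = c( 0.0, 1.0)
)

# Hypothetical helper, not part of sweater: return each seed word plus its
# l nearest neighbours in w by cosine similarity.
augment_sketch <- function(w, words, l = 0) {
  if (l == 0) return(words)
  w_norm <- w / sqrt(rowSums(w^2))  # scale every row to unit length
  out <- words
  for (word in words) {
    # Cosine similarity of every word to the seed word.
    sims <- as.vector(w_norm %*% w_norm[word, ])
    names(sims) <- rownames(w)
    sims <- sims[names(sims) != word]  # drop the seed word itself
    out <- c(out, names(sort(sims, decreasing = TRUE))[seq_len(l)])
  }
  unique(out)
}

augment_sketch(toy_w, "he", l = 1)   # "he" "him"
augment_sketch(toy_w, "she", l = 1)  # "she" "her"
```

With l = 0 the input words are returned unchanged, which matches the documented default of no augmentation.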

References

An, J., Kwak, H., & Ahn, Y. Y. (2018). SemAxis: A lightweight framework to characterize domain-specific word semantics beyond sentiment. arXiv preprint arXiv:1806.05521.

Examples

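library(sweater)
# Load the example embeddings shipped with the package
data(glove_math)
# Target words (S) and the two sets of attribute words (A, B)
S1 <- c("math", "algebra", "geometry", "calculus", "equations",
        "computation", "numbers", "addition")
A1 <- c("male", "man", "boy", "brother", "he", "him", "his", "son")
B1 <- c("female", "woman", "girl", "sister", "she", "her", "hers", "daughter")
# SemAxis score for each target word, without augmentation
semaxis(glove_math, S1, A1, B1, l = 0)$P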