
LSAfun (version 0.5.3)

compose: Two-Word Composition

Description

Computes the vector of a complex expression p consisting of two single words u and v, following the methods examined in Mitchell & Lapata (2008) (see Details).

Usage

## Default 
compose(x,y,method="Add", a=1,b=1,c=1,m,k,lambda=2,
      tvectors=tvectors,breakdown=FALSE, norm="none")

Arguments

x

a single word (character vector with length(x) = 1)

y

a single word (character vector with length(y) = 1)

a,b,c

weighting parameters, see Details

m

number of nearest words to the Predicate that are initially activated (see Predication)

k

size of the k-neighborhood; k ≤ m (see Predication)

lambda

dilation parameter for method = "Dilation"

method

the composition method to be used (see Details)

norm

whether to normalize the single word vectors before applying a composition function. Setting norm = "none" performs no normalization; setting norm = "all" normalizes every word vector involved. Setting norm = "block" is only valid for the Predication method.

tvectors

the semantic space in which the computation is to be done (a numeric matrix where every row is a word vector)

breakdown

if TRUE, the function breakdown is applied to the input
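As a rough illustration of the tvectors and norm arguments, the sketch below builds a toy semantic space in base R and compares additive composition with and without prior normalization. The matrix values, the name toy_space, and the helper unit_length are made up for illustration. The sketch also assumes that words are looked up by row name and that normalizing means scaling to unit Euclidean length; neither detail is stated on this page.

```r
# Toy semantic space: every row is a word vector (made-up values).
# Assumption: the words are stored as row names of the matrix.
toy_space <- matrix(
  c(1, 0, 2, 2,
    0, 3, 0, 4,
    1, 1, 1, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("mad", "hatter", "tea"), NULL)
)

# Hypothetical helper (not part of LSAfun): scale a vector to unit length.
# Assumption: this is what "normalize" means for norm = "all".
unit_length <- function(v) v / sqrt(sum(v^2))

u <- toy_space["mad", ]     # word vector for x
v <- toy_space["hatter", ]  # word vector for y

# Additive composition on the raw vectors (analogous to norm = "none")
p_raw <- u + v

# Additive composition on unit-length vectors (analogous to norm = "all")
p_norm <- unit_length(u) + unit_length(v)

print(p_raw)
print(p_norm)
```

Without normalization, the longer vector (here "hatter") contributes more to the sum; after normalization both words contribute with equal length.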

Value

The phrase vector as a numeric vector

Details

Let p be the vector with entries $p_i$ for the two-word phrase consisting of u with entries $u_i$ and v with entries $v_i$. The different composition methods as described by Mitchell & Lapata (2008, 2010) are as follows:

  • Additive Model (method = "Add") $p_i = u_i + v_i$

  • Weighted Additive Model (method = "WeightAdd") $p_i = a u_i + b v_i$

  • Multiplicative Model (method = "Multiply") $p_i = u_i v_i$

  • Combined Model (method = "Combined") $p_i = a u_i + b v_i + c u_i v_i$

  • Predication (method = "Predication") (see Predication)

    If method="Predication" is used, x will be taken as Predicate and y will be taken as Argument of the phrase (see Examples)

  • Circular Convolution (method = "CConv") $p_i = \sum_j u_j v_{i-j}$, where the subscripts of v are interpreted modulo n, with n being the length of the word vector u (which equals the length of v)

  • Dilation (method = "Dilation") $p = (u \cdot u)\,v + (\lambda - 1)(u \cdot v)\,u$, with $(u \cdot u)$ being the dot product of u and u (and $(u \cdot v)$ being the dot product of u and v).
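The closed-form methods listed above can be sketched in base R on two short toy vectors. This is only an illustration of the arithmetic, not the LSAfun implementation: the vectors, the weights, and the helper circular_convolution are made up, and the circular convolution assumes 0-based subscripts in the formula, which are shifted by one for R's 1-based indexing.

```r
# Toy word vectors and parameters (made-up values)
u <- c(1, 2, 3)
v <- c(4, 5, 6)
a <- 1
b <- 2
c_weight <- 3   # plays the role of the parameter c
lambda <- 2

p_add      <- u + v                               # Add
p_weighted <- a * u + b * v                       # WeightAdd
p_multiply <- u * v                               # Multiply
p_combined <- a * u + b * v + c_weight * u * v    # Combined

# Dilation: (u . u) v + (lambda - 1) (u . v) u
p_dilation <- sum(u * u) * v + (lambda - 1) * sum(u * v) * u

# Hypothetical helper: p_i = sum_j u_j * v_{(i - j) mod n}
# Assumption: subscripts in the formula are 0-based.
circular_convolution <- function(u, v) {
  n <- length(u)
  stopifnot(length(v) == n)
  sapply(seq_len(n), function(i) {
    j <- seq_len(n)
    idx <- ((i - j) %% n) + 1   # wrap around, then map to 1-based index
    sum(u[j] * v[idx])
  })
}
p_cconv <- circular_convolution(u, v)

print(p_combined)   # 21 42 69
print(p_dilation)   # 88 134 180
print(p_cconv)      # 31 31 28
```

Predication is left out because it depends on the neighborhood of the Predicate in the semantic space (see Predication).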

The Add, Multiply, and CConv methods are symmetrical composition methods, i.e. compose(x="word1",y="word2") will give the same results as compose(x="word2",y="word1"). On the other hand, WeightAdd, Combined, Predication and Dilation are asymmetrical, i.e. compose(x="word1",y="word2") will in general give different results than compose(x="word2",y="word1"). For WeightAdd and Combined, this holds only when a ≠ b.
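The symmetry claims can be checked numerically on toy vectors. The sketch below uses hypothetical stand-in functions written in base R (none of them are LSAfun functions) and swaps the two arguments of each. The circular convolution is computed via fft, which assumes the same 0-based subscript convention as the formula, and the weighted sum uses unequal weights a = 1, b = 2.

```r
u <- c(1, 2, 3)
v <- c(4, 5, 6)

# Hypothetical stand-ins for the composition rules (not LSAfun functions)
add_vec      <- function(x, y) x + y
multiply_vec <- function(x, y) x * y
cconv_vec    <- function(x, y) {
  # Circular convolution via the discrete Fourier transform
  Re(fft(fft(x) * fft(y), inverse = TRUE)) / length(x)
}
weight_add   <- function(x, y, a = 1, b = 2) a * x + b * y
dilate_vec   <- function(x, y, lambda = 2) {
  sum(x * x) * y + (lambda - 1) * sum(x * y) * x
}

# TRUE if swapping the two arguments leaves the result unchanged
same_when_swapped <- function(f) isTRUE(all.equal(f(u, v), f(v, u)))

symmetry <- c(
  Add       = same_when_swapped(add_vec),
  Multiply  = same_when_swapped(multiply_vec),
  CConv     = same_when_swapped(cconv_vec),
  WeightAdd = same_when_swapped(weight_add),
  Dilation  = same_when_swapped(dilate_vec)
)
print(symmetry)
```

With these inputs the first three entries come out TRUE and the last two FALSE. Setting a equal to b in weight_add would make it symmetric as well.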

References

Kintsch, W. (2001). Predication. Cognitive Science, 25, 173-202.

Mitchell, J., & Lapata, M. (2008). Vector-based Models of Semantic Composition. In Proceedings of ACL-08: HLT (pp. 236-244). Columbus, Ohio.

Mitchell, J., & Lapata, M. (2010). Composition in Distributional Models of Semantics. Cognitive Science, 34, 1388-1429.

See Also

Predication

Examples

# NOT RUN {
library(LSAfun)   # provides compose() and the wonderland data set
data(wonderland)

compose(x="mad",y="hatter",method="Add",tvectors=wonderland)

compose(x="mad",y="hatter",method="Combined",a=1,b=2,c=3,
tvectors=wonderland)

compose(x="mad",y="hatter",method="Predication",m=20,k=3,
tvectors=wonderland)

compose(x="mad",y="hatter",method="Dilation",lambda=3,
tvectors=wonderland)
# }
