Pointwise mutual information (PMI) is calculated as follows (see
Manning & Schütze 1999):
$$I(x,y) = \log_2\frac{p(x,y)}{p(x)\,p(y)}$$
The formula is based on maximum likelihood estimates: Given the number
of observations for token x, \(o_{x}\), the number of observations
for token y, \(o_{y}\), and the corpus size \(N\), the
probabilities for the tokens x and y, and for the co-occurrence of x and y,
are as follows:
$$p(x) = \frac{o_{x}}{N}$$
$$p(y) = \frac{o_{y}}{N}$$
The term p(x,y) is the relative frequency of observed co-occurrences of x and y:
$$p(x,y) = \frac{o_{x,y}}{N}$$
where \(o_{x,y}\) is the number of observed co-occurrences of x and y.
Note that this computation uses log base 2, not the natural logarithm found
in some other presentations (e.g. https://en.wikipedia.org/wiki/Pointwise_mutual_information).
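The definitions above can be sketched in a few lines of Python. The function name `pmi` and the example counts are hypothetical, chosen only to illustrate the maximum likelihood estimates and the base-2 logarithm:

```python
import math

def pmi(o_x, o_y, o_xy, n):
    """PMI from raw counts, using maximum likelihood estimates
    and log base 2 (illustrative sketch, not from the text)."""
    p_x = o_x / n     # p(x) = o_x / N
    p_y = o_y / n     # p(y) = o_y / N
    p_xy = o_xy / n   # p(x,y) = o_{x,y} / N
    return math.log2(p_xy / (p_x * p_y))

# Hypothetical example: in a corpus of 1,000,000 tokens, x occurs
# 100 times, y occurs 200 times, and they co-occur 20 times.
print(pmi(100, 200, 20, 1_000_000))  # log2(1000) ≈ 9.97
```

A positive value indicates that x and y co-occur more often than expected under independence; zero indicates independence.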