
ohenery (version 0.1.1)

smlik: Softmax log likelihood under Harville and Henery Models.

Description

Compute the softmax log likelihood and gradient of the same.

Usage

harsmlik(g, idx, eta, wt = NULL, deleta = NULL)

hensmlik(g, idx, eta, gamma, wt = NULL, deleta = NULL)

Arguments

g

a vector giving the group indices.

idx

a vector of integers, the same length as g, which describes the reverse sort order on the observations, first by group, then by place within the group. That is, the element idx[0] is the index of the last place finisher in the group g[0]; then idx[1] is the index of the next-to-last place finisher in the same group (assuming g[1] equals g[0]), and so on. The idx is zero based.

eta

a vector of the log odds, \(\eta\). Must be the same length as g.

wt

an optional vector of non-negative elements, the same length as g, giving the observation level weights. We then compute a weighted log likelihood, where the weights are in each summand. The weights should probably have mean 1, but that's just, like, my opinion, man. Negative weights throw an error. Note that the weights for the last place in each group have no effect on the computation.

deleta

an optional matrix whose row count equals the number of elements of g, idx, and eta. The rows of deleta are the derivatives of eta with respect to some \(z\). This is used to then maximize likelihood with respect to \(z\).

gamma

a vector of the gamma parameters. It is assumed that the first element is \(\gamma_2\), while the last element is applied to all higher order tie breaks.

Value

The log likelihood. If deleta is given, we add an attribute to the scalar number, called gradient, giving the derivative of the log likelihood with respect to \(z\). For the Henery model we also include an attribute called gradgamma, which is the gradient of the log likelihood with respect to the gamma vector.
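
As an illustration, if eta is linear in some coefficient vector, say eta <- X %*% beta, then the rows of X are the derivatives of eta with respect to beta, so one may pass deleta=X and read the gradient off the returned value. A minimal sketch, where g, idx, X and beta are placeholders as in the Examples below:

eta <- X %*% beta
fitval <- harsmlik(g, idx, eta, deleta=X)
attr(fitval, 'gradient')   # derivative of the log likelihood with respect to beta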

Details

Given vectors \(g\), \(\eta\) and optionally the gradient of \(\eta\) with respect to some coefficients, computes the log likelihood under the softmax. The user must also provide a reverse ordering, sorted first by the groups, \(g\), and then, within a group, in increasing quality order. For a race, this means the indices run from the last place finisher to the first place finisher in that race, where each distinct group number corresponds to one race.
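
For example, if y holds the finishing place of each observation within its group (1 denoting the winner), one way to build such a zero based reverse ordering is the idiom used in the Examples below:

# y is assumed to give the place within each group, 1 for the winner
idx <- order(g, y, decreasing=TRUE) - 1L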

Under the Harville model, the log likelihood on a given group, where we are indexing in forward order, is $$\left(\eta_1 - \log \sum_{j \ge 1} \mu_j\right) + \left(\eta_2 - \log \sum_{j \ge 2} \mu_j\right) + ...$$ where \(\mu_i = \exp\left(\eta_i\right)\). By “forward order”, we mean that \(\eta_1\) corresponds to the participant taking first place within that group, \(\eta_2\) to the participant taking second place, and so on.
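
As a small worked sketch (not part of the package itself), consider a single group of three participants: the summand for the last place finisher, \(\eta_3 - \log \mu_3\), is zero, so only two terms contribute, and the by-hand value should agree with harsmlik:

# eta in forward order: eta[1] is the winner, eta[3] the last place finisher
eta <- c(1.0, 0.5, -0.2)
mu  <- exp(eta)
by_hand <- (eta[1] - log(sum(mu))) + (eta[2] - log(sum(mu[2:3])))
g   <- rep(1L, 3)
idx <- c(2L, 1L, 0L)    # last place to first place, zero based
c(by_hand, as.numeric(harsmlik(g, idx, eta)))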

The Henery model log likelihood takes the form $$\left(\eta_1 - \log \sum_{j \ge 1} \mu_j\right) + \left(\gamma_2 \eta_2 - \log \sum_{j \ge 2} \mu_j^{\gamma_2}\right) + ...$$ for gamma parameters, \(\gamma\). The Henery model reduces to the Harville model when all the gammas equal 1.
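
Since the Harville model is the all-ones special case, a quick sanity check (continuing the sketch above) is that hensmlik with every gamma equal to 1 should return the same value as harsmlik:

# all-ones gamma should recover the Harville log likelihood
c(as.numeric(harsmlik(g, idx, eta)),
  as.numeric(hensmlik(g, idx, eta, gamma=c(1, 1, 1))))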

Weights in weighted estimation apply to each summand. The weight for the last place participant in a group is irrelevant. The weighted log likelihood under the Harville model is $$w_1\left(\eta_1 - \log \sum_{j \ge 1} \mu_j\right) + w_2\left(\eta_2 - \log \sum_{j \ge 2} \mu_j\right) + ...$$ One should think of the weights as applying to the outcome, not the participant.
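
Continuing the same sketch, observation 3 is the last place finisher in its group, so changing its weight should leave the weighted log likelihood unchanged:

# the weight on the last place observation has no effect
c(as.numeric(harsmlik(g, idx, eta, wt=c(1, 1, 1))),
  as.numeric(harsmlik(g, idx, eta, wt=c(1, 1, 100))))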

References

Harville, D. A. "Assigning probabilities to the outcomes of multi-entry competitions." Journal of the American Statistical Association 68, no. 342 (1973): 312-316. http://dx.doi.org/10.1080/01621459.1973.10482425

Henery, R. J. "Permutation probabilities as models for horse races." Journal of the Royal Statistical Society: Series B (Methodological) 43, no. 1 (1981): 86-91. http://dx.doi.org/10.1111/j.2517-6161.1981.tb01153.x

Examples

library(ohenery)

# a garbage example
set.seed(12345)
g <- as.integer(sort(ceiling(20 * runif(100))))
idx <- as.integer(rev(1:length(g)) - 1L)
eta <- rnorm(length(g))
foo <- harsmlik(g=g,idx=idx,eta=eta,deleta=NULL)

# an example with a real idx
nfeat <- 5
set.seed(1234)
g <- ceiling(seq(0.1,1000,by=0.1))
X <- matrix(rnorm(length(g) * nfeat),ncol=nfeat)
beta <- rnorm(nfeat)
eta <- X %*% beta
y <- rsm(eta,g)

idx <- order(g,y,decreasing=TRUE) - 1
foores <- harsmlik(g,idx,eta,deleta=X)

# now reweight them
wt <- runif(length(g))
wt <- wt / mean(wt)   # mean 1 is recommended
foores <- harsmlik(g,idx,eta,wt=wt)

# try hensmlik 
foores <- hensmlik(g,idx,eta,gamma=c(0.9,0.8,1),wt=wt)

# check the value of the gradient by numerical approximation
nfeat <- 8
set.seed(321)
g <- ceiling(seq(0.1,1000,by=0.1))
X <- matrix(rnorm(length(g) * nfeat),ncol=nfeat)
beta <- rnorm(nfeat)
eta <- X %*% beta
y <- rsm(eta,g)

idx <- order(g,y,decreasing=TRUE) - 1
if (require(numDeriv)) {
  fastval <- attr(harsmlik(g,idx,eta,deleta=X),'gradient')
  numap <- grad(function(beta,g,idx,X) {
      eta <- X %*% beta
      as.numeric(harsmlik(g,idx,eta))
    },
    x=beta,g=g,idx=idx,X=X)
  rbind(fastval,numap)
}
