An “experts only” softmax fitting function for the Harville model.
harsmfit(
y,
g,
X,
wt = NULL,
eta0 = NULL,
beta0 = NULL,
normalize_wt = FALSE,
method = c("BFGS", "NR", "CG", "NM")
)
An object of class harsm, maxLik, and linodds.
y: a vector of the ranked outcomes within each group. Only the order within a group matters.
g: a vector giving the group indices. Need not be integers, but
that is more efficient. Need not be sorted.
Must be the same length as y.
X: a matrix of the independent variables. Must have as many rows
as the length of y.
wt: an optional vector of the observation-level weights. These must
be non-negative, otherwise an error is thrown. Note that the weight of
the last ranked outcome within a group is essentially ignored.
Must be the same length as y.
eta0: an optional vector of the consensus odds. These are added to
the fit odds in odds space before the likelihood calculation. If given,
then when the model is used to predict, similar consensus odds must be
given.
Must be the same length as y.
beta0: an optional vector of the initial estimate of beta for
‘warm start’ of the estimation procedure.
Must be the same length as the number of columns in X.
Should only affect the speed of the computation, not the results.
Defaults to all zeroes.
normalize_wt: if TRUE, we renormalize wt, if given,
to have mean value 1. Note that the default value has changed
since version 0.1.0 of this package. Moreover, non-normalized
weights can lead to incorrect inference. Use with caution.
method: maximisation method, currently either
"NR" (for Newton-Raphson),
"BFGS" (for Broyden-Fletcher-Goldfarb-Shanno),
"BFGSR" (for the BFGS algorithm implemented in R),
"BHHH" (for Berndt-Hall-Hall-Hausman),
"SANN" (for Simulated ANNealing),
"CG" (for Conjugate Gradients),
or "NM" (for Nelder-Mead).
Lower-case letters (such as "nr" for Newton-Raphson) are allowed.
The default method is "NR" for unconstrained problems, and "NM" or
"BFGS" for constrained problems, depending on whether the grad
argument was provided. "BHHH" is a good alternative given the
likelihood is returned observation-wise (see maxBHHH).
Note that stochastic gradient ascent (SGA) is currently not supported, as this method seems to be rarely used for maximum likelihood estimation.
Steven E. Pav shabbychef@gmail.com
Given a number of events, indexed by group, and a vector \(y\) of the ranks of each entry within that group, perform maximum likelihood estimation under the softmax and proportional probability model.
The user can optionally supply a vector of \(\eta_0\), which are taken as the fixed, or ‘consensus’ odds. The estimation is then conditional on these fixed odds.
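As a rough illustration of how fixed consensus odds could enter the model, the base R sketch below adds a hypothetical eta0 vector to the fitted linear predictor for a single group and converts the sum to first-place probabilities with a softmax. It assumes that "odds space" means the additive scale of the linear predictor. The toy values and the softmax helper are illustrative only and are not part of the package.

```r
# Toy data for a single group of four entries (all values are illustrative).
X    <- matrix(c(0.5, -1.0,  0.2, 1.5,
                 1.0,  0.3, -0.7, 0.0), ncol = 2)
beta <- c(0.8, -0.4)             # hypothetical coefficients
eta0 <- c(0.0, 0.5, -0.5, 1.0)   # hypothetical consensus odds

# Assumption: the consensus odds act as an additive offset on the
# linear predictor before the softmax is applied.
eta <- eta0 + as.numeric(X %*% beta)

# Softmax over the group: probability that each entry takes first place.
softmax <- function(eta) {
  z <- exp(eta - max(eta))       # subtract the max for numerical stability
  z / sum(z)
}
p_first <- softmax(eta)
```

Under this reading, only beta is estimated and eta0 stays fixed, which is why matching consensus odds would be needed again at prediction time.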
Weighted estimation is supported.
The code relies on the likelihood function of harsmlik,
and MLE code from maxLik.
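To make the weighted likelihood concrete, here is a minimal base R sketch of a Harville log-likelihood for one group: the entry in each place is drawn by softmax from the entries not yet placed. It is not the implementation used by harsmlik. The function name harville_loglik is hypothetical, and the sketch assumes no ties and that lower ranks are better.

```r
# Weighted Harville log-likelihood for a single group (illustrative sketch).
# eta: linear predictor per entry; y: rank per entry (1 = best); wt: weights.
harville_loglik <- function(eta, y, wt = rep(1, length(eta))) {
  ord <- order(y)              # sort entries from first place to last
  eta <- eta[ord]
  wt  <- wt[ord]
  n   <- length(eta)
  ll  <- 0
  for (k in seq_len(n)) {
    remaining <- eta[k:n]      # entries not yet placed
    m <- max(remaining)        # log-sum-exp shift for numerical stability
    log_p <- eta[k] - (m + log(sum(exp(remaining - m))))
    ll <- ll + wt[k] * log_p
  }
  ll
}

eta  <- c(0.2, 1.1, -0.4)
y    <- c(2, 1, 3)
ll_a <- harville_loglik(eta, y, wt = c(1, 1, 1))
ll_b <- harville_loglik(eta, y, wt = c(1, 1, 100))  # only the last-place weight differs
```

The last-placed entry is chosen from a set of one, so its log-probability is zero and its weight drops out of the sum. That is consistent with the note that the weight of the last ranked outcome within a group is essentially ignored.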
Harville, D. A. "Assigning probabilities to the outcomes of multi-entry competitions." Journal of the American Statistical Association 68, no. 342 (1973): 312-316. doi:10.1080/01621459.1973.10482425
the likelihood function, harsmlik, and the
expected rank function (the inverse link), erank.
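library(ohenery)

# simulate roughly 1000 groups of 10 entries each, with 5 features
nfeat <- 5
set.seed(1234)
g <- ceiling(seq(0.1, 1000, by = 0.1))
X <- matrix(rnorm(length(g) * nfeat), ncol = nfeat)
beta <- rnorm(nfeat)
eta <- X %*% beta
# draw ranked outcomes under the softmax model
y <- rsm(eta, g)

# unweighted fit
mod0 <- harsmfit(y = y, g = g, X = X)
summary(mod0)

# now upweight finishers 1-5
modw <- harsmfit(y = y, g = g, X = X, wt = 1 + as.numeric(y < 6))
summary(modw)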
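The weighted fit above passes raw weights. With normalize_wt = TRUE the weights are described as being rescaled to have mean value 1. A base R sketch of that rescaling follows, on the assumption that it amounts to dividing by the mean. The toy y vector is illustrative, and the package may handle edge cases differently.

```r
y  <- c(1, 2, 3, 4, 5, 6, 7, 8)   # toy ranks for one group of eight
wt <- 1 + as.numeric(y < 6)       # same weighting rule as the example above
wt_norm <- wt / mean(wt)          # rescale so the weights average to one
```

Rescaling leaves the relative weighting of the top five finishers unchanged. Only the overall scale of the weighted log-likelihood changes, and that scale is what affects the reported standard errors.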