factorizations: Factorizations of the different PIN likelihood functions

Description

The PIN likelihood function is derived from the original PIN model as developed by Easley and O'Hara (1992) and Easley et al. (1996). Maximizing the likelihood function in its original form leads to computational problems, in particular floating-point errors. To remedy this issue, several log-transformations or factorizations of the different PIN likelihood functions have been suggested. The main factorizations in the literature are:

fact_pin_eho(): factorization of Easley et al. (2010)
fact_pin_lk(): factorization of Lin and Ke (2011)
fact_pin_e(): factorization of Ersan (2016)
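
To illustrate the floating-point problem that motivates these factorizations, here is a minimal base-R sketch. It assumes the standard three-state PIN mixture of Poisson distributions (no event, bad news, good news) and uses a generic log-sum-exp rewrite. It is not the exact algebraic form of any of the cited factorizations, and the function names pin_day_naive() and pin_day_log() are hypothetical.

```r
# Likelihood of a single trading day with B buys and S sells, computed
# directly as a mixture of Poisson densities (assumed standard PIN model).
pin_day_naive <- function(B, S, alpha, delta, mu, eb, es) {
  (1 - alpha) * dpois(B, eb) * dpois(S, es) +
    alpha * delta * dpois(B, eb) * dpois(S, es + mu) +
    alpha * (1 - delta) * dpois(B, eb + mu) * dpois(S, es)
}

# Same quantity on the log scale: each mixture term is kept as a log,
# and the terms are combined with the log-sum-exp trick.
pin_day_log <- function(B, S, alpha, delta, mu, eb, es) {
  terms <- c(
    log(1 - alpha) +
      dpois(B, eb, log = TRUE) + dpois(S, es, log = TRUE),
    log(alpha * delta) +
      dpois(B, eb, log = TRUE) + dpois(S, es + mu, log = TRUE),
    log(alpha * (1 - delta)) +
      dpois(B, eb + mu, log = TRUE) + dpois(S, es, log = TRUE)
  )
  m <- max(terms)
  m + log(sum(exp(terms - m)))
}

# For a day with very large trade counts, the direct product underflows
# to zero, so its logarithm is -Inf, while the log-scale version stays finite.
naive_value <- pin_day_naive(5000, 4000, 0.4, 0.1, 800, 300, 200)
log_value <- pin_day_log(5000, 4000, 0.4, 0.1, 800, 300, 200)
print(log(naive_value))
print(is.finite(log_value))
```

For moderate trade counts the two versions agree, so the rewrite changes only the numerical behavior, not the value of the likelihood.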

For the multilayer PIN (MPIN) model, the likelihood function is factorized as developed in Ersan (2016):

fact_mpin(): factorization of Ersan (2016)

The factorization of the likelihood function of the adjusted PIN model of Duarte and Young (2009) is derived and presented in Ersan and Ghachem (2022b):

fact_adjpin(): factorization of Ersan and Ghachem (2022b)

Usage

fact_pin_eho(data, parameters = NULL)
fact_pin_lk(data, parameters = NULL)
fact_pin_e(data, parameters = NULL)
fact_mpin(data, parameters = NULL)
fact_adjpin(data, parameters = NULL)

Value

If the argument parameters is omitted, returns a function object that can be used with the optimization functions optim() and neldermead().

If the argument parameters is provided, returns a numeric value of the log-likelihood function evaluated at the dataset data and the parameter vector parameters. The vector parameters is numeric and must follow this order: (\(\alpha\), \(\delta\), \(\mu\), \(\epsilon_b\), \(\epsilon_s\)) for the factorizations of the PIN likelihood function; (\(\alpha\), \(\delta\), \(\mu\), \(\epsilon_b\), \(\epsilon_s\)), where \(\alpha\), \(\delta\), and \(\mu\) are vectors of size J, for the factorization of the MPIN likelihood function; and (\(\alpha\), \(\delta\), \(\theta\), \(\theta'\), \(\epsilon_b\), \(\epsilon_s\), \(\mu_b\), \(\mu_s\), \(\Delta_b\), \(\Delta_s\)) for the factorization of the AdjPIN likelihood function.
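
The two return modes described above can be sketched with a toy model in base R. This is only an illustration of the calling pattern, not the package's implementation: the function toy_fact() is hypothetical, the model is a plain pair of Poisson rates rather than a PIN model, and the sign convention (the returned function gives the negative log-likelihood so that optim(), which minimizes by default, maximizes the likelihood) is an assumption.

```r
# Hypothetical factory mimicking the 'parameters = NULL' pattern.
# Toy model: buys ~ Poisson(eb), sells ~ Poisson(es).
toy_fact <- function(data, parameters = NULL) {
  buys <- data[[1]]
  sells <- data[[2]]

  # Objective in the form expected by optim(): negative log-likelihood
  # (assumed sign convention for this sketch).
  objective <- function(params) {
    -sum(dpois(buys, params[1], log = TRUE) +
           dpois(sells, params[2], log = TRUE))
  }

  # No parameters: hand back the function object for an optimizer.
  if (is.null(parameters)) {
    return(objective)
  }

  # Parameters given: evaluate the log-likelihood at that point.
  -objective(parameters)
}

trades <- data.frame(
  b = c(290, 310, 305, 295, 300),
  s = c(190, 210, 205, 195, 200)
)

# Mode 1: pass the returned function to optim().
fit <- optim(c(250, 250), toy_fact(trades))
print(round(fit$par, 1))

# Mode 2: evaluate the log-likelihood at a given point.
print(toy_fact(trades, c(300, 200)))
```

In this toy model the maximum-likelihood rates are the column means, which gives a simple check on the optimizer output.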

Arguments

data: A dataframe with 2 variables: the first corresponds to buyer-initiated trades (buys), and the second corresponds to seller-initiated trades (sells).
parameters: An ordered numeric vector whose layout depends on the model. For the PIN likelihood factorizations, it is (\(\alpha\), \(\delta\), \(\mu\), \(\epsilon_b\), \(\epsilon_s\)). For the MPIN likelihood factorization, it is (\(\alpha\), \(\delta\), \(\mu\), \(\epsilon_b\), \(\epsilon_s\)), where \(\alpha\), \(\delta\), and \(\mu\) are numeric vectors of size J, and J is the number of information layers in the data. For the AdjPIN likelihood factorization, it is (\(\alpha\), \(\delta\), \(\theta\), \(\theta'\), \(\epsilon_b\), \(\epsilon_s\), \(\mu_b\), \(\mu_s\), \(\Delta_b\), \(\Delta_s\)). The default value is NULL.
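
As an illustration of the MPIN layout, the following base-R sketch splits a flat parameter vector into its components, inferring the number of layers J from the vector length (3J + 2 entries). The helper split_mpin_params() is hypothetical and is not part of the package.

```r
# Hypothetical helper: split a flat MPIN parameter vector of length
# 3 * J + 2 into (alpha, delta, mu, eps.b, eps.s).
split_mpin_params <- function(parameters) {
  J <- (length(parameters) - 2) / 3
  stopifnot(J >= 1, J == round(J))
  list(
    alpha = parameters[1:J],
    delta = parameters[(J + 1):(2 * J)],
    mu    = parameters[(2 * J + 1):(3 * J)],
    eps.b = parameters[3 * J + 1],
    eps.s = parameters[3 * J + 2]
  )
}

# Two information layers: alpha = (0.4, 0.5), delta = (0.1, 0.6),
# mu = (600, 1000), eps.b = 300, eps.s = 200.
p <- split_mpin_params(c(0.4, 0.5, 0.1, 0.6, 600, 1000, 300, 200))
str(p)
```

With J = 1 the same layout reduces to the five-element PIN vector.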

Details

The argument 'data' should be a numeric dataframe and contain at least two variables. Only the first two variables will be considered: the first variable is assumed to correspond to the total number of buyer-initiated trades, while the second variable is assumed to correspond to the total number of seller-initiated trades. Each row or observation corresponds to a trading day. NA values will be ignored.
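
A dataset can be brought into this shape before calling the factorization functions. The base-R sketch below is one possible pre-cleaning step, not the package's internal logic: the helper prepare_trades() is hypothetical, and dropping every row that contains an NA is an assumption about what "ignored" means.

```r
# Hypothetical helper: keep the first two columns (buys, sells), check
# that they are numeric, and drop rows with missing values.
prepare_trades <- function(data) {
  data <- as.data.frame(data)
  stopifnot(ncol(data) >= 2)
  trades <- data[, 1:2]
  names(trades) <- c("b", "s")
  stopifnot(all(vapply(trades, is.numeric, logical(1))))
  trades[stats::complete.cases(trades), , drop = FALSE]
}

raw <- data.frame(
  B = c(310, NA, 295, 300),
  S = c(190, 205, NA, 210),
  volume = c(1, 2, 3, 4)   # extra column, not used
)
clean <- prepare_trades(raw)
print(clean)
```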

Our tests, in line with Lin and Ke (2011) and Ersan and Alici (2016), demonstrate very similar results for fact_pin_lk() and fact_pin_e(), both of which yield substantially better estimates than fact_pin_eho().

Examples

# There is a preloaded quarterly dataset called 'dailytrades' with 60
# observations. Each observation corresponds to a day and contains the
# total number of buyer-initiated trades ('B') and seller-initiated
# trades ('S') on that day. To know more, type ?dailytrades

xdata <- dailytrades

# ------------------------------------------------------------------------ #
# Using fact_pin_eho(), fact_pin_lk(), fact_pin_e() to find the likelihood #
# value as factorized by Easley (2010), Lin & Ke (2011), and Ersan (2016). #
# ------------------------------------------------------------------------ #

# Choose a parameter set at which to evaluate the likelihood function:
# givenpoint = (alpha, delta, mu, eps.b, eps.s)

givenpoint <- c(0.4, 0.1, 800, 300, 200)

# Use the output of fact_pin_e() with the optimization function optim() to
# find optimal estimates of the PIN model.

model <- suppressWarnings(optim(givenpoint, fact_pin_e(xdata)))

# Collect the model estimates from the variable model and display them.

varnames <- c("alpha", "delta", "mu", "eps.b", "eps.s")
estimates <- setNames(model$par, varnames)
show(estimates)

# Find the value of the log-likelihood function at givenpoint

lklValue <- fact_pin_lk(xdata, givenpoint)

show(lklValue)

# ------------------------------------------------------------------------ #
# Using fact_mpin() to find the value of the MPIN likelihood function as   #
# factorized by Ersan (2016).                                              #
# ------------------------------------------------------------------------ #

# Choose a given parameter set to evaluate the likelihood function at a
# givenpoint  = (alpha(), delta(), mu(), eps.b, eps.s) where alpha(), delta()
# and mu() are vectors of size 2.

givenpoint <- c(0.4, 0.5, 0.1, 0.6, 600, 1000, 300, 200)

# Use the output of fact_mpin() with the optimization function optim() to
# find optimal estimates of the MPIN model.

model <- suppressWarnings(optim(givenpoint, fact_mpin(xdata)))

# Collect the model estimates from the variable model and display them.

varnames <- c(paste("alpha", 1:2, sep = ""), paste("delta", 1:2, sep = ""),
              paste("mu", 1:2, sep = ""), "eb", "es")
estimates <- setNames(model$par, varnames)
show(estimates)

# Find the value of the MPIN likelihood function at givenpoint

lklValue <- fact_mpin(xdata, givenpoint)

show(lklValue)

# ------------------------------------------------------------------------ #
# Using fact_adjpin() to find the value of the DY likelihood function as   #
# factorized by Ersan and Ghachem (2022b).                                 #
# ------------------------------------------------------------------------ #

# Choose a parameter set at which to evaluate the likelihood function:
# givenpoint = (alpha, delta, theta, theta', eps.b, eps.s,
# muB, muS, db, ds)

givenpoint <- c(0.4, 0.1, 0.3, 0.7, 500, 600, 800, 1000, 300, 200)

# Use the output of fact_adjpin() with the optimization function
# neldermead() to find optimal estimates of the AdjPIN model.

low <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
up <- c(1, 1, 1, 1, Inf, Inf, Inf, Inf, Inf, Inf)
model <- nloptr::neldermead(
  givenpoint, fact_adjpin(xdata), lower = low, upper = up)

# Collect the model estimates from the variable model and display them.

varnames <- c("alpha", "delta", "theta", "thetap", "eps.b", "eps.s",
              "muB", "muS", "db", "ds")
estimates <- setNames(model$par, varnames)
show(estimates)

# Find the value of the log-likelihood function at givenpoint

adjlklValue <- fact_adjpin(xdata, givenpoint)
show(adjlklValue)