Given a time series and a collection of possible factors, finds a subset of the factors that provides the best fit to the given time series using the partially cointegrated model.
hedge.pci(Y, X,
  maxfact = 10,
  lambda = 0,
  use.multicore = TRUE,
  minimum.stepsize = 0,
  verbose = TRUE,
  exclude.cols = c(),
  search_type = c("lasso", "full", "limited"),
  pci_opt_method = c("jp", "twostep"),
  ...)
An N x 1 column vector or data.frame, representing the series that is to be hedged.
An N x L data.frame, where each column represents a possible factor to
be used in a partially cointegrated fit.
The maximum number of columns from X that will be selected for modeling Y.
Default: 10
A penalty to be applied to the random walk
portion of the partialAR model. A positive value for lambda
will drive the model towards a solution with a smaller random walk component.
Default: 0
If TRUE, parallel processing will be used to improve performance.
See parallel::mclapply.
Default: TRUE
If this is non-NA, the search stops when no candidate improves the fit by at least this amount. Default: 0
If TRUE, then detailed information is printed about the execution.
Default: TRUE
A list of column indexes specifying columns from X which
should be excluded from consideration. Alternatively, the
excluded columns may be given as a list of strings, in which case
they are interpreted as column names.
Default: c()
If "lasso", then the lasso algorithm (see glmnet) is
used to identify the factors that provide the best linear fit to
the target sequence.
If "full", then a greedy algorithm is used to search for
factors to be used in the hedge. At each step, all possible additions
to the portfolio are considered, and the best one is chosen for inclusion.
If "limited", then at each iteration, a preliminary screening step is performed
to identify the securities with the highest correlations to the residuals
of the currently selected portfolio. The top securities from this list are
then checked for whether they would improve the portfolio, and the best one
included.
Other parameters to be passed on to the search function. See the source code.
Returns an S3 object of class pci.hedge containing the following fields:
The best partially cointegrated fit that was found
The indexes of the columns from X that were selected
The names of the columns from X that were selected
The hedge is constructed by searching for column indices i1, i2, ..., iN
from among the columns of X which yield the best fit to the partially
cointegrated model:
$$Y_t = \beta_1 X_{t,i1} + \beta_2 X_{t,i2} + \cdots + \beta_N X_{t,iN} + M_t + R_t$$
$$M_t = \rho M_{t-1} + \epsilon_{M,t}$$
$$R_t = R_{t-1} + \epsilon_{R,t}$$
$$-1 < \rho < 1$$
$$\epsilon_{M,t} \sim N(0,\sigma_M^2)$$
$$\epsilon_{R,t} \sim N(0,\sigma_R^2)$$
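To make the model concrete, the following base R sketch simulates one series from these equations. The hedge ratios, the noise scales, and the choice to generate the factor columns as independent random walks are illustrative assumptions, not values taken from the package.

```r
set.seed(1)
n       <- 500
beta    <- c(2, 3)        # hypothetical hedge ratios beta_1, beta_2
rho     <- 0.9            # must satisfy -1 < rho < 1
sigma_M <- 1
sigma_R <- 1

# Two factor series; simulating them as random walks is an assumption made here
X <- apply(matrix(rnorm(n * length(beta)), ncol = length(beta)), 2, cumsum)

# Mean-reverting component: M_t = rho * M_{t-1} + eps_M[t], started from zero
eps_M <- rnorm(n, sd = sigma_M)
M <- as.numeric(stats::filter(eps_M, rho, method = "recursive"))

# Random walk component: R_t = R_{t-1} + eps_R[t]
eps_R <- rnorm(n, sd = sigma_R)
R <- cumsum(eps_R)

# Target series: linear combination of the factors plus both residual components
Y <- as.numeric(X %*% beta) + M + R
```

Setting sigma_R to 0 in this sketch removes the random walk component, which leaves a series whose residual is purely mean-reverting.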
If search_type="lasso" is specified, then the lasso algorithm
(see glmnet) is used to search for the factors that give
the best linear fit to the target sequence Y. Once the list of factors
has been determined, the cutoff point is chosen based on successive
improvements to the likelihood score of the fitted model.
Otherwise, a greedy algorithm (search_type="full") or a modified greedy algorithm
(search_type="limited") is used. This proceeds by searching through all
columns of X (except those listed in exclude.cols) to find the
column that gives the best fit to Y, as determined by
the likelihood score of the partially cointegrated model. This column becomes the initial
hedging portfolio. Having selected columns i1, i2, ..., iK, the next
column is found by searching through all remaining columns of X (except those
listed in exclude.cols) for the column which gives the best improvement
to the partially cointegrated fit. However, if the best improvement is less than
minimum.stepsize, or if maxfact columns have already been added,
then the search terminates.
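The greedy search described above can be sketched in base R as follows. This is not the package's implementation: the function name greedy_select is hypothetical, exclude.cols is handled as indexes only, and the default score is the Gaussian log-likelihood of an ordinary least squares fit, used here as a stand-in for the likelihood score of the partially cointegrated model.

```r
# Greedy forward selection: add one column at a time, always the best-scoring one.
# `score` is a stand-in (OLS log-likelihood), NOT the partially cointegrated likelihood.
greedy_select <- function(Y, X, maxfact = 10, minimum.stepsize = 0,
                          exclude.cols = integer(0),
                          score = function(y, x) as.numeric(logLik(lm(y ~ x)))) {
  X <- as.matrix(X)
  candidates <- setdiff(seq_len(ncol(X)), exclude.cols)
  selected <- integer(0)
  best <- -Inf
  while (length(selected) < maxfact && length(candidates) > 0) {
    # Score every remaining column when added to the current selection
    scores <- vapply(candidates, function(j)
      score(Y, X[, c(selected, j), drop = FALSE]), numeric(1))
    k <- which.max(scores)
    # Stop when the best improvement falls below the minimum step size
    if (!is.na(minimum.stepsize) && scores[k] - best < minimum.stepsize) break
    best <- scores[k]
    selected <- c(selected, candidates[k])
    candidates <- candidates[-k]
  }
  selected
}

# Toy data: only columns 1 and 3 actually drive Y
set.seed(42)
X <- matrix(rnorm(300 * 6), ncol = 6)
Y <- 2 * X[, 1] + 3 * X[, 3] + rnorm(300)
sel <- greedy_select(Y, X, maxfact = 2)
```

Because the first column is always accepted, minimum.stepsize only affects the second and later additions in this sketch.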
In the case of the modified greedy algorithm (search_type="limited"), a
preprocessing step is used at the beginning of each iteration. In this preprocessing
step, the correlation is computed between each unused column of X and the
residual series of the currently computed best fit. The top B choices are then
considered for inclusion in the portfolio, where B is a branching factor.
The branching factor can be controlled by setting the value of the optional parameter
max.branch. Its default value is 10.
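The screening step can be illustrated with a short base R sketch. The helper name screen_candidates is hypothetical, and ranking by absolute correlation is an assumption made here; the text above only says that the highest correlations are used.

```r
# Return the `max.branch` unused columns most correlated with the current residuals.
# Ranking by absolute correlation is an assumption of this sketch.
screen_candidates <- function(resid, X, unused, max.branch = 10) {
  X <- as.matrix(X)
  cors <- abs(cor(resid, X[, unused, drop = FALSE]))[1, ]
  head(unused[order(cors, decreasing = TRUE)], max.branch)
}

# Toy data: the residual series is driven mainly by column 5, then column 2
set.seed(7)
X <- matrix(rnorm(200 * 8), ncol = 8)
resid <- X[, 5] + 0.5 * X[, 2] + 0.1 * rnorm(200)
top2 <- screen_candidates(resid, X, unused = seq_len(ncol(X)), max.branch = 2)
```

Only the columns returned by this step would then be scored with the full model fit, which is what makes the limited search cheaper than the full one.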
The lasso algorithm is by far the fastest, followed by the limited greedy search.
So, the best strategy is probably to start by using the lasso. If it fails to
produce acceptable results, then move on to the limited greedy algorithm and finally
the full search.
fit.pci
Fitting of partially cointegrated models
partialAR
Partially autoregressive models
egcm
Engle-Granger cointegration model
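# NOT RUN {
# Simulate a partially cointegrated system; column 1 is used as Y below,
# and the remaining columns are the factors
YX <- rpci(n=1000, beta=c(2,3,4,5,6),
           sigma_C=c(0.1,0.1,0.1,0.1,0.1), rho=0.9, sigma_M=1, sigma_R=1)
# Append five columns of pure noise as irrelevant candidate factors
YXC <- cbind(YX, matrix(rnorm(5000), ncol=5))
# Hedge Y against the simulated factors only
hedge.pci(YX[,1], YX[,2:ncol(YX)])
# Hedge Y against the simulated factors plus the noise columns
hedge.pci(YXC[,1], YXC[,2:ncol(YXC)])
# }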