hedge.pci: Searches for a partially cointegrated hedge for a given time series

Description

Given a time series and a collection of possible factors, finds a subset of the factors that provides the best fit to the given time series using the partially cointegrated model.

Usage

hedge.pci(Y, X, 
  maxfact = 10, 
  lambda = 0, 
  use.multicore = TRUE, 
  minimum.stepsize = 0, 
  verbose = TRUE, 
  exclude.cols = c(), 
  search_type = c("lasso", "full", "limited"), 
  pci_opt_method=c("jp", "twostep"), 
  ...)

Arguments

Y

An N x 1 column vector or data.frame representing the series that is to be hedged.

X

An N x L data.frame, where each column represents a possible factor to be used in a partially cointegrated fit.

maxfact

The maximum number of columns from X that will be selected for modeling Y. Default: 10

lambda

A penalty to be applied to the random walk portion of the partialAR model. A positive value for lambda will drive the model towards a solution with a smaller random walk component. Default: 0

use.multicore

If TRUE, parallel processing will be used to improve performance. See parallel::mclapply. Default: TRUE

minimum.stepsize

If this is non-NA, the search stops when no improvement of at least this size can be found. Default: 0

verbose

If TRUE, then detailed information is printed about the execution. Default: TRUE

exclude.cols

A list of column indexes specifying columns from X which should be excluded from consideration. Alternatively, the excluded columns may be given as a list of strings, in which case they are interpreted as column names. Default: c()

search_type

Specifies how the factors are searched. One of the following:

"lasso" The lasso algorithm (see glmnet) is used to identify the factors that provide the best linear fit to the target sequence.
"full" A greedy algorithm is used to search for factors to be used in the hedge. At each step, all possible additions to the portfolio are considered, and the best one is chosen for inclusion.
"limited" At each iteration, a preliminary screening step is performed to identify the securities with the highest correlations to the residuals of the currently selected portfolio. The top securities from this list are then checked for whether they would improve the portfolio, and the best one is included.

pci_opt_method

Specifies the method that will be used for finding the best fitting model. One of the following:

"jp" The joint-penalty method (see fit.pci)
"twostep" The two-step method (see fit.pci)

Default: "jp"

…

Other parameters to be passed on to the search function. See the source code.

Value

Returns an S3 object of class pci.hedge containing the following fields:

pci

The best partially cointegrated fit that was found

indexes

The indexes of the columns from X that were selected

index_names

The names of the columns from X that were selected

Details

The hedge is constructed by searching for column indices i1, i2, ..., iN from among the columns of X which yield the best fit to the partially cointegrated model:

$$Y_t = \beta_1 X_{t,i_1} + \beta_2 X_{t,i_2} + \cdots + \beta_N X_{t,i_N} + M_t + R_t$$
$$M_t = \rho M_{t-1} + \epsilon_{M,t}$$
$$R_t = R_{t-1} + \epsilon_{R,t}$$
$$-1 < \rho < 1$$
$$\epsilon_{M,t} \sim N(0,\sigma_M^2)$$
$$\epsilon_{R,t} \sim N(0,\sigma_R^2)$$
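To make the roles of the mean-reverting term M_t and the random-walk term R_t concrete, the model above can be simulated in a few lines of base R. This is only an illustrative sketch: the parameter values and the choice of independent random walks for the factors are assumptions made for the example, and the code does not call any function from the package.

```r
# Minimal base-R simulation of the partially cointegrated model.
# All numeric values below are illustrative assumptions.
set.seed(1)
n       <- 500        # number of observations
beta    <- c(2, 3)    # hypothetical factor loadings beta_1, beta_2
rho     <- 0.9        # AR(1) coefficient, must satisfy -1 < rho < 1
sigma_M <- 1          # sd of the mean-reverting innovations
sigma_R <- 1          # sd of the random-walk innovations

# Factors: independent random walks, one column per factor (an assumption)
X <- apply(matrix(rnorm(n * length(beta)), ncol = length(beta)), 2, cumsum)

# M_t = rho * M_{t-1} + eps_M[t], started from M_0 = 0
eps_M <- rnorm(n, sd = sigma_M)
M <- as.numeric(stats::filter(eps_M, rho, method = "recursive"))

# R_t = R_{t-1} + eps_R[t], started from R_0 = 0
eps_R <- rnorm(n, sd = sigma_R)
R <- cumsum(eps_R)

# Y_t = beta_1 * X_{t,1} + beta_2 * X_{t,2} + M_t + R_t
Y <- as.numeric(X %*% beta) + M + R
```

Setting sigma_R to 0 in this sketch removes the random-walk component, which leaves a series that is cointegrated with the factors in the ordinary sense.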

If search_type="lasso" is specified, then the lasso algorithm (see glmnet) is used to search for the factors that give the best linear fit to the target sequence Y. Having determined the list of factors, the cutoff point is determined based on successive improvements to the likelihood score of the fitted model.

Otherwise, a greedy algorithm (search_type="full") or a modified greedy algorithm (search_type="limited") is used. This proceeds by searching through all columns of X (except those listed in exclude.cols) to find the column that gives the best fit to Y, as determined by the likelihood score of the partially cointegrated model. This column becomes the initial hedging portfolio. Having selected columns i1, i2, ..., iK, the next column is found by searching through all remaining columns of X (except those listed in exclude.cols) for the column which gives the best improvement to the partially cointegrated fit. However, if the best improvement is less than minimum.stepsize, or if maxfact columns have already been added, then the search terminates.
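The greedy procedure described above can be sketched in base R as follows. This is not the package's implementation: the names greedy_select and ols_loglik are hypothetical, and the score used here is the log-likelihood of an ordinary least-squares fit, standing in for the likelihood score of the partially cointegrated model. Only the control flow (forward selection, exclude.cols, maxfact, minimum.stepsize) mirrors the description.

```r
# Stand-in score (an assumption): Gaussian log-likelihood of an OLS fit.
# hedge.pci scores candidates with the partially cointegrated model instead.
ols_loglik <- function(Y, X, cols) {
  fit <- lm(Y ~ as.matrix(X[, cols, drop = FALSE]))
  as.numeric(logLik(fit))
}

# Hypothetical helper illustrating greedy forward selection.
greedy_select <- function(Y, X, maxfact = 10, minimum.stepsize = 0,
                          exclude.cols = integer(0), score = ols_loglik) {
  candidates <- setdiff(seq_len(ncol(X)), exclude.cols)
  selected   <- integer(0)
  best_score <- -Inf
  while (length(selected) < maxfact && length(candidates) > 0) {
    # Score every remaining column when added to the current portfolio
    scores <- vapply(candidates,
                     function(j) score(Y, X, c(selected, j)), numeric(1))
    k <- which.max(scores)
    # Stop if the best improvement is smaller than minimum.stepsize
    if (scores[k] - best_score < minimum.stepsize) break
    best_score <- scores[k]
    selected   <- c(selected, candidates[k])
    candidates <- candidates[-k]
  }
  selected
}

# Usage on synthetic data: only columns 2 and 5 drive Y
set.seed(42)
X <- matrix(rnorm(200 * 6), ncol = 6)
Y <- 2 * X[, 2] - 3 * X[, 5] + rnorm(200, sd = 0.1)
sel <- greedy_select(Y, X, maxfact = 2)
```

Because adding a regressor never lowers an OLS log-likelihood, this stand-in score keeps adding columns until maxfact is reached unless minimum.stepsize is set above 0.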

In the case of the modified greedy algorithm (search_type="limited"), a preprocessing step is used at the beginning of each iteration. In this preprocessing step, the correlation is computed between each unused column of X and the residual series of the currently computed best fit. The top B choices are then considered for inclusion in the portfolio, where B is a branching factor. The branching factor can be controlled by setting the value of the optional parameter max.branch. Its default value is 10.
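The screening step can be illustrated with a short base-R sketch. The function name screen_candidates is hypothetical, and two details are assumptions rather than documented behavior: the ranking uses the absolute value of the correlation, and the correlation is computed on the series as given. Only the max.branch parameter and its default of 10 come from the text above.

```r
# Hypothetical helper: rank unused columns of X by their correlation with
# the residual series and keep the top max.branch of them.
screen_candidates <- function(resid, X, unused, max.branch = 10) {
  # Absolute correlation is an assumption made for this sketch
  cors <- vapply(unused, function(j) abs(cor(resid, X[, j])), numeric(1))
  head(unused[order(cors, decreasing = TRUE)], max.branch)
}

# Usage on synthetic data: the residual is mostly explained by column 3
set.seed(7)
X <- matrix(rnorm(300 * 6), ncol = 6)
resid <- X[, 3] + rnorm(300, sd = 0.1)
top <- screen_candidates(resid, X, unused = 1:6, max.branch = 2)
```

Only the columns returned by the screening step would then be scored with the full model fit, which is what makes the limited search cheaper than the full one.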

The lasso algorithm is by far the fastest, followed by the limited greedy search. So, the best strategy is probably to start by using the lasso. If it fails to produce acceptable results, then move on to the limited greedy algorithm and finally the full search.

Examples


# NOT RUN {
library(partialCI)

# Simulate a target series (column 1) and five factors (remaining columns)
YX <- rpci(n=1000, beta=c(2,3,4,5,6), 
  sigma_C=c(0.1,0.1,0.1,0.1,0.1), rho=0.9, sigma_M=1, sigma_R=1)
# Append five columns of white noise that are unrelated to the target
YXC <- cbind(YX, matrix(rnorm(5000), ncol=5))
# Search for a hedge using only the true factors
hedge.pci(YX[,1], YX[,2:ncol(YX)])
# Search again with the noise columns included as candidates
hedge.pci(YXC[,1], YXC[,2:ncol(YXC)])
# }
