NFCP_Kalman_filter: Filter an N-factor commodity pricing model though the Kalman filter

Description

Given a set of parameters of the N-factor model, filter term structure data using the Kalman filter.

Usage

NFCP_Kalman_filter(
  parameter_values,
  parameter_names,
  log_futures,
  dt,
  futures_TTM,
  ME_TTM = NULL,
  verbose = FALSE,
  debugging = FALSE,
  seasonal_trend = NULL
)

Value

NFCP_Kalman_filter returns a numeric object when verbose = F, which corresponds to the log-likelihood of observations. When verbose = T, the NFCP_Kalman_filter function returns a list object of length seven with the following objects:

`Log-Likelihood`	Log-Likelihood of observations.
`Information Criteria`	`vector`. The Akaikie and Bayesian Information Criterion.
`X_t`	`vector`. The final observation of the state vector.
`X`	`matrix`. Optimal one-step-ahead state vector.
`Y`	`matrix`. Estimated futures prices.
`V`	`matrix`. Estimation error.
`Filtered Error`	`matrix`. positive mean error (high bias), negative mean error (low bias),

mean error (bias) and root mean squared error (RMSE)

of the filtered values to observed futures prices. Term Structure Fitmatrix. The mean error (Bias), mean absolute error, standard deviation of error

and root mean squared error (RMSE) of each observed futures contract. Term Structure Volatility Fitmatrix. Theoretical and empirical volatility of observed futures contract returns.

When debugging = T, 9 objects are returned in addition to those returned when verbose = T:

`P_t`	`array`. Covariance matrix of state variables, with the third dimension indexing across time
`F_t`	`vector`. Prediction error variance matrix, with the third dimension indexing across time
`K_t`	`matrix`. Kalman Gain, with the third dimension indexing across time
`d`	`matrix`. d_td[t] (see details)
`Z`	`matrix`. Z_tz[t] (see details)
`G_t`	`matrix`. G_tG[t] (see details)
`c_t`	`vector`. C_tc[t] (see details)
`Q_t`	`matrix`. Q_tQ[t] (see details)
`H`	`matrix`. HH (see details)

Arguments

parameter_values: vector. Numeric parameter values of an N-factor model.
parameter_names: vector. Parameter names, where each element of parameter_names must correspond to its respective value element in object parameter_values.
log_futures: matrix. The natural logarithm of observed futures prices. Each row must correspond to quoted futures prices at a particular date and every column must correspond to a unique futures contract. NA values are allowed.
dt: numeric. Constant, discrete time step of observations, in years.
futures_TTM: vector or matrix. The time-to-maturity of observed futures contracts, in years, at a given observation date. This time-to-maturity can either be constant (ie. class 'vector') or variable (ie. class 'matrix') across observations. The number of columns of 'futures_TTM' must be identical to the number of columns of object 'log_futures'. The number of rows of object 'futures_TTM' must be either 1 or equal to the number of rows of object 'log_futures'.
ME_TTM: vector. the time-to-maturity groupings to consider for observed futures prices. The length of ME_TTM must be equal to the number of 'ME' parameters specified in object 'parameter_names'. The maximum of 'ME_TTM' must be greater than the maximum value of 'futures_TTM'. When the number of 'ME' parameter values is equal to one or the number of columns of object 'log_futures', this argument is ignored.
verbose: logical. Should additional information be output? see values. When verbose = F, the NFCP_Kalman_filter function is significantly faster, see details.
debugging: logical. Should additional filtering information be output? see values.
seasonal_trend: numeric. Optional parameter. This details the trend of the deterministic seasonal component (i.e., where in the season the first observation is located). When not listed, the Kalman filter assumes that observations are at the beginning of the seasonal component.

Details

NFCP_Kalman_filter applies the Kalman filter algorithm for observable log_futures prices against the input parameters of an N-factor model provided through the parameter_values and parameter_names input vectors.

The NFCP_Kalman_filter function is designed for subsequent input into optimization functions and is called within the N-factor parameter estimation function NFCP_MLE. The first input to the NFCP_Kalman_filter function is a vector of parameters of an N-factor model, with elements of this vector corresponding to the parameter names within the elements of input vector parameter_names. When logical input verbose = F, the NFCP_Kalman_filter function calls the fkf_SP function of the FKF_SP package, which itself is a wrapper of a routine of the Kalman filter written in C utilizing Sequential Processing for maximum computational efficiency (see fkf_SP for more details). When verbose = T, the NFCP_Kalman_filter instead applies a Kalman filter algorithm written in base R and outputs several other list objects, including filtered values and measures for model fit and robustness (see Returns)

The N-factor model The N-factor framework was first presented in the work of Cortazar and Naranjo (2006, equations 1-3). It is a risk-premium class of commodity pricing model, in which futures prices are given by discounted expected future spot prices, where these spot prices are discounted at a given level of risk-premium, known as the cost-of-carry.

The N-factor framework describes the spot price process of a commodity as the correlated sum of NN state variables x_tx[t]. The 'NFCP' package also allows for a deterministic, cyclical seasonal function season(t)season(t) to be considered.

When GBM = TRUE: log(S_t) = season(t) + _i=1^N x_i,tlog(S[t]) = season(t) + sum_i=1^n x[i,t] When GBM = FALSE: log(S_t) = E + season(t) + _i=1^N x_i,tlog(S[t]) = E + season(t) + sum_i=1^n x[i,t]

Where GBM determines whether the first factor follows a Brownian Motion or Ornstein-Uhlenbeck process to induce a unit root in the spot price process.

When GBM = TRUE, the first factor corresponds to the spot price, and additional N-1 factors model the cost-of-carry.

When GBM = FALSE, the commodity model assumes that there is a long-term equilibrium the commodity price will tend towards over time, with model volatility a decreasing function of time. This is not the standard approach made in the commodity pricing literature (Cortazar and Naranjo, 2006).

State variables are thus assumed to follow the following processes:

When GBM = TRUE: dx_1,t = ^*dt + _1 dw_1tdx[1,t] = mu^* dt + sigma[1] dw[1]t

When GBM = FALSE: dx_1,t = - (_1 + _1x_1,t)dt + _1 dw_1tdx[1,t] = - (lambda[1] + kappa[1] x[1,t]) dt + sigma[1] dw[t]t

And: dx_i,t =_i1 - (_i + _ix_i,t)dt + _i dw_itdx[i,t] =_(i != 1) - (lambda[i] + kappa[i] x[i,t]dt + sigma[i] dw[i]t)

where: E(w_i)E(w_j) = _i,jE(w[i])E(w[j])

Additionally, the deterministic seasonal function (if specified) is given by:

season(t) = _i=1 ( season_i,1 cos(2i) + season_i,2 sin(2i)season(t) = sum_i=1 ( season_i,1 cos(2.i.pi) + season_i,2 sin(2.i.pi)

The addition of deterministic, cyclical seasonality as a function of trigonometric variables was first suggested by Hannan, Terrell, and Tuckwell (1970) and first applied to model commodities by Sørensen (2002).

The following constant parameters are defined as:

var mu: long-term growth rate of the Brownian Motion process.

var EE: Constant equilibrium level.

var ^*=-_1mu^* = mu-lambda[1]: Long-term risk-neutral growth rate

var _ilambda[i]: Risk premium of state variable ii.

var _ikappa[i]: Reversion rate of state variable ii.

var _isigma[i]: Instantaneous volatility of state variable ii.

var _i,j [-1,1]rho[i,j] in [-1,1]: Instantaneous correlation between state variables ii and jj.

Including additional factors within the spot-price process allow for additional flexibility (and possibly fit) to the term structure of a commodity. The N-factor model nests simpler models within its framework, allowing for the fit of different N-factor models (applied to the same term structure data), represented by the log-likelihood, to be directly compared with statistical testing possible through a chi-squared test.

Disturbances - Measurement Error:

The Kalman filtering algorithm assumes a given measure of measurement error or disturbance in the measurement equation (ie. matrix HH). Measurement errors can be interpreted as error in the model's fit to observed prices, or as errors in the reporting of prices (Schwartz and Smith, 2000). These disturbances are typically assumed independent.

var ME_iME[i] measurement error of contract ii.

where the measurement error of futures contracts ME_iME[i] is equal to 'ME_' [i] (i.e. 'ME_1', 'ME_2', ...) specified in arguments parameter_values and parameter_names.

There are three particular cases on how the measurement error of observations can be treated in the NFCP_Kalman_filter function:

Case 1: Only one ME is specified. The Kalman filter assumes that the measurement error of observations are independent and identical.

Case 2: One ME is specified for every observed futures contract. The Kalman filter assumes that the measurement error of observations are independent and unique.

Case 3: A series of ME's are specified for a given grouping of maturities of futures contracts. The Kalman filter assumes that the measurement error of observations are independent and unique to their respective time-to-maturity.

Grouping of maturities for case 3 is specified through the ME_TTM argument. This is a vector that specifies the maximum maturity to consider for each respective ME parameter argument.

in other words, ME_1 is considered for observations with TTM less than ME_TTM[1], ME_2 is considered for observations with TTM less than ME_TTM[2], ..., etc.

The first case is clearly the simplest to estimate, but can be a restrictive assumption. The second case is clearly the most difficult to estimate, but can be an infeasible assumption when considering all available futures contracts that make up the term structure of a commodity.

Case 3 thus serves to ease the restriction of case 1, and allow the user to make the modeling of measurement error as simple or complex as desired for a given set of maturities.

Kalman Filtering

The following section describes the Kalman filter equations used to filter the N-factor model.

The Kalman filter iteration is characterised by a transition and measurement equation. The transition equation develops the vector of state variables between discretised time steps (whilst considering a given level of covariance between state variables over time). The measurement equation relates the unobservable state vector to a vector of observable measurements (whilst also considering a given level of measurement error). The typical Kalman filter algorithm is a Gaussian process state space model.

Transition Equation: x_t|t-1 = c_t + G_t x_t-1 + Q_t _t hat(x)[t|t-1] = c[t] + G[t] * hat(x)[t-1] Measurement Equation: y_t = d_t + Z_t x_t|t-1 + H_t _that(y)[t] = d[t] + Z[t] * hat(x)[t|t-1]

t = 1, , n t = 1, ..., n

Where _teta[t] and _tepsilon[t] are IID N(0,I(m))N(0,I(m)) and iid N(0,I(d))N(0,I(d)) respectively.

The state vector follows a normal distribution, x_1 N(a_1, P_1)x[1] ~ N(a[1], P[1]), with a_1a[1] and P_1P[1] as the mean vector and variance matrix of the initial state vector x_1x[1], respectively.

The Kalman filter can be used for parameter estimation through the maximization of the Log-Likelihood value. See NFCP_MLE.

Filtering the N-factor model

let mm represent the number of observations at time tt

let nn represent the number of factors in the N-factor model

observable futures prices: y_t = [ln(F(t,T_1)), ln(F(t,T_2)), , ln(F(t,T_m))]'y[t] = [ln(F(t,T[1])), ln(F(t,T[2])), ..., ln(F(t,T[m]))]'

State vector: x_t=[x_1t,x_2t,,x_nt ]'x[t] = [x[1t], x[2t], ..., x[nt]]'

Measurement error: diag(H) = [ME_1^2, ME_2^2, , ME_n^2]diag(H) = [ME[1]^2, ME[2]^2, ..., ME[n]^2]

When the number of specified ME terms is one, s_1 = s_2 = = s_n = s[1] = s[2] = ... = s[n] = ME_1^2ME[1]^2

var ZZ is an m nm X n matrix, where each element [i,j][i,j] is equal to:

Z_i,j = e^-_i T_jZ[i,j] = e^(-kappa[i] * T[j])

var d_td[t] is an m 1m X 1 vector:

d_t=[season(T_1) + A(T_1), season(T_2) + A(T_2), , season(T_m) + A(T_m)]'d[t]=[season(T[1]) + A(T[1]), season(T[2]) + A(T[2]), ..., season(T[m]) + A(T[m])]'

Under the assumption that Factor 1 follows a Brownian Motion, \(A(T)\) is given by: A(T) = ^*T-_i=1^N - 1-e^-_i T_i_i+12(_1^2T + _i.j 1 _i _j _i,j 1-e^-(_i+_j)T_i+_j)A(T) = mu^* * T - sum_i=1^N (1-e^(-kappa[i] T) lambda[i])/(kappa[i]) + 1/2 (sigma[1]^2 * T) + sum_i.j != 1 sigma[i] sigma[j] rho[i,j] (1 - e^(-(kappa[i] + kappa[j]) * T)) / (kappa[i] + kappa[j])

var v_tv[t] is a n 1n X 1 vector of serially uncorrelated Guassian disturbances with E(V_t) = 0E(V[t]) = 0 and cov(v_t)=R^2cov(v[t])=R^2

Where:

diag(G_t) = [e^-_1 , e^-_2 , , e^-_n ]diag(G[t]) = [e^-kappa[1] tau, e^-kappa[2] tau, ..., e^-kappa[n] tau

Where =T-ttau = T - t

var w_tw[t] is an n 1n X 1 vector of serially uncorrelated Guassian disturbances where: E(w_t) = 0E(w[t]) = 0 and cov(w_t) = Q_tcov(w[t]) = Q[t]

var c_t=[ t,0,,0]'c[t] = [mu * Delta t, 0, ..., 0]' is an N 1N X 1 vector of the intercept of the transition equation.

var Q_tQ[t] is equal to the covariance function, given by:

Cov_1,1(x_1,t,x_1,t) = _1^2tCov[1,1](x[1,t],x[1,t]) = sigma[1]^2 * t Cov_i,j(x_i,t,x_j,t) = _i_j_i,j1-e^-(_i+_j)t_i+_jCov[i,j](x[i,t],x[j,t]) = sigma[i] sigma[j] rho[i,j] (1-e^-(kappa[i]+kappa[j]) * t) / (kappa[i] + kappa[j]) (see also cov_func)

Penalising poorly specified models

The Kalman filter returns non-real log-likelihood scores when the prediction error variance matrix becomes singular or its determinant becomes negative. This generally occurs when a poorly specified parameter set is input, such as when measurement error is zero. Non-real log-likelihood scores can break optimization and gradients algorithms and functions. To circumvent this, the NFCP_Kalman_filter returns a heavily penalized log-likelihood score when verbose = F. Penalized log-likelihood scores are calculated by:

stats::runif(1, -2e6, -1e6)

Diffuse Kalman filtering

If the initial values of the state vector are not supplied within the parameter_names and parameter_values vectors, a 'diffuse' assumption is used within the Kalman filtering algorithm. Initial states of factors that follow an Ornstein-Uhlenbeck are assumed to equal zero. The initial state of the first factor, given that it follows a Brownian motion, is assumed equal to the first element of log_futures. This is an assumption that the initial estimate of the spot price is equal to the closest to maturity observed futures price.

The initial states of factors that follow an Ornstein-Uhlenbeck have a transient effect on future observations. This makes the diffuse assumption reasonable and further means that initial states cannot generally be accurately estimated.

References

Hannan, E. J., et al. (1970). "The seasonal adjustment of economic time series." International economic review, 11(1): 24-52.

Anderson, B. D. O. and J. B. Moore, (1979). Optimal filtering Englewood Cliffs: Prentice-Hall.

Fahrmeir, L. and G. tutz,(1994) Multivariate Statistical Modelling Based on Generalized Linear Models. Berlin: Springer.

Schwartz, E. S., and J. E. Smith, (2000). Short-Term Variations and Long-Term Dynamics in Commodity Prices. Manage. Sci., 46, 893-911.

Sørensen, C. (2002). "Modeling seasonality in agricultural commodity futures." Journal of Futures Markets: Futures, Options, and Other Derivative Products 22(5): 393-426.

Cortazar, G., and L. Naranjo, (2006). An N-factor Gaussian model of oil futures prices. Journal of Futures Markets: Futures, Options, and Other Derivative Products, 26(3), 243-268.

Durbin, J., and S. J. Koopman, (2012). Time series analysis by state space methods. Oxford university press.

Examples

Run this code



##Example 1 - complete, stitched data.
##Replicating the Schwartz and Smith (2000)
##Two-Factor commodity pricing model applied to crude oil:

SS_stitched_filtered <- NFCP_Kalman_filter(
parameter_values = SS_oil$two_factor,
parameter_names = names(SS_oil$two_factor),
log_futures = log(SS_oil$stitched_futures),
futures_TTM = SS_oil$stitched_TTM,
## maturity groupings need not be considered here:
ME_TTM = NULL,
dt = SS_oil$dt,
verbose = FALSE)


##Example 2 - incomplete, contract data.
##Replicating the Schwartz and Smith (2000)
##Two-Factor commodity pricing model applied to all available
##crude oil contracts:

SS_2F <- SS_oil$two_factor
##omit stitched contract white noise
SS_2F <- SS_2F[!grepl("ME",
             names(SS_2F))]

# Evaluate two different measurement errors
SS_2F[c("ME_1", "ME_2")] <- c(0.01, 0.04)

## Separate measurement error into two different maturity groupings
SS_ME_TTM <- c(1,3)
## ME_1 is applied for observed contracts with less than one year
## maturity, whilst ME_2 considers contracts with maturity greater
## than one year, and less than three years

#Kalman filter
SS_contract_filtered <- NFCP_Kalman_filter(
parameter_values = SS_2F,
parameter_names = names(SS_2F),
## All available contracts are considered
log_futures = log(SS_oil$contracts),
## Respective 'futures_TTM' of these contracts are input:
futures_TTM = SS_oil$contract_maturities,
ME_TTM = SS_ME_TTM,
dt = SS_oil$dt,
verbose = FALSE)

Run the code above in your browser using DataLab