We assume that for each component
\(k_{1}^{i}=k_{2}^{i}\), that is, the number of lags of \(\mathbf{z}_{t}\) used to
define the dynamic principal component and the number of lags of
\(\widehat{f}^{i}_{t}\) used to reconstruct the original series are the same. The number of components and the number of lags
are chosen to minimize the cross-validated forecasting error in a
stepwise fashion.
Suppose we want to make \(h\)-steps-ahead forecasts.
Let \(w=\) window_size.
Then, for each \(k\in\) k_list, we compute the first ODPC
defined using \(k\) lags, fitted on periods \(1,\dots,T-h-t+1\), for \(t=1,\dots,w\). For each
of these fits we compute an \(h\)-steps-ahead forecast, which targets period \(T-t+1\), and the corresponding
mean squared error \(E_{t,h}\). The cross-validation estimate of the forecasting error
is then
$$
\widehat{MSE}_{1,k}=\frac{1}{w}\sum\limits_{t=1}^{w}E_{t,h}.
$$
We choose for the first component the value \(k^{\ast,1}\) that minimizes \(\widehat{MSE}_{1,k}\).
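The rolling-origin estimate above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `naive_forecast` is a hypothetical stand-in that averages the last \(k+1\) observations and is not an ODPC fit, and the names `cv_mse` and `best_lag` are invented for this sketch. Only the indexing of the training windows and the averaging of \(E_{t,h}\) follow the text.

```python
import numpy as np


def naive_forecast(Z_train, k, h):
    """Hypothetical stand-in for an ODPC fit (NOT the real method).

    Forecasts every series by the mean of its last k + 1 training rows;
    the horizon h is ignored by this toy forecaster.
    """
    return Z_train[-(k + 1):].mean(axis=0)


def cv_mse(Z, k, h, window_size, forecast=naive_forecast):
    """Rolling-origin estimate (1/w) * sum_t E_{t,h} for a given lag k."""
    T = Z.shape[0]
    errors = []
    for t in range(1, window_size + 1):
        train_end = T - h - t + 1        # fit on periods 1, ..., T-h-t+1
        Z_train = Z[:train_end]
        target = Z[train_end + h - 1]    # period T-t+1 (0-based row T-t)
        pred = forecast(Z_train, k, h)
        errors.append(np.mean((target - pred) ** 2))  # E_{t,h}
    return float(np.mean(errors))


def best_lag(Z, k_list, h, window_size):
    """Return the lag in k_list with the smallest cross-validated error."""
    scores = {k: cv_mse(Z, k, h, window_size) for k in k_list}
    k_star = min(scores, key=scores.get)
    return k_star, scores[k_star]
```

Swapping `naive_forecast` for a function that fits the first component with \(k\) lags and returns its \(h\)-steps-ahead forecast would give \(\widehat{MSE}_{1,k}\) as defined above.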
Then, we fix the first component computed with \(k^{\ast,1}\) lags and repeat the
procedure with the second component. If the optimal cross-validated
forecasting error using the two components, \(\widehat{MSE}_{2,k^{\ast,2}}\), is larger than the one using only
one component, \(\widehat{MSE}_{1,k^{\ast,1}}\), we stop and output as the final model the ODPC computed using one component
defined with \(k^{\ast,1}\) lags; otherwise, if max_num_comp \(\geq 2\), we add the second component defined using \(k^{\ast,2}\) lags and proceed as before.
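The stepwise rule can be sketched as a greedy loop. The sketch below is an assumption-laden illustration, not the package's implementation: `cv_error` is a hypothetical callable that returns the cross-validated forecasting error of a model whose \(i\)-th component uses `lags[i]` lags, and `stepwise_select` is a name invented here. Following the text, a new component is rejected only when it makes the error strictly larger.

```python
from typing import Callable, List, Sequence, Tuple


def stepwise_select(
    cv_error: Callable[[List[int]], float],
    k_list: Sequence[int],
    max_num_comp: int,
) -> Tuple[List[int], float]:
    """Greedy forward selection of components and their numbers of lags.

    cv_error(lags) is a hypothetical callable: it stands for the
    cross-validated error of a model whose i-th component uses lags[i] lags.
    """
    chosen: List[int] = []
    best_err = float("inf")
    while len(chosen) < max_num_comp:
        # Earlier components stay fixed; try every lag for the new one.
        scores = {k: cv_error(chosen + [k]) for k in k_list}
        k_star = min(scores, key=scores.get)
        if scores[k_star] > best_err:
            break  # the extra component increases the error: stop
        chosen.append(k_star)
        best_err = scores[k_star]
    return chosen, best_err


# Toy error table (made-up numbers, for illustration only).
toy_table = {
    (1,): 5.0, (2,): 3.0, (3,): 4.0,
    (2, 1): 2.5, (2, 2): 2.0, (2, 3): 2.8,
    (2, 2, 1): 2.1, (2, 2, 2): 2.4, (2, 2, 3): 2.2,
}
lags, err = stepwise_select(lambda l: toy_table[tuple(l)], [1, 2, 3], 5)
print(lags, err)  # [2, 2] 2.0
```

In the toy run the third component is rejected because its best error, 2.1, exceeds the two-component error of 2.0, so the search stops with two components of two lags each.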
This method can be computationally costly, especially for large values of window_size. Ideally, the user should set n_cores_k equal to the length of k_list and n_cores_w equal to window_size; this would entail using n_cores_k times n_cores_w cores in total.
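As a rough illustration of the core budget this implies, the following sketch computes the ideal setting for example values. The helper name `core_plan` and the values of `k_list` and `window_size` are assumptions made for this example; comparing the total with `os.cpu_count()` is only a sanity check and not something the method performs.

```python
import os


def core_plan(k_list, window_size):
    """Return (n_cores_k, n_cores_w, total) for the ideal setting."""
    n_cores_k = len(k_list)      # one core per candidate number of lags
    n_cores_w = window_size      # one core per cross-validation fit
    return n_cores_k, n_cores_w, n_cores_k * n_cores_w


# Example values chosen for illustration only.
n_k, n_w, total = core_plan(k_list=range(1, 6), window_size=10)
print(n_k, n_w, total)  # 5 10 50
print("cores available on this machine:", os.cpu_count())
```

With five candidate lags and a window of ten periods, the ideal setting already asks for 50 cores, so on most machines the two settings have to be scaled down.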