Below, X and Y refer to the random variables, and x and y refer to specific realizations of those random variables.
state estimates (x)
For type="xtT"
, tsSmooth.marssMLE
returns the confidence intervals of the state at time \(t\) conditioned on the data from 1 to \(T\) using the estimated model parameters as true values. These are the standard intervals that are shown for the estimated states in state-space models. For example, see Shumway and Stoffer (2000), edition 4, Figure 6.4. As such, this is probably what you are looking for if you want to put intervals on the estimated states (the \(\mathbf{x}\)). However, these intervals do not include parameter uncertainty. If you want state residuals (for residual analysis), use MARSSresiduals()
or residuals()
.
Quantiles: The state \(\mathbf{X}_t\) in a MARSS model has a conditional multivariate normal distribution, which can be computed from the model parameters and data. In Holmes (2012, Equation 11) notation, its expected value conditioned on all the observed data and the model parameters \(\Theta\) is denoted \(\tilde{\mathbf{x}}_t\) or equivalently \(\mathbf{x}_t^T\) (where the \(T\) superscript is not a power but the upper extent of the time conditioning). In MARSSkf
, this is xtT[,t]
. The variance of \(\mathbf{X}_t\) conditioned on all the observed data and \(\Theta\) is \(\tilde{\mathbf{V}}_t\) (VtT[,,t]
). Note that VtT[,,t] != B VtT[,,t-1] t(B) + Q
, which you might think by looking at the MARSS equations. That is because the variance of \(\mathbf{W}_t\) conditioned on the data (past, current and FUTURE) is not equal to \(\mathbf{Q}\) (\(\mathbf{Q}\) is the unconditional variance).
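The inequality above can be checked numerically. The following is a minimal sketch, not part of the original documentation: the model (a single hidden state observed by all the harbor seal series) and the time index tt are assumptions chosen only for illustration.

```r
library(MARSS)

# Assumed example model: one hidden state observed by every series
dat <- t(harborSealWA[, -1])
fit <- MARSS(dat,
  model = list(Z = factor(rep(1, nrow(dat))), R = "diagonal and equal"),
  silent = TRUE
)

kf <- MARSSkf(fit)                  # VtT is an m x m x T array
pars <- coef(fit, type = "matrix")  # estimated parameter matrices
B <- pars$B
Q <- pars$Q
m <- nrow(kf$xtT)

tt <- 10  # arbitrary time step used for the comparison
smoothed <- matrix(kf$VtT[, , tt], m, m)
V.prev <- matrix(kf$VtT[, , tt - 1], m, m)
propagated <- B %*% V.prev %*% t(B) + Q

# The two values differ because Q is the unconditional variance of W_t
c(smoothed = smoothed[1, 1], propagated = propagated[1, 1])
```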
\(\mathbf{x}_t^T\) (xtT[,t]
) is an estimate of \(\mathbf{x}_t\) and the variance of that estimate is given by \(\mathbf{V}_t^T\) (VtT[,,t]
). Let se.xt
denote the sqrt of the diagonal of VtT
. The lower limit of the \(1-\alpha\) confidence interval is (qnorm(alpha/2)*se.xt + xtT
); the upper limit uses qnorm(1-alpha/2). \(\mathbf{x}_t\) is multivariate and this interval is for one of the \(x\)'s in isolation. You could also compute the m-dimensional confidence region for the multivariate \(\mathbf{x}_t\), but tsSmooth.marssMLE
returns the univariate confidence intervals.
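As a rough illustration, the univariate intervals can be rebuilt by hand from the MARSSkf() output and compared with what tsSmooth() reports. This is a sketch under assumptions: the model fitted to harborSealWA is chosen only as an example, and alpha = 0.05 is an arbitrary choice. The column names .estimate, .conf.low and .conf.up are those used in the tsSmooth() output.

```r
library(MARSS)

# Assumed example model: one hidden state observed by every series
dat <- t(harborSealWA[, -1])
fit <- MARSS(dat,
  model = list(Z = factor(rep(1, nrow(dat))), R = "diagonal and equal"),
  silent = TRUE
)

kf <- MARSSkf(fit)  # xtT is m x T, VtT is m x m x T

# Standard errors: square root of the diagonal of VtT at each time step
se.xt <- matrix(sqrt(apply(kf$VtT, 3, diag)), nrow = nrow(kf$xtT))

alpha <- 0.05
lower <- kf$xtT + qnorm(alpha / 2) * se.xt
upper <- kf$xtT + qnorm(1 - alpha / 2) * se.xt

# Intervals reported by tsSmooth() for the same confidence level
sm <- tsSmooth(fit, type = "xtT", interval = "confidence", level = 1 - alpha)
head(cbind(
  sm[, c(".estimate", ".conf.low", ".conf.up")],
  byhand.low = lower[1, ],
  byhand.up = upper[1, ]
))
```

The two sets of limits should agree up to numerical error, since both use the estimated parameters as if they were the true values.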
The variance VtT
gives information on the uncertainty of the true location of \(\mathbf{x}_t\) conditioned on the observed data. As more data are collected (or added to the analysis), this variance will shrink since the data, especially data at time \(t\), increase the information about the location of \(\mathbf{x}_t\). This does not affect the estimation of the model parameters, which are assumed fixed; it affects only our information about the states at time \(t\).
If you have a DFA model (form='dfa'), you can pass in rotate=TRUE
to return the rotated trends. If you want the rotated loadings, you will need to compute those yourself:
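library(MARSS)
# Fit a DFA model with two trends to the harbor seal data
dfa <- MARSS(t(harborSealWA[,-1]), model=list(m=2), form="dfa")
# Estimated loadings matrix and its varimax rotation matrix
Z.est <- coef(dfa, type="matrix")$Z
H.inv <- varimax(Z.est)$rotmat
Z.rot <- Z.est %*% H.inv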
For type="xtt"
and type="xtt1"
, the calculations and interpretations of the intervals are the same, but the conditioning is on the data from time 1 to \(t\) or from time 1 to \(t-1\), respectively.
observation estimates (y)
For type="ytT"
, this returns the expected value and standard error of \(\mathbf{Y}_t\) (left-hand side of the \(\mathbf{y}\) equation) conditioned on the observed data \(\mathbf{Y}=\mathbf{y}(1:T)\). If you have no missing data, this just returns your data set. But if you have missing data, this is what you want in order to estimate the values of the missing data in your data set. The expected value of \(\mathbf{Y}_t|\mathbf{Y}=\mathbf{y}(1:T)\) is in ytT
in MARSShatyt()
output and the variance is OtT-tcrossprod(ytT)
from the MARSShatyt()
output.
The intervals reported by tsSmooth.marssMLE
for the missing values take into account all the information in the data, specifically the correlation with other data at time \(t\) if \(\mathbf{R}\) is not diagonal. This is what you want to use for interpolating missing data. You do not want to use predict.marssMLE()
as those predictions are for entirely new data sets and thus will ignore relevant information if \(\mathbf{y}_t\) is multivariate, not all \(\mathbf{y}_t\) are missing, and the \(\mathbf{R}\) matrix is not diagonal.
The standard error and confidence interval for the expected value of the missing data, along with the standard deviation and prediction interval for the missing data, are reported. The former uses the variance of \(\textrm{E}[\mathbf{Y}_t]\) conditioned on the data while the latter uses the variance of \(\mathbf{Y}_t\) conditioned on the data. MARSShatyt()
returns these variances and expected values. See Holmes (2012) for a discussion of the derivation of expectation and variance of \(\mathbf{Y}_t\) conditioned on the observed data (in the section 'Computing the expectations in the update equations').
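To make the role of ytT and OtT concrete, here is a minimal sketch that pulls the expected value and conditional variance of \(\mathbf{Y}_t\) from the MARSShatyt() output at one time step with missing observations. The fitted model is an assumption for illustration only, and the variable names t.miss and var.ytT are hypothetical. Note that this example uses a diagonal \(\mathbf{R}\), so it does not show the effect of correlated observation errors.

```r
library(MARSS)

# harborSealWA contains missing values (NA)
dat <- t(harborSealWA[, -1])
fit <- MARSS(dat,
  model = list(Z = factor(rep(1, nrow(dat))), R = "diagonal and equal"),
  silent = TRUE
)

hat <- MARSShatyt(fit)  # ytT is n x T, OtT is n x n x T

# First time step with at least one missing observation
t.miss <- which(colSums(is.na(dat)) > 0)[1]

# Variance of Y_t conditioned on all the data: OtT - ytT ytT'
var.ytT <- hat$OtT[, , t.miss] - tcrossprod(hat$ytT[, t.miss])

data.frame(
  observed = dat[, t.miss],
  expected = hat$ytT[, t.miss],
  variance = diag(var.ytT)
)

# Estimates and prediction intervals for all series and time steps
sm <- tsSmooth(fit, type = "ytT", interval = "prediction")
head(sm)
```

For entries that were observed, the expected value is the observation itself and the conditional variance is zero up to numerical error; only the missing entries carry uncertainty.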
For type="ytt"
, only the estimates are provided. MARSShatyt()
does not return the variance matrices needed to compute standard errors in this case.