continuous_ss_sdf: SDF model selection with continuous spike-and-slab prior

Description

This function provides the SDF model selection procedure using the continuous spike-and-slab prior. See Propositions 3 and 4 in Bryzgalova, Huang, and Julliard (2023).

Usage

continuous_ss_sdf(
  f,
  R,
  sim_length,
  psi0 = 1,
  r = 0.001,
  aw = 1,
  bw = 1,
  type = "OLS",
  intercept = TRUE
)

Value

The return of continuous_ss_sdf is a list of the following elements:

gamma_path: A sim_length$\times k$ matrix of the posterior draws of $\gamma$. Each row represents a draw. If $\gamma_j = 1$ in a draw, factor $j$ is included in the model in that draw; if $\gamma_j = 0$, it is excluded.
lambda_path: A sim_length$\times (k+1)$ matrix of the risk prices $\lambda$ if intercept = TRUE. Each row represents a draw. The first column is $\lambda_c$, corresponding to the constant term. The next $k$ columns (i.e., columns 2 through $k+1$) are the risk prices of the $k$ factors. If intercept = FALSE, lambda_path is a sim_length$\times k$ matrix of the risk prices, without the estimates of $\lambda_c$.
sdf_path: A sim_length$\times t$ matrix of posterior draws of SDFs. Each row represents a draw.
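bma_sdf: The Bayesian model averaging of the SDF (BMA-SDF), i.e., the average of the SDF over the posterior draws (see Details).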
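
As a rough illustration of how the returned elements are typically summarized, the sketch below builds a mock list with the same shapes as gamma_path and lambda_path (for intercept = TRUE). The object `out` and its random contents are placeholders and are not produced by continuous_ss_sdf; only the indexing pattern is the point.

```r
# Mock object with the documented shapes (random placeholders, NOT real output).
set.seed(3)
sim_length <- 1000
k <- 3
out <- list(
  gamma_path  = matrix(rbinom(sim_length * k, 1, 0.5), sim_length, k),
  lambda_path = matrix(rnorm(sim_length * (k + 1)), sim_length, k + 1)
)

# Posterior inclusion probability of each factor: share of draws with gamma_j = 1.
incl_prob <- colMeans(out$gamma_path)

# With intercept = TRUE the first column is lambda_c; the rest are factor risk prices.
lambda_c <- out$lambda_path[, 1]
lambda_f <- out$lambda_path[, -1, drop = FALSE]

# Posterior means and 95% credible intervals of the factor risk prices.
lambda_mean <- colMeans(lambda_f)
lambda_ci   <- apply(lambda_f, 2, quantile, probs = c(0.025, 0.975))
```

With intercept = FALSE there is no $\lambda_c$ column, so the `[, -1]` step would be dropped.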

Arguments

f: A matrix of factors with dimension $t \times k$, where $k$ is the number of factors and $t$ is the number of periods;
R: A matrix of test assets with dimension $t \times N$, where $t$ is the number of periods and $N$ is the number of test assets;
sim_length: The length of the Monte Carlo simulation, i.e., the number of posterior draws;
psi0: The hyper-parameter in the prior distribution of risk prices (see Details);
r: The hyper-parameter related to the prior of risk prices (see Details);
aw: The hyper-parameter related to the prior of $\gamma$ (see Details);
bw: The hyper-parameter related to the prior of $\gamma$ (see Details);
type: If type = 'OLS' (type = 'GLS'), the function returns Bayesian OLS (GLS) estimates of risk prices. The default is 'OLS'.
intercept: If intercept = TRUE (intercept = FALSE), we include (exclude) the common intercept in the cross-sectional regression. The default is intercept = TRUE.

Details

To model the variable selection procedure, we introduce a vector of binary latent variables $\gamma^\top = (\gamma_0,\gamma_1,\dots,\gamma_K)$, where $\gamma_j \in \{0,1\}$. When $\gamma_j = 1$, factor $j$ (with associated loadings $C_j$) is included in the model; when $\gamma_j = 0$, it is excluded.

The continuous spike-and-slab prior of risk prices $\lambda$ is $$ \lambda_j | \gamma_j, \sigma^2 \sim N (0, r(\gamma_j) \psi_j \sigma^2 ) .$$ When factor $j$ is included, we have $r(\gamma_j = 1) = 1$. When the factor is excluded from the model, $r(\gamma_j = 0) = r \ll 1$. Hence, the Dirac "spike" is replaced by a Gaussian spike, which is extremely concentrated at zero (the default value for $r$ is 0.001). If intercept = TRUE, we choose $\psi_j = \psi \tilde{\rho}_j^\top \tilde{\rho}_j$, where $\tilde{\rho}_j = \rho_j - (\frac{1}{N} \sum_{i=1}^{N} \rho_{j,i}) \times 1_N$ is the cross-sectionally demeaned vector of factor $j$'s correlations with asset returns. If instead intercept = FALSE, we choose $\psi_j = \psi \rho_j^\top \rho_j$. In the code, $\psi$ is equal to the value of psi0.
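
The scaling $\psi_j$ can be sketched in a few lines of base R. This illustrates the formulas above only and is not the package's internal code: the simulated `f` and `R`, and all variable names other than psi0 and r, are assumptions made for the example.

```r
# Sketch of psi_j under the two intercept settings (simulated placeholder data).
set.seed(1)
t_len <- 200; N <- 10; k <- 2
f <- matrix(rnorm(t_len * k), t_len, k)                    # placeholder factors, t x k
R <- f %*% matrix(rnorm(k * N), k, N) +
  matrix(rnorm(t_len * N), t_len, N)                       # placeholder returns, t x N

psi0 <- 1      # plays the role of psi
r    <- 0.001  # spike scaling r(gamma_j = 0)

rho <- cor(R, f)                               # N x k: column j holds rho_j
rho_demeaned <- sweep(rho, 2, colMeans(rho))   # cross-sectional demeaning of each rho_j

psi_with_intercept    <- psi0 * colSums(rho_demeaned^2)  # psi * rho_tilde_j' rho_tilde_j
psi_without_intercept <- psi0 * colSums(rho^2)           # psi * rho_j' rho_j

# Prior variance of lambda_j, in units of sigma^2, for included vs. excluded factors.
slab_scale  <- 1 * psi_with_intercept
spike_scale <- r * psi_with_intercept
```

A factor whose correlations with the test assets are all close to zero, or (with an intercept) nearly constant across assets, gets a small $\psi_j$, so its risk price is shrunk toward zero even when it is included.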

The prior $\pi (\omega)$ encodes the belief about the sparsity of the true model. Following the variable selection literature, we set $$ \pi (\gamma_j = 1 | \omega_j) = \omega_j, \ \ \omega_j \sim Beta(a_\omega, b_\omega) . $$ The hyperparameters $a_\omega$ and $b_\omega$ determine whether one a priori favors more parsimonious models or not. The default values are $a_\omega = 1$ (aw) and $b_\omega = 1$ (bw).
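
To see what this prior implies, one can simulate from it directly. The sketch below draws $\omega_j$ and then $\gamma_j$ for a hypothetical set of $k = 5$ factors; the names `k`, `n_draws`, `omega`, and `gamma` are illustrative. Marginally, $\pi(\gamma_j = 1) = a_\omega / (a_\omega + b_\omega)$, which is 0.5 under the defaults.

```r
# Simulate the Beta-Bernoulli prior on factor inclusion (illustrative values).
set.seed(42)
aw <- 1; bw <- 1
k <- 5            # hypothetical number of factors
n_draws <- 20000  # number of prior draws

# omega_j ~ Beta(aw, bw), then gamma_j | omega_j ~ Bernoulli(omega_j).
omega <- matrix(rbeta(n_draws * k, aw, bw), n_draws, k)
gamma <- matrix(rbinom(n_draws * k, size = 1, prob = omega), n_draws, k)

prior_inclusion  <- colMeans(gamma)  # Monte Carlo estimate of pi(gamma_j = 1)
prior_model_size <- rowSums(gamma)   # number of included factors in each draw
```

Raising bw relative to aw lowers the prior inclusion probability and hence favors sparser models; raising aw does the opposite.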

For each posterior draw of factors' risk prices $\lambda^{(j)}_f$, we can define the SDF as $m^{(j)}_t = 1 - (f_t - \mu_f)^\top \lambda^{(j)}_f$. The Bayesian model averaging of the SDF (BMA-SDF) over $J$ draws is $$m^{bma}_t = \frac{1}{J} \sum^J_{j=1} m^{(j)}_t.$$
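
The two formulas above translate into a short matrix computation. The sketch below is illustrative rather than the package's implementation: `lambda_f` holds placeholder draws (in practice these would be the factor columns of lambda_path), and $\mu_f$ is replaced by the sample mean of the factors, which is an assumption of this example.

```r
# Sketch: SDF draws and their BMA from draws of factor risk prices (placeholder data).
set.seed(7)
t_len <- 120; k <- 2; J <- 500
f <- matrix(rnorm(t_len * k), t_len, k)                          # placeholder factors, t x k
lambda_f <- matrix(rnorm(J * k, mean = 0.1, sd = 0.05), J, k)    # placeholder draws, J x k

# f_t - mu_f, with mu_f estimated by the sample mean (assumption).
f_demeaned <- sweep(f, 2, colMeans(f))

# Row j is the SDF path m_t^(j) = 1 - (f_t - mu_f)' lambda_f^(j), for t = 1, ..., t_len.
sdf_draws <- 1 - lambda_f %*% t(f_demeaned)   # J x t

# BMA-SDF: average over the J draws, period by period.
bma <- colMeans(sdf_draws)
```

Because $m^{(j)}_t$ is linear in $\lambda^{(j)}_f$, the BMA-SDF computed this way equals the SDF evaluated at the average of the $\lambda_f$ draws.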

References

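Bryzgalova S, Huang J, Julliard C (2023). "Bayesian Solutions for the Factor Zoo: We Just Ran Two Quadrillion Models." Journal of Finance, 78(1), 487-557.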

Examples

## Load the example data
data("BFactor_zoo_example")
HML <- BFactor_zoo_example$HML
lambda_ols <- BFactor_zoo_example$lambda_ols
R2.ols.true <- BFactor_zoo_example$R2.ols.true
sim_f <- BFactor_zoo_example$sim_f
sim_R <- BFactor_zoo_example$sim_R
uf <- BFactor_zoo_example$uf

## sim_f: simulated strong factor
## uf: simulated useless factor

## Map a prior Sharpe ratio of 0.1 into the hyper-parameter psi0
psi_hat <- psi_to_priorSR(sim_R, cbind(sim_f, uf), priorSR = 0.1)
shrinkage <- continuous_ss_sdf(cbind(sim_f, uf), sim_R, 5000,
                               psi0 = psi_hat, r = 0.001, aw = 1, bw = 1)
cat("Null hypothesis: lambda =", 0, "for each factor", "\n")
cat("Posterior probabilities of rejecting the above null hypotheses are:",
    colMeans(shrinkage$gamma_path), "\n")

## We also have the posterior draws of SDF: m(t) = 1 - lambda_g %*% (f(t) - mu_f)
sdf_path <- shrinkage$sdf_path

## We also provide the Bayesian model averaging of the SDF (BMA-SDF)
bma_sdf <- shrinkage$bma_sdf
