Main Algorithm for GMJMCMC (Genetically Modified MJMCMC)
gmjmcmc(
x,
y,
transforms,
P = 10,
N = 100,
N.final = NULL,
probs = NULL,
params = NULL,
loglik.pi = NULL,
loglik.alpha = gaussian.loglik.alpha,
mlpost_params = list(family = "gaussian", beta_prior = list(type = "g-prior")),
intercept = TRUE,
fixed = 0,
sub = FALSE,
verbose = TRUE
)
A list containing the following elements:
All models per population.
All models accepted by mjmcmc per population.
All features per population.
Marginal feature probabilities per population.
Marginal feature probabilities per population.
Marginal feature probabilities per population.
Best marginal model probability per population.
Acceptance rate per population.
Overall acceptance rate.
Best marginal model probability throughout the run, represented as the maximum value in unlist(best.margs).
x: The design matrix containing the data to use in the algorithm.
y: The response variable.
transforms: A character vector with the names of the non-linear functions to be used by the modification and projection operators.
P: The number of population iterations for GMJMCMC. The default, P = 10, was used in our initial example for illustrative purposes; a larger value, such as P = 50, is typically more appropriate for practical applications.
N: The number of MJMCMC iterations per population. The default is N = 100; for real applications, a larger value such as N = 1000 or higher is often preferable.
N.final: The number of MJMCMC iterations performed for the final population. By default, N.final = N, but for practical applications a much larger value (e.g., N.final = 1000) is recommended. Increasing N.final is particularly important if predictions and inferences are based solely on the last population.
probs: A list of the probability vectors used by GMJMCMC, generated by gen.probs.gmjmcmc. The key component probs.gen defines the probabilities of the different operators in the feature generation process. Defaults typically favor interactions and modifications (0.4 each) over projections and mutations (0.1 each) to encourage interpretable nonlinear features.
params: A list of the parameter vectors used by GMJMCMC, generated by gen.params.gmjmcmc.
loglik.pi: A function specifying the marginal log-posterior of the model up to a constant, including the logarithm of the model prior: \(\log p(M|Y) = \text{const} + \log p(Y|M) + \log p(M)\). Typically assumes a Gaussian model with Zellner's g-prior, with \(g = \max(n, p^2)\) by default.
loglik.alpha: Relevant only if the non-linear projection features depend on parameters \(\alpha\). If \(\alpha\) is estimated, this argument specifies the corresponding marginal log-likelihood. The default method sets all \(\alpha\) to 1 (fastest, but sometimes suboptimal). Alternative estimation strategies ("deep" and "random") are implemented in FBMS.
mlpost_params: All parameters for the estimator function loglik.pi.
intercept: Logical. Whether to include an intercept in the design matrix. Default is TRUE. No variable selection is performed on the intercept.
fixed: Integer specifying the number of leading columns in the design matrix to always include in the model. Default is 0.
sub: Logical. If TRUE, uses subsampling or a stochastic approximation approach to the likelihood rather than the full likelihood. Default is FALSE.
verbose: Logical. Whether to print messages during execution. Default is TRUE for gmjmcmc and FALSE for the parallel version.
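The two numeric defaults quoted above, g = max(n, p^2) for the g-prior and the 0.4/0.1 operator probabilities, can be illustrated with a small base R sketch. It calls no FBMS functions. The operator labels are hypothetical names chosen for readability, and p is taken here to be the number of columns of x, which is an assumption rather than something the text states.

```r
# Illustration only: base R, no FBMS functions are called.
n <- 100  # number of observations (rows of x)
p <- 6    # assumed meaning of p: number of covariates (columns of x)
g <- max(n, p^2)  # default g for Zellner's g-prior as described above

# Hypothetical labels for the four feature-generation operators,
# paired with the typical default probabilities quoted above.
operator_probs <- c(interaction = 0.4, modification = 0.4,
                    projection = 0.1, mutation = 0.1)

# Draw many operators to see how often each kind would be proposed.
set.seed(1)
draws <- sample(names(operator_probs), size = 1000, replace = TRUE,
                prob = operator_probs)
round(prop.table(table(draws)), 2)
```

With these sizes p^2 = 36 is smaller than n, so g equals n; p^2 only takes over once the design matrix has more than 10 columns.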
library(FBMS)

# Simulated data: 100 observations, 6 covariates, pure-noise response
result <- gmjmcmc(y = matrix(rnorm(100), 100),
                  x = matrix(rnorm(600), 100),
                  P = 2,
                  transforms = c("p0", "exp_dbl"))
summary(result)
plot(result)
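The example above uses P = 2 to keep the run short. A sketch of a call with settings in the range the argument descriptions recommend for practical use might look as follows. The iteration budget assumes that the final population is one of the P populations and runs N.final iterations instead of N; that is an interpretation of the argument descriptions and not a documented formula. The run_fit flag is a hypothetical switch added here so the block can be sourced without starting a long run.

```r
# Settings in the range recommended for practical applications.
settings <- list(P = 50, N = 1000, N.final = 1000)

# Rough iteration budget, assuming the final population is one of the
# P populations and runs N.final iterations instead of N.
total_iterations <- (settings$P - 1) * settings$N + settings$N.final

run_fit <- FALSE  # hypothetical switch: set to TRUE to fit (may be slow)
if (run_fit && requireNamespace("FBMS", quietly = TRUE)) {
  set.seed(1)
  x <- matrix(rnorm(600), 100)  # same simulated data shape as above
  y <- matrix(rnorm(100), 100)
  result <- do.call(
    FBMS::gmjmcmc,
    c(list(x = x, y = y, transforms = c("p0", "exp_dbl"),
           verbose = FALSE),
      settings)
  )
  summary(result)
}
```

Keeping the settings in a list makes it easy to compare the budget of a quick exploratory run against a production run before committing to the longer one.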