netgwas (version 0.0.1-1)

selectnet: Model selection

Description

Estimates the optimal regularization parameter at EM convergence based on different information criteria.

Usage

selectnet(netgwas.obj, opt.index= NULL, criteria= NULL, ebic.gamma=0.5, 
		   ncores= NULL, verbose= TRUE)

Arguments

netgwas.obj

An object with S3 class "netgwas"

opt.index

If opt.index = NULL (the default), the program determines the optimal graph internally. Otherwise, supply an index to manually choose the optimal graph from the graph path.

criteria

Model selection criterion. "ebic" and "aic" are available. BIC model selection can be obtained by using criteria = "ebic" with ebic.gamma = 0. Applicable only if opt.index = NULL.

ebic.gamma

The tuning parameter for ebic. Setting ebic.gamma = 0 results in bic model selection. The default value is 0.5. Applicable only if opt.index = NULL.

ncores

The number of cores to use for the calculations. Using ncores = "all" automatically detects the number of available cores and runs the computations in parallel.

verbose

If verbose = FALSE, printing information is disabled. The default value is TRUE. Applicable only if opt.index = NULL.

Value

An object with S3 class "selectnet" is returned:

opt.adj

The optimal graph selected from the graph path

opt.theta

The optimal precision matrix from the graph path

opt.sigma

The optimal covariance matrix from the graph path

ebic.scores

Extended BIC scores for regularization parameter selection at the EM convergence. Applicable if opt.index = NULL.

opt.index

The index of optimal regularization parameter.

opt.rho

The selected regularization parameter.

par.cor

A partial correlation matrix.

and anything else that is included in the input netgwas.obj.
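The relationship between the returned matrices can be illustrated with a small base-R sketch. It does not call netgwas; the toy precision matrix and the names theta, sigma, adj, and par_cor are made up for illustration, and the partial correlation formula is the standard identity for Gaussian graphical models. Whether selectnet computes par.cor in exactly this way is an assumption.

```r
# Toy 3x3 precision matrix (made-up values; symmetric and positive definite).
theta <- matrix(c( 2.0, -0.8,  0.0,
                  -0.8,  2.0, -0.5,
                   0.0, -0.5,  1.5), nrow = 3, byrow = TRUE)

# The covariance matrix is the inverse of the precision matrix.
sigma <- solve(theta)

# Adjacency matrix: an edge wherever an off-diagonal precision entry is non-zero.
adj <- (theta != 0) * 1
diag(adj) <- 0

# Standard identity: rho_ij = -theta_ij / sqrt(theta_ii * theta_jj).
# cov2cor() rescales by the diagonal, so negating it gives the off-diagonals.
par_cor <- -cov2cor(theta)
diag(par_cor) <- 1

round(par_cor, 3)
```

In this toy case variables 1 and 3 have no edge, so their partial correlation is zero, while variables 1 and 2 have a partial correlation of 0.4.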

Details

This function computes the extended Bayesian information criterion (ebic), the Bayesian information criterion (bic), and the Akaike information criterion (aic) at EM convergence, based on the observed or joint log-likelihood. The observed log-likelihood can be obtained through

$$\ell_Y(\widehat{\Theta}_\lambda) = Q(\widehat{\Theta}_\lambda | \widehat{\Theta}^{(m)}) - H (\widehat{\Theta}_\lambda | \widehat{\Theta}^{(m)}),$$

where \(Q\) can be calculated from the netmap, netsnp, or netphenogeno function, and the \(H\) function is $$H(\widehat{\Theta}_\lambda | \widehat{\Theta}^{(m)}) = E_z[\ell_{Z | Y}(\widehat{\Theta}_\lambda) | Y; \widehat{\Theta}_\lambda] = E_z[\log f(z) | Y; \widehat{\Theta}_\lambda] - \log p(y).$$

The "ebic" and "aic" model selection criteria can be obtained as follows: $$ebic(\lambda) = -2 \ell(\widehat{\Theta}_\lambda) + ( \log n + 4 \gamma \log p) df(\lambda)$$

$$aic(\lambda) = -2 \ell(\widehat{\Theta}_\lambda) + 2 df(\lambda)$$ where \(df\) refers to the number of non-zero off-diagonal elements of \(\widehat{\Theta}_\lambda\), \(n\) is the sample size, \(p\) is the number of variables, and \(\gamma \in [0, 1]\). A typical value for ebic.gamma is 1/2, but it can also be tuned by experience. Fixing ebic.gamma = 0 results in bic model selection.
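As a worked illustration of the two formulas above, the following base-R sketch scores a path of three candidate models. The helper names count_df, ebic_score, and aic_score are hypothetical and not part of netgwas, and the log-likelihood and df values are made up; it is a sketch of the arithmetic only, not the package's implementation.

```r
# Hypothetical helpers; not part of the netgwas API.

# df: number of non-zero off-diagonal entries of a precision matrix,
# as stated in the text (some implementations count each edge once instead).
count_df <- function(theta, tol = 1e-8) {
  off <- theta[row(theta) != col(theta)]
  sum(abs(off) > tol)
}

# ebic(lambda) = -2 * loglik + (log n + 4 * gamma * log p) * df
ebic_score <- function(loglik, df, n, p, gamma = 0.5) {
  -2 * loglik + (log(n) + 4 * gamma * log(p)) * df
}

# aic(lambda) = -2 * loglik + 2 * df
aic_score <- function(loglik, df) {
  -2 * loglik + 2 * df
}

# Made-up log-likelihoods and df along a path of three penalties.
loglik <- c(-520, -480, -470)
df     <- c(10, 40, 90)
n <- 100
p <- 50

ebic <- ebic_score(loglik, df, n, p, gamma = 0.5)
bic  <- ebic_score(loglik, df, n, p, gamma = 0)   # gamma = 0 reduces ebic to bic
aic  <- aic_score(loglik, df)

which.min(ebic)  # 1: the sparsest model wins under the heavier penalty
which.min(aic)   # 2: aic penalizes df less and picks a denser model
```

The index with the smallest score plays the role of opt.index, and the corresponding penalty value plays the role of opt.rho.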

References

1. Behrouzi, P., and Wit, E. C. (2017a). Detecting Epistatic Selection with Partially Observed Genotype Data Using Copula Graphical Models. arXiv preprint, arXiv:1710.00894.
2. Behrouzi, P., and Wit, E. C. (2017c). netgwas: An R Package for Network-Based Genome-Wide Association Studies. arXiv preprint, arXiv:1710.01236.
3. Ibrahim, J. G., Zhu, H., and Tang, N. (2012). Model selection criteria for missing-data problems using the EM algorithm. Journal of the American Statistical Association.
4. Witten, D., and Friedman, J. (2011). New insights and faster computations for the graphical lasso. Journal of Computational and Graphical Statistics, to appear.
5. Friedman, J., Hastie, T., and Tibshirani, R. (2007). Sparse inverse covariance estimation with the lasso. Biostatistics.
6. Foygel, R., and Drton, M. (2010). Extended Bayesian information criteria for Gaussian graphical models. In Advances in Neural Information Processing Systems, pp. 604-612.

See Also

netmap, netsnp, netphenogeno

Examples

# NOT RUN {
		#simulate data
		D <- simgeno(p=50, n=100, k= 3, adjacent = 3, alpha = 0.06 , beta = 0.06)
		plot(D)

		#explore intra- and inter-chromosomal interactions
		out  <-  netsnp(y=D$data, n.rho= 5, ncores= 1)
		plot(out)

		#different graph selection methods
		sel.ebic1 <- selectnet(out, criteria = "ebic")
		plot(sel.ebic1)

		#ebic with a stronger penalty; ebic.gamma must lie in [0, 1]
		sel.ebic2 <- selectnet(out, criteria = "ebic", ebic.gamma = 1)
		plot(sel.ebic2)

		sel.aic <- selectnet(out, criteria = "aic")
		plot(sel.aic)

		sel.bic <- selectnet(out, criteria = "ebic", ebic.gamma = 0)
		plot(sel.bic)
	
# }
