
BGLR (version 1.0.1)

BGLR: Bayesian Generalized Linear Regression

Description

The BGLR ("Bayesian Generalized Linear Regression") function fits various types of parametric and semi-parametric Bayesian regressions to continuous (censored or not), binary and ordinal outcomes.

Usage

BGLR(y, response_type = "gaussian", a = NULL, b = NULL, ETA = NULL, nIter = 1500,
     burnIn = 500, thin = 5, saveAt = "", S0 = NULL,
     df0 = 5, R2 = 0.5, weights = NULL,
     verbose = TRUE, rmExistingFiles = TRUE)

Arguments

y
(numeric, $n$) the data-vector (NAs allowed).
response_type
(string) admits values "gaussian" or "ordinal". The Gaussian outcome may be censored or not (see below). If response_type="gaussian", y should be coercible to numeric; if response_type="ordinal", y should be coercible to a factor.
a,b
(numeric, $n$) only required for censored outcomes; a and b are vectors specifying lower and upper bounds for censored observations, respectively. The default value, NULL, is valid for non-censored and ordinal outcomes.
ETA
(list) This is a two-level list used to specify the regression function (or linear predictor). By default the linear predictor (the conditional expectation function in case of Gaussian outcomes) includes only an intercept. Regression on covariates and other types of random effects are specified as elements of this list, each with its own prior (see Details; a short example follows the argument list).
weights
(numeric, $n$) a vector of weights, may be NULL. If weights is not NULL, the residual variance of each data-point is set to be proportional to the square of the weight. Only used with Gaussian outcomes.
nIter, burnIn, thin
(integer) the number of iterations, burn-in and thinning.
saveAt
(string) this may include a path and a prefix that will be added to the names of the files that are saved as the program runs.
S0, df0
(numeric) The scale and degrees-of-freedom parameters for the scaled inverse-chi squared prior assigned to the residual variance, only used with Gaussian outcomes. In the parameterization of the scaled inverse-chi square used in BGLR the expected value is S0/(df0-2). The default value of df0 is 5. If S0 is NULL, the scale is set internally so that the prior is consistent with the variance partition implied by R2 (see below).
R2
(numeric, $0 < R2 < 1$) The proportion of variance that one expects, a priori, to be explained by the regression. Only used if the hyper-parameters are not specified; in that case, internally, hyper-parameters are set so that the prior modes are consistent with the variance partition specified by R2.
verbose
(logical) if TRUE the iteration history is printed, default TRUE.
rmExistingFiles
(logical) if TRUE removes existing output files from previous runs, default TRUE.
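
As an illustration of the two-level ETA list, each inner list supplies the data for one term (X for an incidence or covariate matrix, K for a covariance matrix) together with the prior in model (e.g. "FIXED", "BRR", "BL", "BayesA", "BayesB", "BayesC" or "RKHS"). The sketch below uses simulated placeholder matrices W and Z that are not part of the package.

library(BGLR)

n <- 100
W <- matrix(rnorm(n * 3), ncol = 3)              # simulated covariates: flat (fixed-effect) prior
Z <- matrix(sample(0:2, n * 50, replace = TRUE), # simulated marker codes: Bayesian LASSO prior
            ncol = 50)
y <- rnorm(n)

# Each element of ETA is itself a list: the data for the term plus its prior
ETA <- list(
  list(X = W, model = "FIXED"),
  list(X = Z, model = "BL")
)

fm <- BGLR(y = y, ETA = ETA, nIter = 1500, burnIn = 500, verbose = FALSE)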

Value

  • A list with estimated posterior means, estimated posterior standard deviations, and the parameters used to fit the model. See the vignettes in the package (or http://genomics.cimmyt.org/BGLR-extdoc.pdf) for further details.

Details

BGLR implements a Gibbs sampler for a Bayesian regression model. The linear predictor (or regression function) includes an intercept (introduced by default) plus a number of user-specified regression components (X) and random effects (u), that is:

$$\eta = 1\mu + X_1 \beta_1 + \dots + X_p \beta_p + u_1 + \dots + u_q$$

The components of the linear predictor are specified in the argument ETA (see above). The user can specify as many linear terms as desired, and for each component the user can choose the prior density to be assigned. The distribution of the response is modeled as a function of the linear predictor.

For Gaussian outcomes, the linear predictor is the conditional expectation, and censoring is allowed. For censored data points the actual response value ($y_i$) is missing, and the entries of the vectors a and b (see above) give the lower and upper bound for $y_i$. The following table shows the configuration of the triplet (y, a, b) for un-censored, right-censored, left-censored and interval-censored observations.

                       a           y       b
  Un-censored          NULL        $y_i$   NULL
  Right censored       $a_i$       NA      $\infty$
  Left censored        $-\infty$   NA      $b_i$
  Interval censored    $a_i$       NA      $b_i$

Internally, censoring is dealt with as a missing data problem.

Ordinal outcomes are modelled using the probit link, implemented via data augmentation. In this case the linear predictor becomes the mean of the underlying liability variable, which is normal with mean equal to the linear predictor and variance equal to one. In the case of only two classes (binary outcome) the threshold is set equal to zero; for more than two classes, thresholds are estimated from the data. Further details about this approach can be found in Albert and Chib (1993).
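
As a sketch of the censoring setup described above (simulated data and an arbitrary censoring threshold chosen only for illustration), right-censored records enter with y = NA, a equal to the known lower bound and b = Inf; the bounds of non-censored records are not used and can be left as NA.

library(BGLR)

set.seed(1)
n <- 200
x <- rnorm(n)
yStar <- 1 + 0.5 * x + rnorm(n)   # latent (uncensored) outcome, simulated for illustration
cen <- yStar > 1.5                # records right-censored at an arbitrary threshold of 1.5

y <- yStar; y[cen] <- NA          # censored records are missing in y
a <- rep(NA, n); a[cen] <- 1.5    # lower bounds for censored records
b <- rep(NA, n); b[cen] <- Inf    # upper bounds (Inf = right censoring)

fm <- BGLR(y = y, a = a, b = b,
           ETA = list(list(X = cbind(x), model = "FIXED")),
           nIter = 1500, burnIn = 500, verbose = FALSE)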

References

Albert, J. and S. Chib. 1993. Bayesian Analysis of Binary and Polychotomous Response Data. JASA, 88: 669-679.

de los Campos, G., H. Naya, D. Gianola, J. Crossa, A. Legarra, E. Manfredi, K. Weigel and J. Cotes. 2009. Predicting Quantitative Traits with Regression Models for Dense Molecular Markers and Pedigree. Genetics, 182: 375-385.

de los Campos, G., D. Gianola, G. J. M. Rosa, K. A. Weigel and J. Crossa. 2010. Semi-parametric genomic-enabled prediction of genetic values using reproducing kernel Hilbert spaces methods. Genetics Research, 92: 295-308.

Park, T. and G. Casella. 2008. The Bayesian LASSO. Journal of the American Statistical Association, 103: 681-686.

Spiegelhalter, D. J., N. G. Best, B. P. Carlin and A. van der Linde. 2002. Bayesian measures of model complexity and fit (with discussion). Journal of the Royal Statistical Society, Series B (Statistical Methodology), 64 (4): 583-639.

Examples

#Demos
library(BGLR)

#BayesA
demo(BA)

#BayesB
demo(BB)

#Bayesian LASSO
demo(BL)

#Bayesian Ridge Regression
demo(BRR)

#BayesCpi
demo(BayesCpi)

#RKHS
demo(RKHS)

#Binary traits
demo(Bernoulli)

#Ordinal traits
demo(ordinal)

#Censored traits
demo(censored)
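
#Beyond the demos, a minimal end-to-end run might look like the sketch below.
#It assumes the wheat dataset shipped with BGLR (marker matrix wheat.X,
#phenotype matrix wheat.Y); the short chain is only meant to keep the example
#quick, not to give converged estimates, and the extracted elements (yHat, varE)
#are taken from the fitted object (see the vignettes for the full list).

data(wheat)                              # loads wheat.X (markers) and wheat.Y (phenotypes)
y <- wheat.Y[, 1]                        # first trait
X <- scale(wheat.X, center = TRUE, scale = FALSE)   # centered markers

# Bayesian ridge regression on all markers
fm <- BGLR(y = y, ETA = list(list(X = X, model = "BRR")),
           nIter = 2000, burnIn = 500, verbose = FALSE)

head(fm$yHat)                            # posterior mean of the linear predictor
fm$varE                                  # posterior mean of the residual variance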
