brnn (version 0.8)

brnn_ordinal: brnn_ordinal

Description

The brnn_ordinal function fits a Bayesian Regularized Neural Network for Ordinal data.

Usage

brnn_ordinal(x, …)
  
  # S3 method for formula
brnn_ordinal(formula, data, contrasts=NULL,…)
  
  # S3 method for default
brnn_ordinal(x,
               y,
               neurons=2,
               normalize=TRUE,
               epochs=1000,
               mu=0.005,
               mu_dec=0.1,
               mu_inc=10,
               mu_max=1e10,
               min_grad=1e-10,
               change_F=0.01,
               change_par=0.01,
               iter_EM=1000,
               verbose=FALSE,
               …)

Arguments

formula

A formula of the form y ~ x1 + x2 + …

data

Data frame from which variables specified in formula are preferentially to be taken.

x

(numeric, \(n \times p\)) incidence matrix.

y

(numeric, \(n\)) the response data-vector (NAs not allowed).

neurons

positive integer that indicates the number of neurons.

normalize

logical, if TRUE will normalize inputs and output, the default value is TRUE.

epochs

positive integer, maximum number of epochs(iterations) to train, default 1000.

mu

positive number that controls the behaviour of the Gauss-Newton optimization algorithm, default value 0.005.

mu_dec

positive number, is the mu decrease ratio, default value 0.1.

mu_inc

positive number, is the mu increase ratio, default value 10.

mu_max

maximum mu before training is stopped, strict positive number, default value \(1\times 10^{10}\).

min_grad

minimum gradient.

change_F

the program will stop if the maximum (in absolute value) of the differences of the F function in 3 consecutive iterations is less than this quantity.

change_par

the program will stop iterations of the EM algorithm when the maximum of absolute values of differences between parameters in two consecutive iterations ins less than this quantity.

iter_EM

positive integer, maximum number of iteration for the EM algorithm.

verbose

logical, if TRUE will print iteration history.

contrasts

an optional list of contrasts to be used for some or all of the factors appearing as variables in the model formula.

arguments passed to or from other methods.

Value

object of class "brnn_ordinal". Mostly internal structure, but it is a list containing:

$theta

A list containing weights and biases. The first \(s\) components of the list contains vectors with the estimated parameters for the \(k\)-th neuron, i.e. \((w_k, b_k, \beta_1^{[k]},...,\beta_p^{[k]})'\).

$threshold

A vector with estimates of thresholds.

$alpha

\(\alpha\) parameter.

$gamma

effective number of parameters.

Details

The software fits a Bayesian Regularized Neural Network for Ordinal data. The model is an extension of the two layer network as described in MacKay (1992); Foresee and Hagan (1997), and Gianola et al. (2011). We use the latent variable approach described in Albert and Chib (1993) to model ordinal data, the Expectation maximization (EM) and Levenberg-Marquardt algorithm (Levenberg, 1944; Marquardt, 1963) to fit the model.

Following Albert and Chib (1993), suppose that \(Y_1,...,Y_n\) are observed and \(Y_i\) can take values on \(L\) ordered values. We are interested in modelling the probability \(p_{ij}=P(Y_i=j)\) using the covariates \(x_{i1},...,x_{ip}\). Let

\(g(\boldsymbol{x}_i)= \sum_{k=1}^s w_k g_k (b_k + \sum_{j=1}^p x_{ij} \beta_j^{[k]})\),

where:

  • \(s\) is the number of neurons.

  • \(w_k\) is the weight of the \(k\)-th neuron, \(k=1,...,s\).

  • \(b_k\) is a bias for the \(k\)-th neuron, \(k=1,...,s\).

  • \(\beta_j^{[k]}\) is the weight of the \(j\)-th input to the net, \(j=1,...,p\).

  • \(g_k(\cdot)\) is the activation function, in this implementation \(g_k(x)=\frac{\exp(2x)-1}{\exp(2x)+1}\).

Let

\(Z_i=g(\boldsymbol{x}_i)+e_i\),

where:

  • \(e_i \sim N(0,1)\).

  • \(Z_i\) is an unobserved (latent variable).

The output from the model for latent variable is related to observed data using the approach employed in the probit and logit ordered models, that is \(Y_i=j\) if \(\lambda_{j-1}<Z_i<\lambda_{j}\), where \(\lambda_j\) are a set of unknown thresholds. We assign prior distributions to all unknown quantities (see Albert and Chib, 1993; Gianola et al., 2011) for further details. The Expectation maximization (EM) and Levenberg-Marquardt algorithm (Levenberg, 1944; Marquardt, 1963) to fit the model.

References

Albert J, and S. Chib. 1993. Bayesian Analysis of Binary and Polychotomus Response Data. JASA, 88, 669-679.

Foresee, F. D., and M. T. Hagan. 1997. "Gauss-Newton approximation to Bayesian regularization", Proceedings of the 1997 International Joint Conference on Neural Networks.

Gianola, D. Okut, H., Weigel, K. and Rosa, G. 2011. "Predicting complex quantitative traits with Bayesian neural networks: a case study with Jersey cows and wheat". BMC Genetics, 12,87.

Levenberg, K. 1944. "A method for the solution of certain problems in least squares", Quart. Applied Math., 2, 164-168.

MacKay, D. J. C. 1992. "Bayesian interpolation", Neural Computation, 4(3), 415-447.

Marquardt, D. W. 1963. "An algorithm for least-squares estimation of non-linear parameters". SIAM Journal on Applied Mathematics, 11(2), 431-441.

See Also

predict.brnn_ordinal

Examples

Run this code
# NOT RUN {
# }
# NOT RUN {
#Load the library
library(brnn)

#Load the dataset
data(GLS)

#Subset of data for location Harare
HarareOrd=subset(phenoOrd,Loc=="Harare")

#Eigen value decomposition for GOrdm keep those 
#eigen vectors whose corresponding eigen-vectors are bigger than 1e-10
#and then compute principal components

evd=eigen(GOrd)
evd$vectors=evd$vectors[,evd$value>1e-10]
evd$values=evd$values[evd$values>1e-10]
PC=evd$vectors<!-- %*%sqrt(diag(evd$values)) -->
rownames(PC)=rownames(GOrd)

#Response variable
y=phenoOrd$rating
gid=as.character(phenoOrd$Stock)

Z=model.matrix(~gid-1)
colnames(Z)=gsub("gid","",colnames(Z))

if(any(colnames(Z)!=rownames(PC))) stop("Ordering problem\n")

#Matrix of predictors for Neural net
X=Z%*%PC

#Cross-validation
set.seed(1)
testing=sample(1:length(y),size=as.integer(0.10*length(y)),replace=FALSE)
isNa=(1:length(y)%in%testing)
yTrain=y[!isNa]
XTrain=X[!isNa,]
nTest=sum(isNa)

neurons=2
	
fmOrd=brnn_ordinal(XTrain,yTrain,neurons=neurons,verbose=FALSE)

#Predictions for testing set
XTest=X[isNa,]
predictions=predict(fmOrd,XTest)
predictions


# }

Run the code above in your browser using DataCamp Workspace