
brnn (version 0.1)

brnn.extended: brnn.extended

Description

The brnn.extended function fits a two-layer neural network as described in MacKay (1992) and Foresee and Hagan (1997). It uses the Nguyen and Widrow (1990) algorithm to assign initial weights and the Gauss-Newton algorithm to perform the optimization. The hidden layer contains two groups of neurons, which allows assigning different prior distributions to the weights associated with two groups of input variables.

Usage

brnn.extended(y,X1,X2,neurons1,neurons2,epochs=1000,mu=0.005,mu_dec=0.1,
              mu_inc=10,mu_max=1e10,min_grad=1e-10,change=0.001,
              cores=1,verbose=TRUE)
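
A minimal call on simulated data might look like the sketch below; the data, seed and settings are illustrative only (the Jersey example in the Examples section shows a realistic use).

#Sketch: minimal call on simulated data (illustrative values only)
library(brnn)
set.seed(123)
n=100
X1=matrix(rnorm(n*5),ncol=5)   #first group of input variables (p=5)
X2=matrix(rnorm(n*3),ncol=3)   #second group of input variables (q=3)
y=rowSums(X1)+0.5*rowSums(X2)+rnorm(n,sd=0.1)
out=brnn.extended(y=y,X1=X1,X2=X2,neurons1=2,neurons2=2,
                  epochs=100,verbose=FALSE)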

Arguments

Value

A list containing:

  • $theta1: A list containing weights and biases. The first $s_1$ components of the list contain vectors with the estimated parameters for the $k$-th neuron, i.e. $(w_k^1, b_k^1, \beta_1^{1[k]},...,\beta_p^{1[k]})'$. $s_1$ corresponds to neurons1 in the argument list.
  • $theta2: A list containing weights and biases. The first $s_2$ components of the list contain vectors with the estimated parameters for the $k$-th neuron, i.e. $(w_k^2, b_k^2, \beta_1^{2[k]},...,\beta_q^{2[k]})'$. $s_2$ corresponds to neurons2 in the argument list.
  • $c_a: An estimate of $c_a$.
  • $c_d: An estimate of $c_d$.
  • $message: A string that indicates the stopping criterion for the training process.
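
The sketch below shows one way to inspect these components; it assumes out is an object returned by brnn.extended with neurons1=2 and neurons2=2, and the comments reference objects from the Examples section.

#Sketch: inspecting the returned list (assumes 'out' from a call with neurons1=2, neurons2=2)
out$message               #stopping criterion reported for the training process
out$c_a; out$c_d          #estimated scaling coefficients for the two groups of neurons
out$theta1[[1]]           #(w_1^1, b_1^1, beta_1^1[1],...,beta_p^1[1])' for the first neuron
length(out$theta1[[1]])   #2 + ncol(X1): one weight, one bias, p input coefficients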

Details

The software fits a two-layer network as described in MacKay (1992) and Foresee and Hagan (1997). The model is given by:

$$y_i=c_a \sum_{k=1}^{s_1} w_k^{1} g_k (b_k^{1} + \sum_{j=1}^p x1_{ij} \beta_j^{1[k]}) + c_d \sum_{k=1}^{s_2} w_k^{2} g_k (b_k^{2} + \sum_{j=1}^q x2_{ij} \beta_j^{2[k]}) + e_i, \quad i=1,...,n$$

where:

  • $e_i \sim N(0,\sigma_e^2)$.
  • $g_k(\cdot)$ is the activation function; in this implementation $g_k(x)=\frac{\exp(x)-\exp(-x)}{\exp(x)+\exp(-x)}$.

The training process minimizes the penalized objective function

$$F=\beta E_D + \alpha \theta_1' \theta_1 +\delta \theta_2' \theta_2 + \alpha_c c_a^2 + \delta_c c_d^2$$

where:

  • $E_D=\sum_{i=1}^n (y_i-\hat y_i)^2$, i.e. the sum of squared errors.
  • $\beta=\frac{1}{2\sigma^2_e}$.
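
As a plain R sketch of how a single fitted value is assembled from these quantities (the names g, predict.group, x1_i, etc. below are illustrative and not part of the package):

#Sketch of the model equation in plain R (illustrative names, not package functions)
g=function(x) (exp(x)-exp(-x))/(exp(x)+exp(-x))   #hyperbolic tangent activation g_k
predict.group=function(x,theta,c){
  #c * sum_k w_k * g(b_k + sum_j x_j*beta_j^[k]), with theta[[k]]=(w_k,b_k,beta_1,...,beta_p)'
  c*sum(sapply(theta,function(par) par[1]*g(par[2]+sum(par[-(1:2)]*x))))
}
#yhat_i = predict.group(x1_i,theta1,c_a) + predict.group(x2_i,theta2,c_d)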

References

Foresee, F. D. and Hagan, M. T. 1997. "Gauss-Newton approximation to Bayesian regularization", Proceedings of the 1997 International Joint Conference on Neural Networks.

MacKay, D. J. C. 1992. "Bayesian interpolation", Neural Computation, vol. 4, no. 3, pp. 415-447.

Nguyen, D. and Widrow, B. 1990. "Improving the learning speed of 2-layer neural networks by choosing initial values of the adaptive weights", Proceedings of the IJCNN, vol. 3, pp. 21-26.

Examples

#Load the libraries 

library(brnn)

###############################################################
#Example 5: Additive + Dominant

#Number of cores to use for the computations
cores=4
  
data(Jersey)
y=normalize(pheno$yield_devMilk)
X1=normalize(G)
X2=normalize(D)
  
#Assess the predictive power of the model using the SECOND fold of a 10-fold CROSS-VALIDATION partition as the testing set
index=partitions==2
X1training=X1[!index,]
ytraining=y[!index]
X1testing=X1[index,]
ytesting=y[index]
X2training=X2[!index,]
X2testing=X2[index,]

#Fit the model to the TRAINING DATA for Additive + Dominant
out=brnn.extended(y=ytraining,X1=X1training,
                  X2=X2training,neurons1=2,neurons2=2,epochs=2000,cores=cores)
cat("Message: ",out$message,"\n")

#Plot the results
#Predicted vs observed values for the training set
par(mfrow=c(2,1))
yhat_R_training=out$c_a*predictions.nn(X1training,out$theta1,2)+
                out$c_d*predictions.nn(X2training,out$theta2,2)
plot(ytraining,yhat_R_training,xlab="y",ylab=expression(hat(y)))
cor(ytraining,yhat_R_training)
  
#Predicted vs observed values for the testing set
yhat_R_testing=out$c_a*predictions.nn(X1testing,out$theta1,2)+
               out$c_d*predictions.nn(X2testing,out$theta2,2)
plot(ytesting,yhat_R_testing,xlab="y",ylab=expression(hat(y)))
cor(ytesting,yhat_R_testing)
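
#A short extension (sketch, reusing the objects created above): report the mean
#squared error on the testing set in addition to the correlation
mse_testing=mean((ytesting-yhat_R_testing)^2)
cat("Testing MSE: ",mse_testing,"\n")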
