SEMdeep (version 0.1.0)

SEMdnn: Layer-wise SEM training with a Deep Neural Network (DNN)

Description

The function builds the topological layer (TL) ordering of the input graph to fit a series of Deep Neural Network (DNN) models, where the nodes in one layer act as response (output) variables y and the nodes in the successive layers act as predictor (input) variables x. Each fit uses the dnn function of the cito R package, based on the deep learning framework 'torch'.

The torch package is native to R, so it is computationally efficient and its installation is very simple: there is no need to install Python or any other API, and DNNs can be trained on CPU, GPU (CUDA) and macOS GPUs (MPS). To install torch, please follow these steps:

install.packages("torch")

library(torch)

install_torch(reinstall = TRUE)

To set up a GPU, or if you have problems installing the torch package, check out the installation help from the torch developers.
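
Once torch is installed, the setup can be checked before training. A minimal sketch, assuming the torch helpers torch_is_installed(), cuda_is_available() and backends_mps_is_available() are available in your torch version:

library(torch)
torch_is_installed()          # TRUE if the backend libraries are available
cuda_is_available()           # TRUE if an NVIDIA GPU can be used (device = "cuda")
backends_mps_is_available()   # TRUE if an Apple Silicon GPU can be used (device = "mps")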

Usage

SEMdnn(
  graph,
  data,
  train = NULL,
  cowt = FALSE,
  thr = NULL,
  loss = "mse",
  hidden = c(10L, 10L, 10L),
  link = "relu",
  validation = 0,
  bias = TRUE,
  lambda = 0,
  alpha = 0.5,
  dropout = 0,
  optimizer = "adam",
  lr = 0.01,
  epochs = 100,
  device = "cpu",
  early_stopping = FALSE,
  verbose = TRUE,
  ...
)

Value

An S3 object of class "DNN" is returned. It is a list of 5 objects (a short access sketch follows the list):

  1. "fit", a list of DNN model objects, including: the estimated covariance matrix (Sigma), the estimated model errors (Psi), the fitting indices (fitIdx), and the estimated connection weights (parameterEstimates), if cowt=TRUE.

  2. "Yhat", the matrix of continuous predicted values of graph nodes (excluding source nodes) based on training samples.

  3. "model", a list of all j=1,...,(L-1) fitted MLP network models.

  4. "graph", the induced DAG of the input graph mapped on data variables. If cowt=TRUE, the DAG is colored based on the estimated connection weights, if abs(W) > thr and W < 0, the edge is inhibited and it is highlighted in blue; otherwise, if abs(W) > thr and W > 0, the edge is activated and it is highlighted in red.

  5. "data", standardized training data subset mapping graph nodes.

Arguments

graph

An igraph object.

data

A matrix with rows corresponding to subjects, and columns to graph nodes (variables).

train

A numeric vector specifying the row indices corresponding to the train dataset.

cowt

A logical value. If cowt = TRUE, the connection weights of the input nodes (variables) are computed (default = FALSE).

thr

A numerical value indicating the threshold to apply on the absolute values of the connection matrix to color the graph (default = NULL).

loss

A character value specifying the loss function for which the network should be optimized. The user can specify: (a) "mse" (mean squared error), "mae" (mean absolute error), or "gaussian" (normal likelihood), for regression problems; (b) "poisson" (Poisson likelihood), or "nbinom" (negative binomial likelihood) for regression with count data; (c) "binomial" (binomial likelihood) for binary classification problems; (d) "softmax" or "cross-entropy" for multi-class classification (default = "mse").

hidden

A numeric vector giving the number of hidden units in each layer; the number of hidden layers corresponds to the length of this vector. As a default, hidden = c(10L, 10L, 10L).

link

A character value describing the activation function to use, which can either be of length one (applied to all hidden layers) or a vector with one activation function per hidden layer (default = "relu").
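
For example, a hedged sketch of per-layer activations (hypothetical values; the length of link then matches the length of hidden):

# three hidden layers, one activation function each
# SEMdnn(graph, data, hidden = c(20L, 10L, 5L), link = c("selu", "relu", "relu"), ...)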

validation

A numerical value indicating the proportion of the data set that should be used as a validation set (randomly selected, default = 0).

bias

A logical value or vector indicating whether to employ biases in the layers; it can either be of length one or a vector of logicals for each layer (number of hidden layers + 1, i.e. including the final layer) (default = TRUE).

lambda

A numerical value indicating the strength of regularization: lambda penalty, \(\lambda * (L1 + L2)\) (default = 0).

alpha

A numerical value mixing the L1/L2 regularization added to the training. For each layer the penalty is set to \((1 - \alpha) * ||weights||_1 + \alpha * ||weights||_2^2\). It must fall between 0 and 1 (default = 0.5).
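
A minimal numeric sketch of the combined penalty, assuming lambda scales the elastic-net mixture defined above (w is a hypothetical weight vector of one layer):

w <- c(0.5, -1.2, 0.3)    # hypothetical layer weights
lambda <- 0.01; alpha <- 0.5
lambda * ((1 - alpha) * sum(abs(w)) + alpha * sum(w^2))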

dropout

A numerical value for the dropout rate, which is the probability that a node will be excluded from training (default = 0).

optimizer

A character value indicating the optimizer to use for training the network. The user can specify: "adam" (ADAM algorithm), "adagrad" (adaptive gradient algorithm), "rmsprop" (root mean squared propagation), "rprop" (resilient backpropagation), or "sgd" (stochastic gradient descent). As a default, optimizer = "adam".

lr

A numerical value indicating the learning rate given to the optimizer (default = 0.01).

epochs

A numerical value indicating the number of epochs for which training is conducted (default = 100).

device

A character value describing the CPU/GPU device ("cpu", "cuda", "mps") on which the neural network should be trained (default = "cpu").

early_stopping

If set to an integer, training will stop if the loss increases over that number of consecutive epochs; the validation loss is used when available. Default is FALSE (no early stopping).
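
For instance, a hedged sketch combining a validation split with early stopping (argument values are illustrative only):

# stop training if the (validation) loss increases over 10 consecutive epochs
# dnn <- SEMdnn(ig, data, train, validation = 0.2, early_stopping = 10,
#               epochs = 100, device = "cpu", verbose = TRUE)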

verbose

If verbose = TRUE, the training curves of the DNN models are displayed as output, comparing the training, validation and baseline curves in terms of loss (y) against the number of epochs (x) (default = TRUE).

...

Currently ignored.

Author

Mario Grassi mario.grassi@unipv.it

Details

By mapping data onto the input graph, SEMdnn() creates a set of DNN models based on the topological layer (j=1,...,L) structure of the input graph. In each iteration, the response (output) variables y are the nodes in layer j=1,...,(L-1), and the predictor (input) variables x are the nodes belonging to the successive layers (j+1),...,L. Each DNN model is a Multilayer Perceptron (MLP) network, in which every neuron of one layer is connected to every neuron of the adjacent layers. The value of each neuron is computed as a weighted sum of its inputs from the preceding layer, followed by an activation function; this value is then used as input for the neurons in the next layer, until the output layer is reached.
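
A minimal numeric sketch of one MLP layer (weighted sum of the inputs followed by an activation; the names and values are illustrative, not the internal cito implementation):

set.seed(1)
x <- rnorm(3)                    # inputs coming from the previous layer
W <- matrix(rnorm(2 * 3), 2, 3)  # connection weights (2 neurons x 3 inputs)
b <- rnorm(2)                    # biases
z <- W %*% x + b                 # weighted sum for each neuron
a <- pmax(z, 0)                  # ReLU activation; 'a' feeds the next layer
a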

References

Amesöder, C., Hartig, F. and Pichler, M. (2024). 'cito': an R package for training neural networks using 'torch'. Ecography, 2024: e07143 <https://doi.org/10.1111/ecog.07143>

Grassi M, Palluzzi F, Tarantino B (2022). SEMgraph: An R Package for Causal Network Analysis of High-Throughput Data with Structural Equation Models. Bioinformatics, 38 (20), 4829–4830 <https://doi.org/10.1093/bioinformatics/btac567>

Examples

# \donttest{
if (torch::torch_is_installed()){

# load ALS data
ig<- alsData$graph
data<- alsData$exprs
data<- transformData(data)$data
group<- alsData$group

#...with train-test (0.5-0.5) samples
set.seed(123)
train<- sample(1:nrow(data), 0.5*nrow(data))

start<- Sys.time()
dnn0 <- SEMdnn(ig, data, train, cowt = TRUE, thr = NULL,
			#loss = "mse", hidden = 5*K, link = "selu",
			loss = "mse", hidden = c(10, 10, 10), link = "selu",
			validation = 0, bias = TRUE, lr = 0.01,
			epochs = 32, device = "cpu", verbose = TRUE)
end<- Sys.time()
print(end-start)

#str(dnn0, max.level=2)
dnn0$fit$fitIdx
dnn0$fit$parameterEstimates
gplot(dnn0$graph)
table(E(dnn0$graph)$color)

#...with a binary outcome (1=case, 0=control)

ig1<- mapGraph(ig, type="outcome"); gplot(ig1)
outcome<- ifelse(group == 0, -1, 1); table(outcome)
data1<- cbind(outcome, data); data1[1:5,1:5]

start<- Sys.time()
dnn1 <- SEMdnn(ig1, data1, train, cowt = TRUE, thr = NULL,
			#loss = "mse", hidden = 5*K, link = "selu",
			loss = "mse", hidden = c(10, 10, 10), link = "selu",
			validation = 0, bias = TRUE, lr = 0.01,
			epochs = 32, device = "cpu", verbose = TRUE)
end<- Sys.time()
print(end-start)

#str(dnn1, max.level=2)
dnn1$fit$fitIdx
dnn1$fit$parameterEstimates
gplot(dnn1$graph) 
table(E(dnn1$graph)$color)

#...with input -> hidden structure -> output :
# source nodes -> graph layer structure -> sink nodes

#Topological layer (TL) ordering
K<- c(12,  5,  3,  2,  1,  8)
K<- rev(K[-c(1,length(K))]);K

ig2<- mapGraph(ig, type="source"); gplot(ig2)

start<- Sys.time()
dnn2 <- SEMdnn(ig2, data, train, cowt = TRUE, thr = NULL,
			loss = "mse", hidden = 5*K, link = "selu",
			#loss = "mse", hidden = c(10, 10, 10), link = "selu",
			validation = 0, bias = TRUE, lr = 0.01,
			epochs = 32, device = "cpu", verbose = TRUE)
end<- Sys.time()
print(end-start)

#Visualization of the neural network structure
nplot(dnn2$model[[1]], bias=FALSE)

#str(dnn2, max.level=2)
dnn2$fit$fitIdx
mean(dnn2$fit$Psi)
dnn2$fit$parameterEstimates
gplot(dnn2$graph)
table(E(dnn2$graph)$color)
}
# }
