A function that stacks neural networks.
stk(nu, mu)

nu %stk% mu
A stacked neural network of \(\nu\) and \(\mu\), i.e. \(\nu \boxminus \mu\)
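For intuition, stacking two equal-depth networks places the corresponding weight matrices block-diagonally and concatenates the biases, layer by layer. Below is a minimal, self-contained sketch of this for a single affine layer; the helper name stack_layer and the list fields W and b are illustrative only and are not this package's internal representation of a network.

# Stack one affine layer (W1, b1) with another (W2, b2): the result applies
# both maps to the concatenated input and concatenates their outputs.
stack_layer <- function(layer1, layer2) {
  W <- rbind(
    cbind(layer1$W, matrix(0, nrow(layer1$W), ncol(layer2$W))),
    cbind(matrix(0, nrow(layer2$W), ncol(layer1$W)), layer2$W)
  )
  list(W = W, b = c(layer1$b, layer2$b))
}

layer1 <- list(W = matrix(rnorm(6), 2, 3), b = rnorm(2))  # maps R^3 -> R^2
layer2 <- list(W = matrix(rnorm(4), 4, 1), b = rnorm(4))  # maps R^1 -> R^4
stack_layer(layer1, layer2)                               # maps R^4 -> R^6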
NOTE: This is different from the operation given in Grohs et al. (2023). Both approaches pad the shallower network when parallelizing networks of unequal depth, but our padding is via the Tun network, built by repeated composition of the \(\mathsf{Id_1}\) network, whereas Grohs et al. use repeated composition of the \(\mathfrak{i}\) network. See Id and comp.
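The \(\mathsf{Id_1}\) network realizes the exact identity under ReLU, which is why repeatedly composing it can deepen a shallower network without changing the function it computes. A minimal, self-contained sketch of that fact follows; the names relu, id1, and realize are illustrative assumptions, not this package's Id, Tun, or instantiation functions.

# ReLU realization of Id_1: relu(x) - relu(-x) = x for every real x.
relu <- function(x) pmax(x, 0)
id1 <- list(
  list(W = matrix(c(1, -1), nrow = 2), b = c(0, 0)),  # x |-> (x, -x)
  list(W = matrix(c(1, -1), nrow = 1), b = 0)         # (a, b) |-> a - b
)
# Realize a network (list of affine layers) at a point, ReLU between layers.
realize <- function(net, x) {
  h <- x
  for (i in seq_along(net)) {
    z <- net[[i]]$W %*% h + net[[i]]$b
    h <- if (i < length(net)) relu(z) else z
  }
  drop(h)
}
realize(id1, 3.7)  # returns 3.7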
NOTE: The terminology also differs from Grohs et al. (2023): we say "stacking" for what they call "parallelization". This change was motivated by the fact that "parallelization" suggests commutativity, whereas this operation is not quite commutative; it is commutative only up to transposition of the input x under instantiation with a continuous activation function. The word "parallelization" also carries baggage in the context of artificial neural networks, where it often refers to many CPUs working together.
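A hedged illustration of the ordering caveat, using only create_nn and stk from this page (the comments restate the claim above rather than verify it):

nu <- create_nn(c(2, 3, 1))  # takes inputs in R^2
mu <- create_nn(c(3, 4, 1))  # takes inputs in R^3
stk(nu, mu)  # expects inputs ordered as (x in R^2, y in R^3)
stk(mu, nu)  # expects inputs ordered as (y in R^3, x in R^2)
# These are different networks; under instantiation with a continuous
# activation they agree only after transposing (x, y) to (y, x) and
# swapping the two output blocks.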
Remark: We use a single symbol, "stk", for stacking neural networks of both equal and unequal depth. This is partly for usability and partly because, for all practical purposes, only the general case of stacking networks of different sizes is needed.
Remark: We have two versions, a prefix and an infix version.
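For example, the two call forms below should build the same stacked network, assuming %stk% is simply the infix form of stk as the usage above indicates:

nu <- create_nn(c(4, 5, 6))
mu <- create_nn(c(6, 7))
identical(stk(nu, mu), nu %stk% mu)  # expected TRUE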
The arguments nu and mu are each a neural network.

This operation on neural networks, called "parallelization", is found in:
Grohs, P., Hornung, F., Jentzen, A. et al. Space-time error estimates for deep neural network approximations for differential equations. (2023). https://arxiv.org/abs/1908.03833
And especially in:
Definition 2.14 in Rafi, S., Padgett, J.L., Nakarmi, U. (2024). Towards an Algebraic Framework For Approximating Functions Using Neural Network Polynomials. https://arxiv.org/abs/2402.01058
# Stack a network with layer widths (4, 5, 6) with one of widths (6, 7)
create_nn(c(4, 5, 6)) |> stk(create_nn(c(6, 7)))

# Stacking also works when the two networks have different depths
create_nn(c(9, 1, 67)) |> stk(create_nn(c(4, 4, 4, 4, 4)))