Performs a batch matrix-matrix product of the matrices stored
in batch1 and batch2,
followed by a reduction step: the individual matrix products are summed
along the first (batch) dimension into a single matrix.
input is then added to that sum to give the final result.
batch1 and batch2 must be 3-D tensors each containing the
same number of matrices.
If batch1 is a \((b \times n \times m)\) tensor and batch2 is a
\((b \times m \times p)\) tensor, then input must be
broadcastable with an \((n \times p)\) tensor
and out will be an \((n \times p)\) tensor.
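The shape rules above can be checked with a short sketch. It assumes PyTorch is installed and calls torch.addbmm with its default beta and alpha; the sizes and the variable name inp are arbitrary choices made for illustration only.

```python
import torch

# Illustrative sizes (assumed, not prescribed by the documentation).
b, n, m, p = 10, 3, 4, 5

batch1 = torch.randn(b, n, m)  # b matrices of shape (n x m)
batch2 = torch.randn(b, m, p)  # b matrices of shape (m x p)

# input only has to be broadcastable with (n x p); a single row is enough.
inp = torch.randn(1, p)

out = torch.addbmm(inp, batch1, batch2)
print(out.shape)
```

The batch dimension b does not appear in the result, because the b products are summed into one \((n \times p)\) matrix.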
$$
    out = \beta\ \mbox{input} + \alpha\ (\sum_{i=0}^{b-1} \mbox{batch1}_i \mathbin{@} \mbox{batch2}_i)
$$
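To make the formula concrete, here is a minimal pure-Python sketch that evaluates it on nested lists. The helpers matmul and addbmm_reference are hypothetical names introduced for this illustration and are not part of any library; the sketch also assumes input already has the full \((n \times p)\) shape, so it does not model broadcasting.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def addbmm_reference(inp, batch1, batch2, beta=1, alpha=1):
    """Evaluate beta * inp + alpha * sum_i (batch1[i] @ batch2[i])."""
    n = len(batch1[0])
    p = len(batch2[0][0])
    # Accumulate the per-batch products into a single (n x p) matrix.
    acc = [[0] * p for _ in range(n)]
    for m1, m2 in zip(batch1, batch2):
        prod = matmul(m1, m2)
        for r in range(n):
            for c in range(p):
                acc[r][c] += prod[r][c]
    # Scale the accumulated sum by alpha and the input by beta, then add.
    return [
        [beta * inp[r][c] + alpha * acc[r][c] for c in range(p)]
        for r in range(n)
    ]


# b = 2 batches of (2 x 2) matrices, so n = m = p = 2.
batch1 = [[[1, 2], [3, 4]], [[1, 0], [0, 1]]]
batch2 = [[[1, 0], [0, 1]], [[5, 6], [7, 8]]]
inp = [[1, 1], [1, 1]]

result = addbmm_reference(inp, batch1, batch2, beta=2, alpha=3)
print(result)  # [[20, 26], [32, 38]]
```

Here the two products are [[1, 2], [3, 4]] and [[5, 6], [7, 8]], their sum is [[6, 8], [10, 12]], and scaling by alpha = 3 and adding beta = 2 times input gives the printed result.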
For inputs of type FloatTensor or DoubleTensor, the arguments beta and alpha
must be real numbers; otherwise, they should be integers.