Nonlinear Computation in Deep Linear Networks
We’ve shown that deep linear networks — as implemented using floating-point arithmetic — are not actually linear and can perform nonlinear computation. We used evolution strategies to find parameters in linear networks that exploit this trait, letting us solve non-trivial problems.
Neural networks consist of stacks of a linear layer followed by