Answer (1 of 6): Imagine a network as a sequence of "layers", each of the form x_{n+1} = f(x_n), where f is a linear transformation followed by a non-linearity such as sigmoid, tanh, or ReLU. The layers operate on 3-D chunks of data, where the first two dimensions are (generally)...
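
To make the x_{n+1} = f(x_n) picture concrete, here is a rough sketch in plain Python. It assumes flat vectors rather than the 3-D chunks mentioned above, and the helper names (make_layer, run_network) and the toy weights are made up purely for illustration; they do not come from any particular library.

```python
import math


def relu(v):
    # Element-wise ReLU: negative entries become 0.
    return [max(0.0, a) for a in v]


def tanh(v):
    # Element-wise tanh.
    return [math.tanh(a) for a in v]


def make_layer(weights, bias, nonlinearity):
    """Build f(x) = nonlinearity(W x + b) for fixed W (list of rows) and b."""
    def f(x):
        # Linear part: one dot product plus bias per output unit.
        linear = [sum(w * a for w, a in zip(row, x)) + b
                  for row, b in zip(weights, bias)]
        # Non-linear part, applied element-wise.
        return nonlinearity(linear)
    return f


def run_network(layers, x0):
    """Apply x_{n+1} = f_n(x_n) for each layer in order."""
    x = x0
    for f in layers:
        x = f(x)
    return x


# Hypothetical toy network: 2 inputs -> 2 hidden units (ReLU) -> 1 output (tanh).
layers = [
    make_layer([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0], relu),
    make_layer([[2.0, 1.0]], [0.0], tanh),
]

output = run_network(layers, [3.0, 1.0])
print(output)
```

Each call to make_layer fixes one linear map and one non-linearity, so the whole network is just function composition. Swapping relu for tanh (or a sigmoid) changes only the last step of f.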