By Jonathan Frankle (MIT CSAIL), David J. Schwab (CUNY ITS), and Ari S. Morcos (Facebook AI Research)

Paper link

Many important aspects of neural network learning take place within the very earliest iterations or epochs of training. For example:

- Sparse trainable sub-networks emerge
- Gradient descent moves into a small subspace
- The network undergoes a critical period

Researchers examine the changes that deep neural networks undergo during this early phase of training.

Over the past decade, methods for successfully training big, deep neural networks have revolutionized machine learning. Yet, despite their remarkable empirical performance, the underlying reasons for the success of these approaches remain poorly understood. A large body of work has focused on understanding what happens during the later stages of training, while the initial phase has been less explored. Research is built ...