Dynamics of learning and forgetting in simple neural networks
Artificial neural networks are the foundation of modern artificial intelligence, capable of performing at superhuman levels in tasks ranging from facial recognition to games such as Go. While much is known empirically about how such networks iteratively learn to perform tasks, a complete mathematical characterization of this learning has been more elusive. In this talk, I will develop such a mathematical description for the building block of artificial neural networks, the perceptron: a simple model of a biological neuron that mathematically corresponds to a nonlinear function composed with an affine transformation. Training a neural network to perform a task corresponds to updating the parameters of this affine transformation (“learning the weights”) so as to produce a desired output. Modeling learning as a stochastic process, I will derive dynamical equations describing the evolution of the weights over time. These equations make it possible to address how the time course of learning depends on the structure of the task, the algorithm used to implement learning, and the sequential learning of subsequent tasks that may overwrite what was learned previously.
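
To make the objects described above concrete, the following minimal Python/NumPy sketch implements a perceptron (a sigmoid nonlinearity composed with an affine transformation) and one pass of stochastic weight updates on a toy task. The task, loss function, and learning rate here are illustrative assumptions for the sketch, not the specific setting analyzed in the talk.

```python
import numpy as np

# Minimal perceptron sketch: y = sigmoid(w . x + b), trained by stochastic
# gradient descent on squared error. Task, loss, and learning rate are
# illustrative choices, not the specific setting analyzed in the talk.

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy task: classify points by the sign of the sum of their coordinates.
X = rng.normal(size=(500, 2))
y = (X.sum(axis=1) > 0).astype(float)

w = rng.normal(size=2) * 0.1   # weights of the affine transformation
b = 0.0                        # bias of the affine transformation
eta = 0.1                      # learning rate

# One pass of stochastic (online) learning: each example nudges the weights,
# so the weight trajectory is itself a stochastic process over training time.
for x_i, y_i in zip(X, y):
    p = sigmoid(w @ x_i + b)           # perceptron output
    grad = (p - y_i) * p * (1 - p)     # d(loss)/d(pre-activation) for squared error
    w -= eta * grad * x_i              # "learning the weights"
    b -= eta * grad

accuracy = np.mean((sigmoid(X @ w + b) > 0.5) == (y > 0.5))
print(f"training accuracy after one pass: {accuracy:.2f}")
```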