Data Science #34 - The deep learning original paper review, Hinton, Rumelhart & Williams (1986)
In the 34th episode, we review the 1986 paper "Learning representations by back-propagating errors", which was pivotal because it provided a clear, general framework for training neural networks with internal 'hidden' units. The core of the procedure, back-propagation, repeatedly adjusts the weights of the connections in the network to minimize a measure of the difference between the actual and desired output vectors. Crucially, this process forces the hidden units, whose desired states are not specified by the task, to develop distributed internal representations of the task domain's important features.

This ability to construct useful new features distinguishes back-propagation from earlier, simpler methods such as the perceptron-convergence procedure. The authors demonstrate its power on non-trivial problems, such as detecting mirror symmetry in an input vector and storing information about two isomorphic family trees. By showing how the network generalizes correctly from one family tree to its Italian equivalent, the paper illustrated the algorithm's ability to capture the underlying structure of the task domain.

The authors recognized that the procedure is not guaranteed to find a global minimum, because the error surface may contain local minima. Even so, the paper's clear formulation (equations 1-9) and its successful demonstration of learning complex, non-linear representations served as a powerful catalyst. It fundamentally advanced the field of connectionism, and back-propagation became the standard, foundational algorithm still used today to train multi-layered networks, or deep learning models, even though Werbos had described the idea in earlier, lesser-known work.
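For listeners who want to see the weight-adjustment procedure in code, here is a minimal sketch of back-propagation on a small version of the mirror-symmetry task. It is an illustration under stated assumptions, not the authors' implementation: it assumes logistic units, a squared-error measure, plain full-batch gradient descent, 4-bit inputs and 2 hidden units. The function names (`init_network`, `gradients`, `train`, `symmetry_data`), the learning rate and the initial weight range are all hypothetical choices made for this example.

```python
import math
import random
from itertools import product


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_network(n_in, n_hidden, n_out, seed=0):
    """Small random weights; each row holds one unit's incoming weights plus a bias."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.3, 0.3) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [[rng.uniform(-0.3, 0.3) for _ in range(n_hidden + 1)] for _ in range(n_out)]
    return w_hidden, w_out


def forward(net, x):
    """Forward pass: returns hidden activations and output activations."""
    w_hidden, w_out = net
    xin = list(x) + [1.0]  # constant 1.0 feeds the bias weight
    h = [sigmoid(sum(w * v for w, v in zip(row, xin))) for row in w_hidden]
    hin = h + [1.0]
    y = [sigmoid(sum(w * v for w, v in zip(row, hin))) for row in w_out]
    return h, y


def total_error(net, data):
    """Half the summed squared difference between actual and desired outputs."""
    return 0.5 * sum(
        (yj - dj) ** 2 for x, d in data for yj, dj in zip(forward(net, x)[1], d)
    )


def gradients(net, data):
    """Backward pass: accumulate dE/dw for every weight over all cases."""
    w_hidden, w_out = net
    g_hidden = [[0.0] * len(row) for row in w_hidden]
    g_out = [[0.0] * len(row) for row in w_out]
    for x, d in data:
        h, y = forward(net, x)
        xin = list(x) + [1.0]
        hin = h + [1.0]
        # Error derivative w.r.t. each output unit's total input.
        delta_out = [(yj - dj) * yj * (1.0 - yj) for yj, dj in zip(y, d)]
        # Propagate back: hidden units have no target, so their error signal
        # is the weighted sum of the deltas of the units they feed.
        delta_hidden = [
            sum(delta_out[j] * w_out[j][i] for j in range(len(w_out)))
            * h[i] * (1.0 - h[i])
            for i in range(len(h))
        ]
        for j, dj in enumerate(delta_out):
            for i, v in enumerate(hin):
                g_out[j][i] += dj * v
        for j, dj in enumerate(delta_hidden):
            for i, v in enumerate(xin):
                g_hidden[j][i] += dj * v
    return g_hidden, g_out


def train(net, data, epochs=1000, lr=0.1):
    """Plain gradient descent: move each weight against its error gradient."""
    for _ in range(epochs):
        grads = gradients(net, data)
        for weights, grad in zip(net, grads):
            for row, grow in zip(weights, grad):
                for i in range(len(row)):
                    row[i] -= lr * grow[i]
    return net


def symmetry_data(n_bits=4):
    """All binary vectors of length n_bits; target is 1.0 if the vector is a mirror image."""
    return [
        (bits, (1.0 if bits == bits[::-1] else 0.0,))
        for bits in product((0, 1), repeat=n_bits)
    ]


if __name__ == "__main__":
    data = symmetry_data(4)
    net = init_network(4, 2, 1, seed=1)
    print("error before training:", total_error(net, data))
    train(net, data)
    print("error after training:", total_error(net, data))
```

Nothing in the sketch tells the two hidden units what to represent; they only receive an error signal passed back from the output unit, which is the point the episode makes about learned internal representations. Because of the local-minima caveat above, a given seed or learning rate is not guaranteed to solve the task, so treat the printed errors as something to experiment with.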