Demystifying deep learning

If you follow the tech press, you may have noticed that deep learning gets mentioned fairly often lately. What doesn't get much coverage is the history, nature and limitations of deep learning. This article will touch briefly upon those aspects.


The first computational model for neural networks was created by Warren McCulloch and Walter Pitts in 1943. In 1958, Frank Rosenblatt created the perceptron, an ANN (artificial neural network) for pattern detection. The perceptron, in its multilayer form, remains one of the most common types of ANN. In 1975, Paul J. Werbos invented backpropagation, a method of ANN training that revived interest in AI and remains popular today. The term deep learning was first applied to ANNs by Igor Aizenberg in 2000.

Warren McCulloch & Walter Pitts

As you can see, deep learning isn't a new thing. As a field of study, it has been around for decades, and it has seen multiple hype cycles already. What's different this time? Computers have gotten more powerful, and large companies have emerged to invest large sums of money into deep learning research.


Deep learning is fundamentally about mathematics. An ANN is nothing more than a mathematical function with coefficients that are tuned automatically by a computer program until the function produces an approximately correct series of numeric output values for a given set of numeric input values.
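To make the idea of automatically tuned coefficients concrete, here is a minimal sketch. It is not an ANN, only the simplest possible stand-in: a function f(x) = a*x + b whose two coefficients are nudged by gradient descent until its outputs roughly match a set of target values. The function name `fit_line`, the sample data and the learning rate are all assumptions chosen for illustration.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Tune the coefficients a and b of f(x) = a*x + b by gradient descent."""
    a, b = 0.0, 0.0  # arbitrary starting coefficients
    n = len(xs)
    for _ in range(steps):
        # Gradient of the mean squared error with respect to a and b.
        grad_a = sum(2 * (a * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (a * x + b - y) for x, y in zip(xs, ys)) / n
        # Move each coefficient a small step against its gradient.
        a -= learning_rate * grad_a
        b -= learning_rate * grad_b
    return a, b


# Hypothetical training data sampled from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a, b = fit_line(xs, ys)
print(a, b)  # should end up close to 2 and 1
```

An ANN works on the same principle, only with far more coefficients and a more flexible function.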

Neuron layers

This mathematical function can be broken down into simpler functions called input layers, hidden layers and output layers. In a typical feed-forward ANN, values enter the input layer, pass through one or more hidden layers and exit the output layer.
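The layered structure described above amounts to function composition, which can be sketched in a few lines. The three layer functions below are hard-coded stand-ins invented for this example (nothing here is trained), and `compose` is a hypothetical helper, not part of any library.

```python
from functools import reduce


def compose(*layers):
    """Chain layer functions so the output of each one feeds the next."""
    return lambda values: reduce(lambda acc, layer: layer(acc), layers, values)


def input_layer(values):
    # Stand-in input layer: rescale the raw input values.
    return [v / 10.0 for v in values]


def hidden_layer(values):
    # Stand-in hidden layer: two units that each respond to one sign
    # of the difference between the two inputs.
    return [max(0.0, values[0] - values[1]), max(0.0, values[1] - values[0])]


def output_layer(values):
    # Stand-in output layer: combine the hidden values into one number.
    return [sum(values)]


network = compose(input_layer, hidden_layer, output_layer)
result = network([3.0, 7.0])
print(result)
```

Values go in at one end, each layer transforms them in turn, and a single list of output values comes out at the other end.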

Forward propagation

Each layer can further be broken down into even simpler functions called neurons. A given neuron is typically connected to every neuron in the next layer. Each connection has a numerical weight and each neuron has a numerical bias. These weights and biases are the tunable coefficients, and together they make up the memory of the network. Tuning the coefficients is called training or learning.
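A single neuron can be sketched as follows, assuming one weight per incoming connection, one bias per neuron and a sigmoid activation function (the activation is an assumption; the article does not name one). The weights and biases below are arbitrary illustrative numbers.

```python
import math


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, squashed into the range (0, 1)."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-total))


def layer(inputs, weight_rows, biases):
    """A fully connected layer: every neuron sees every input value."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]


# Hypothetical coefficients for a layer of 3 neurons with 2 inputs each.
hidden_weights = [[0.5, -0.6], [0.1, 0.8], [-0.3, 0.2]]
hidden_biases = [0.0, 0.1, -0.1]

activations = layer([1.0, 2.0], hidden_weights, hidden_biases)

# The "memory" of this layer is just these numbers: 6 weights and 3 biases.
coefficient_count = sum(len(row) for row in hidden_weights) + len(hidden_biases)
print(activations, coefficient_count)
```

Even this tiny layer already has nine coefficients, which hints at why larger networks need so much training data and computing power.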


Training typically happens via backpropagation. During training, the network is activated with a set of training input values, and the computed output values are compared to a corresponding set of training output values. The weights of the connections to each output neuron are then adjusted slightly in the direction that reduces the difference. The same is in turn done to the neurons in the previous layer, and so forth, propagating the adjustments backwards through the network until the input neurons are reached.
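The procedure can be sketched for a tiny network with two inputs, two hidden neurons and one output neuron. This is a minimal illustration, not a production implementation: the sigmoid activation, the squared-error measure, the starting coefficients, the learning rate and the choice of the logical OR function as training data are all assumptions made for this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Activate the network and return the hidden values and the output value."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
    return hidden, output


def mean_squared_error(data, params):
    return sum((forward(x, *params)[1] - t) ** 2 for x, t in data) / len(data)


def train(data, params, learning_rate=0.5, epochs=2000):
    # Copy the coefficients so the caller's starting values stay untouched.
    w_hidden = [row[:] for row in params[0]]
    b_hidden = params[1][:]
    w_out = params[2][:]
    b_out = params[3]
    for _ in range(epochs):
        for x, target in data:
            hidden, output = forward(x, w_hidden, b_hidden, w_out, b_out)
            # Output neuron: difference from the target times the sigmoid slope
            # (the constant factor 2 is folded into the learning rate).
            delta_out = (output - target) * output * (1.0 - output)
            # Hidden neurons: the output error passed backwards through w_out.
            delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                            for j in range(len(hidden))]
            # Adjust every coefficient slightly against its share of the error.
            for j in range(len(hidden)):
                w_out[j] -= learning_rate * delta_out * hidden[j]
                b_hidden[j] -= learning_rate * delta_hidden[j]
                for i in range(len(x)):
                    w_hidden[j][i] -= learning_rate * delta_hidden[j] * x[i]
            b_out -= learning_rate * delta_out
    return w_hidden, b_hidden, w_out, b_out


# Hypothetical training set: the logical OR of two inputs.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
        ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
initial = ([[0.2, -0.4], [0.7, 0.1]], [0.0, 0.0], [0.3, -0.2], 0.0)

initial_loss = mean_squared_error(data, initial)
trained = train(data, initial)
final_loss = mean_squared_error(data, trained)
print(initial_loss, final_loss)
```

The error after training should be noticeably lower than before. Note that the hidden-layer adjustments are computed from the output neuron's error before the output weights are changed, which is the "backwards" part of backpropagation.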


As you may have surmised by this point, deep learning isn't magic. It also isn't anywhere near being a simulation of a human brain. If you want to see the state of the art of deep learning, look no further than Google Translate.

Google Translate

Google Translate performs translations by converting words and sentences into coordinates, feeding those coordinates into ANNs and converting the output coordinates back into words and sentences. As anyone who has used Google Translate is painfully aware, the results are highly variable.

Boiled down to its essentials, deep learning is a powerful set of curve-fitting algorithms. However, not every problem can be solved with curve fitting. My prediction for the future is that we will need a paradigm shift away from deep learning before we see any major advancements in the field.
