There is a lot of talk in the tech community about trends in machine learning and artificial intelligence. Everyone wants to be part of the revolution that investment in AI and machine learning is expected to bring. It has reached the point where these terms have become buzzwords. Most of us use them without really understanding what they mean or what impact to expect from the investment being made.
In this post I will try to give some insight into one area that most of us have heard of but do not fully understand: deep learning.
To understand deep learning we first need to define AI and machine learning. Put simply, AI is the replication of human intelligence in machines, and machine learning is the practice of training computers to learn from a given set of data. Machine learning is further broken down into supervised learning and unsupervised learning.
Supervised learning is where you train a machine by giving it sets of data inputs together with the expected output for each. When its output does not match the expected one, the machine readjusts its analysis. This is done iteratively until its outputs are close enough to the expected ones. Unsupervised learning works without expected outputs: you feed the machine a set of data inputs and let it find groupings and patterns in the data on its own.
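To make the contrast concrete, here is a minimal sketch in plain Python. Everything in it is an assumption chosen for illustration: the toy model `y = w * x`, the function names `train_supervised` and `cluster_two_groups`, and the small hand-written data sets. The first function learns from inputs paired with expected outputs; the second only sees inputs and splits them into two groups by itself.

```python
# Supervised: labelled examples of (input, expected output).
# Toy model (an assumption): predict y = w * x and nudge w whenever it is off.
def train_supervised(examples, learning_rate=0.01, epochs=200):
    w = 0.0
    for _ in range(epochs):
        for x, expected in examples:
            error = w * x - expected
            w -= learning_rate * error * x  # move w in the direction that shrinks the error
    return w

# Unsupervised: inputs only, no expected outputs.
# Toy method (an assumption): split 1-D values into two groups around two centres.
def cluster_two_groups(values, iterations=10):
    low, high = min(values), max(values)  # initial group centres
    for _ in range(iterations):
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(group_low) / len(group_low)
        if group_high:
            high = sum(group_high) / len(group_high)
    return sorted(group_low), sorted(group_high)

labelled = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # made-up data following y = 2x
w = train_supervised(labelled)
print(round(w, 3))

groups = cluster_two_groups([1.0, 1.2, 0.9, 8.0, 8.3, 7.9])  # made-up readings
print(groups)
```

In the supervised case the machine is told what the right answer is and corrects itself towards it. In the unsupervised case nobody tells it that there are "small" and "large" readings; it discovers the two groups from the data.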
So what is deep learning?
Deep learning is a machine learning method where the machine is trained with a given set of data, but the data passes through one or more underlying (hidden) layers that analyze the inputs before an output is produced. It is loosely modelled on how the brain works, where information is processed by networks of neurons. Similarly in deep learning, the connected layers of artificial neurons are referred to as a neural network. Let’s dissect the process below.
Input Layer – This is the first point where the set of data is received in the model; in the image above we have 3 neurons receiving the data input. The input layer takes in the data and passes it to the next layer.
Hidden Layer – This is the layer that performs mathematical computations on the inputs it receives: it weighs them up before passing the result to the next layer. This is the heart of deep learning. The term deep refers to having several hidden layers in the model. It is loosely modelled on how the brain processes information by passing it through a set of neurons.
Output Layer – This is the layer that gives us the output of the analysis made within the different layers based on the input.
To understand how the computation works, note that each connection within the neural network has a weight assigned to it. The weight expresses the importance, or value, that the data point brings into the equation.
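The three layers and their weighted connections can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the weight and bias values are made up, the bias term and the `sigmoid` squashing function are common choices that the text above does not specify, and `layer_forward` is a hypothetical helper name.

```python
import math

def sigmoid(z):
    # Squashes any number into the range 0..1 (an assumed choice of function).
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights, biases):
    """One layer: every neuron takes a weighted sum of all inputs, then squashes it."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(sigmoid(z))
    return outputs

# Input layer: three values, passed on unchanged.
inputs = [0.5, 0.1, 0.9]

# Hidden layer: two neurons, each with one weight per input (illustrative values).
hidden_weights = [[0.2, -0.4, 0.7], [0.5, 0.3, -0.6]]
hidden_biases = [0.0, 0.1]

# Output layer: one neuron fed by the two hidden neurons.
output_weights = [[1.0, -1.0]]
output_biases = [0.0]

hidden = layer_forward(inputs, hidden_weights, hidden_biases)
output = layer_forward(hidden, output_weights, output_biases)
print(hidden, output)
```

A larger weight makes its input count for more in the neuron's sum, a weight near zero makes the input almost irrelevant, and a negative weight pushes the result the other way. Adding more hidden layers simply means calling `layer_forward` more times in a chain.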
Training the system is a bit complex, mainly due to the mathematical computations involved. Each neuron is assigned an activation function that transforms its weighted input into its output, often squashing it into a standard range. When training the system we compare its outputs with the expected outputs from the data sets that were fed into the model. The deviation from the expected output is measured by what is called the cost function. For our model to work effectively the cost function should be as close to zero as possible; in practice it rarely reaches exactly zero. This is where the different weights within the neural network are adjusted, step by step, to reduce the cost function.
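The training loop can be sketched for a single neuron as follows. This is a simplified illustration, not a full deep learning setup: the mean squared error cost, the sigmoid activation, the learning rate, and the toy data set (the logical OR of two inputs) are all assumptions picked to keep the example short.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, inputs):
    # Weighted sum of the inputs, passed through the activation function.
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)

def cost(weights, bias, dataset):
    """Mean squared error between the outputs and the expected outputs."""
    return sum((predict(weights, bias, x) - y) ** 2 for x, y in dataset) / len(dataset)

def train(dataset, learning_rate=0.5, epochs=2000):
    weights, bias = [0.0, 0.0], 0.0
    history = []  # cost at the start of each pass, plus the final cost
    for _ in range(epochs):
        history.append(cost(weights, bias, dataset))
        for inputs, expected in dataset:
            out = predict(weights, bias, inputs)
            # Slope of the squared error with respect to the neuron's weighted sum.
            delta = 2 * (out - expected) * out * (1 - out)
            weights = [w - learning_rate * delta * x for w, x in zip(weights, inputs)]
            bias -= learning_rate * delta
    history.append(cost(weights, bias, dataset))
    return weights, bias, history

# Toy data set (an assumption): the output should be 1 if either input is 1.
dataset = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias, history = train(dataset)
print(f"cost before: {history[0]:.4f}, after: {history[-1]:.4f}")
```

The cost shrinks as the weights are adjusted, but it only approaches zero and never quite gets there, which is why training is normally stopped once the cost is low enough rather than when it is exactly zero.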
See the breakdown below:
In weather-predicting AI models, the machine is fed a historical data set, which it can then use to predict the weather. In this case, for example, our inputs can be pressure, humidity, wind speed and temperature.
These are then given different weights and fed into the hidden layer for further analysis to give us an output that we can compare with the expected output. If the cost function is still too high, we adjust the different weights and repeat until the output is close enough to the expected one.
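Here is a rough sketch of that weather example with one hidden layer. Please read it as a toy under heavy assumptions: the four readings are invented and pre-scaled to the range 0 to 1, the expected output (1 for rain, 0 for no rain) is a hypothetical target, and the weights are adjusted by nudging each one slightly and checking how the cost changes. Real systems use far more data and a more efficient method called backpropagation.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

N_INPUTS, N_HIDDEN = 4, 3
N_PARAMS = N_HIDDEN * (N_INPUTS + 1) + (N_HIDDEN + 1)  # all weights plus biases

def forward(params, inputs):
    """Input layer -> hidden layer -> single output neuron."""
    hidden = []
    for h in range(N_HIDDEN):
        start = h * (N_INPUTS + 1)
        weights, bias = params[start:start + N_INPUTS], params[start + N_INPUTS]
        hidden.append(sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias))
    out = params[N_HIDDEN * (N_INPUTS + 1):]
    return sigmoid(sum(w * h for w, h in zip(out[:N_HIDDEN], hidden)) + out[N_HIDDEN])

def cost(params, dataset):
    # Mean squared difference between the output and the expected output.
    return sum((forward(params, x) - y) ** 2 for x, y in dataset) / len(dataset)

def train(dataset, learning_rate=0.5, epochs=1000, eps=1e-5, seed=0):
    rng = random.Random(seed)
    params = [rng.uniform(-0.5, 0.5) for _ in range(N_PARAMS)]
    start_cost = cost(params, dataset)
    for _ in range(epochs):
        base = cost(params, dataset)
        grads = []
        for i in range(N_PARAMS):
            # Nudge one weight and see how much the cost moves.
            nudged = params[:i] + [params[i] + eps] + params[i + 1:]
            grads.append((cost(nudged, dataset) - base) / eps)
        params = [p - learning_rate * g for p, g in zip(params, grads)]
    return params, start_cost, cost(params, dataset)

# INVENTED readings, scaled to 0..1: [pressure, humidity, wind speed, temperature]
# Expected output (hypothetical): 1 = rain, 0 = no rain.
dataset = [
    ([0.2, 0.9, 0.6, 0.4], 1),
    ([0.3, 0.8, 0.7, 0.5], 1),
    ([0.8, 0.3, 0.2, 0.7], 0),
    ([0.9, 0.2, 0.1, 0.6], 0),
]
params, start_cost, end_cost = train(dataset)
print(f"cost before: {start_cost:.4f}, after: {end_cost:.4f}")
```

After training, `forward(params, [pressure, humidity, wind, temperature])` returns a number between 0 and 1 that can be read as the model's estimate for a new set of readings.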
AI might help predict future weather patterns