Artificial Intelligence is the hot topic nowadays. You can find many tutorials on the web, but most of them concentrate only on the outer workings of machine learning, mainly the coding, without saying much about the mathematics and science behind it. It is just like a magician showing you his tricks without revealing the secret. This article focuses on what happens inside a Deep Neural Network and tries to untangle the terminology associated with Deep Learning.
[Image: Fields of Artificial Intelligence]
Deep Neural Network

Deep Learning is a part of Supervised Machine Learning, in which we have training data with Features and Labels (Targets). Deep Neural Networks were built keeping in mind how the human brain functions.
[Image: A Simple Neuron]
[Image: Biological Interpretation of Neuron]
The mathematical analogy of the Neuron is as follows. Given:
- input vector X
- weight matrix between input and hidden W
- bias vector on hidden layer B
- activation function on hidden nodes f()
- output of the hidden layer Y
Y = f(Z) = f(X * W + B)
[Image: Mathematical Interpretation of a Neuron]
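As a rough sketch of this formula in code (a minimal example assuming NumPy and a sigmoid activation, which is the activation used in the AND-gate example below; `neuron_forward` is just a hypothetical helper name):

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def neuron_forward(X, W, B):
    # Y = f(Z) = f(X * W + B)
    Z = np.dot(X, W) + B
    return sigmoid(Z)
```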
How a Neural Model Works

To understand the crux of Neural Networks, let's take an example. Suppose we want to model a logical AND gate using a Neural Network. An AND gate outputs 1 only when both of its inputs are 1, so its truth table is as follows:

| A | B | Y (A AND B) |
|---|---|-------------|
| 0 | 0 | 0 |
| 0 | 1 | 0 |
| 1 | 0 | 0 |
| 1 | 1 | 1 |
Initially, our Neural Network does not know anything about what AND logic is.
Let's take some random weights W = [1, 6, 2] and a constant bias input of +1, so that Z = W1·1 + W2·A + W3·B:

- W1 = 1 (the weight on the bias input)
- W2 = 6 (the weight on input A)
- W3 = 2 (the weight on input B)
- b = +1 (the constant bias input)
[Image: Mathematical Representation of Sigmoid Function]

Here the activation f() is the sigmoid function:

Y = σ(Z) = 1 / (1 + e^(−Z))
Using the above formula, the predicted Y for inputs A = 0 and B = 0 is:

Y = σ(1·1 + 6·0 + 2·0) = σ(1) = 0.7310

The expected output was 0, but we got 0.7310, which is obviously not correct. Looking closely at the equation, the only parameters we can control are the weights W; we can change neither the inputs (A, B) nor the target output Y. But changing the weight values randomly again does not guarantee the correct output, and brute-forcing every combination would be silly.
So, let's try to minimize the error between our predicted output and the actual output:

Mean Square Error = (0 − 0.7310)² / 2 = 0.2672
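In code, this error function is one line (a minimal sketch; dividing by 2 is a common convention so that the factor of 2 cancels when we differentiate in the next step):

```python
def mse(y_true, y_pred):
    # (y_true - y_pred)^2 / 2
    return (y_true - y_pred) ** 2 / 2

print(mse(0, 0.7310))  # 0.2672 (rounded)
```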
Since the Mean Square Error is a function we need to minimize, differentiating it with respect to the weight variables gives us ∆W, which we can use to adjust our weights.
∆W = −η · (derivative of the Error Function w.r.t. W1, W2 and W3), where η is the learning rate and the minus sign moves the weights against the gradient, i.e. downhill on the error surface.

W = W + ∆W
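A minimal sketch of one such update step for this single neuron, reusing the `sigmoid` helper from the earlier sketch; the learning rate `lr` and the folding of the bias input into X = [1, A, B] (so that W1 is the bias weight) are assumptions made to match the numbers in this article:

```python
def backprop_step(X, y_true, W, lr=1.0):
    # Forward pass; X = [1, A, B], so W[0] multiplies the bias input
    X = np.asarray(X, dtype=float)
    y_pred = sigmoid(np.dot(X, W))
    # Chain rule for MSE with a sigmoid activation:
    # dE/dW = (y_pred - y_true) * y_pred * (1 - y_pred) * X
    grad = (y_pred - y_true) * y_pred * (1 - y_pred) * X
    delta_W = -lr * grad   # move against the gradient, downhill on the error
    return W + delta_W     # W = W + delta_W
```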
Adjusting the previous weights like this is called Backpropagation in a Deep Neural Network, and now we can say that our model has learned something new by adjusting its weight matrix. Suppose the obtained delta weights are ∆W1 = −2, ∆W2 = −2, ∆W3 = 0.
So, our new weight matrix will be W = W + ∆W = [1 − 2, 6 − 2, 2 + 0] = [−1, 4, 2].

Let's try to predict the output with the new weight values, using our equation for the output:

Y = σ(−1·1 + 4·0 + 2·0) = σ(−1) = 0.2689 ≈ 0.26

The predicted output is 0.26, which is much closer to the expected 0 than the previous 0.7310. One such adjustment of the weights over a complete cycle through the training data is called an Epoch in Deep Learning. We keep iterating over the model, calculating ∆W and adjusting our weight matrix, until the Mean Square Error is minimized and our predicted output matches the expected output.
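Putting it all together, a full training loop might look like the sketch below. The learning rate of 0.5 is an assumption, since the article does not state one, so with these hyperparameters the loop may need more than the quoted 500 passes to fully converge (the derivative is tiny wherever the sigmoid saturates):

```python
# All four rows of the AND truth table, with the bias input folded in
X = np.array([[1, 0, 0],   # columns: [bias input, A, B]
              [1, 0, 1],
              [1, 1, 0],
              [1, 1, 1]], dtype=float)
y = np.array([0, 0, 0, 1], dtype=float)

W = np.array([1.0, 6.0, 2.0])  # the initial random weights from above

for epoch in range(500):
    y_pred = sigmoid(X @ W)                             # forward pass
    grad = ((y_pred - y) * y_pred * (1 - y_pred)) @ X   # dE/dW, summed over rows
    W += -0.5 * grad                                    # W = W + delta-W

print(np.round(sigmoid(X @ W), 2))  # should head towards [0, 0, 0, 1]
```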
Once our model is sufficiently trained, the weight matrix becomes W = [−3, 2, 2].
By setting a threshold of 0.5 on the output layer, we can predict Y as:

- Y = 1 for predicted Y > 0.5
- Y = 0 for predicted Y ≤ 0.5
The above prediction was for A = 0 and B = 0; let's check the other input values as well, as in the sketch below.
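Checking all four input combinations with the trained weights W = [−3, 2, 2] and the 0.5 threshold (a sketch reusing the `sigmoid` helper from above):

```python
W_trained = np.array([-3.0, 2.0, 2.0])  # [bias weight, weight on A, weight on B]

for A in (0, 1):
    for B in (0, 1):
        y_hat = sigmoid(np.dot([1, A, B], W_trained))
        print(f"A={A}, B={B} -> Y={1 if y_hat > 0.5 else 0} (raw {y_hat:.4f})")

# A=0, B=0 -> Y=0 (raw 0.0474)
# A=0, B=1 -> Y=0 (raw 0.2689)
# A=1, B=0 -> Y=0 (raw 0.2689)
# A=1, B=1 -> Y=1 (raw 0.7311)
```

The thresholded predictions reproduce the AND truth table exactly.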
So, after looping around 500 times until the error converges to its minimum, the final weights we receive are W = [−3, 2, 2], which model the AND gate correctly.