Neural Networks Taken Apart: The inner workings of a Neural Network


In "Let's Build Skynet: Building your 1st Neural Network", we learnt how to build a simple feed-forward neural network. We represented everything as a sequence of matrix operations: our input data was in the form of a matrix, and our output was also a matrix.

If you tried to break it down to figure out how the learning actually happens, chances are the back-and-forth matrix operations left you confused. That can be quite daunting, and it instills the fear that you don't understand the code you've been punching in...


So let's visualize everything properly and then, hopefully, we'll get an understanding of how the data actually flows from input to output! Put on your math hats and power up, things are gonna get pretty serious from here on!


Let's begin with our input matrix. It's a 4x3 matrix, which means it has 4 rows and 3 columns...


Our output matrix is a 4x1 matrix of the form...

The weight matrix is a 3x1 matrix...
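To make the three shapes concrete, here's a minimal sketch in Python with NumPy. The library choice, the actual numbers in the matrices, and the random seed are all assumptions made for illustration; only the shapes (4x3, 4x1 and 3x1) come from the text above.

```python
import numpy as np

# Assumed example values; only the shapes are taken from the article.
training_inputs = np.array([[0, 0, 1],
                            [1, 1, 1],
                            [1, 0, 1],
                            [0, 1, 1]])          # 4 rows (examples) x 3 columns (features)
training_outputs = np.array([[0, 1, 1, 0]]).T   # 4x1: one desired value per example

# Weights start as random numbers in [-1, 1); the seed is an assumption.
rng = np.random.default_rng(1)
synaptic_weights = 2 * rng.random((3, 1)) - 1   # 3x1: one weight per input column

print(training_inputs.shape, training_outputs.shape, synaptic_weights.shape)
# prints: (4, 3) (4, 1) (3, 1)
```

Notice that the weight matrix has as many rows as the input matrix has columns. That's what makes the dot product in the next step possible.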


Now, in the first step, when the input data is fed into the network with the following code:


In terms of a matrix operation, it'd look like this...


The dot product between our input and weight matrices yields the following. We shall call it the output matrix for now...
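Here's a rough sketch of that forward step. The weight values are made up so the arithmetic is easy to follow by hand, and passing the dot product through a sigmoid is an assumption based on the `sigmoid_derivative` call that shows up later in the walk-through.

```python
import numpy as np

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1 / (1 + np.exp(-x))

training_inputs = np.array([[0, 0, 1],
                            [1, 1, 1],
                            [1, 0, 1],
                            [0, 1, 1]])               # 4x3, assumed values
synaptic_weights = np.array([[0.5], [-0.5], [0.25]])  # 3x1, assumed values

# (4x3) . (3x1) -> (4x1): every row of inputs is multiplied with the weight column and summed.
weighted_sum = np.dot(training_inputs, synaptic_weights)
output = sigmoid(weighted_sum)

print(weighted_sum.ravel())   # one weighted sum per training example
print(output.shape)           # (4, 1)
```

The first row, for example, works out to 0*0.5 + 0*(-0.5) + 1*0.25 = 0.25 before the sigmoid is applied.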


Next comes the error calculation step. Always remember this:
Error = Desired Value - Actual Value
The matrix operation for the code above would be...
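As a small sketch of the subtraction, assuming NumPy and using made-up numbers for both matrices (these are illustrative values, not results from a real run):

```python
import numpy as np

training_outputs = np.array([[0], [1], [1], [0]])            # desired values, 4x1 (assumed)
output = np.array([[0.56], [0.56], [0.68], [0.44]])          # actual values, 4x1 (illustrative)

# Element-by-element subtraction: both matrices are 4x1, so the result is 4x1 too.
error = training_outputs - output

print(error.ravel())
```

A positive error means the network's answer was too low for that example, and a negative error means it was too high.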


Next up is the adjustment of the synaptic weight matrix values...

Let's break this down piece by piece, starting with "self.sigmoid_derivative(output)". We calculate the sigmoid derivative of each element in the output matrix, which we obtained above.
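A minimal sketch of that step follows. It assumes the common formulation where the function receives values that have already been through the sigmoid, so the slope is simply x * (1 - x). The standalone function and the output values are assumptions for illustration.

```python
import numpy as np

def sigmoid_derivative(x):
    # Assumes x is already a sigmoid output, so the slope at that point is x * (1 - x).
    return x * (1 - x)

output = np.array([[0.5], [0.9], [0.1], [0.99]])   # illustrative sigmoid outputs, 4x1

# Applied to every element independently, so the shape stays 4x1.
slopes = sigmoid_derivative(output)

print(slopes.ravel())
```

The slope is largest at 0.5 and shrinks towards 0 as the output approaches 0 or 1, so confident outputs end up being nudged less than uncertain ones.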


Now we have to multiply the matrix obtained above with the matrix containing the error values... (Do note that this is element-wise multiplication!)
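To see why the element-wise part matters, here's a short sketch with made-up numbers (the names `slopes` and `delta` are my own labels, not from the original code). In NumPy, `*` multiplies matching elements, while `np.dot` on two 4x1 matrices is rejected because the inner dimensions don't line up.

```python
import numpy as np

error = np.array([[-0.5], [0.1], [0.9], [-0.01]])       # 4x1, illustrative
slopes = np.array([[0.25], [0.09], [0.09], [0.0099]])   # 4x1, illustrative

# Element-wise: row i of error times row i of slopes, result is still 4x1.
delta = error * slopes
print(delta.ravel())

# A dot product is a different operation and does not even work for these shapes.
try:
    np.dot(error, slopes)   # (4x1) . (4x1): inner dimensions 1 and 4 do not match
except ValueError as exc:
    print("dot product rejected:", exc)
```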


Next up is the dot product between the transpose of the input matrix and the matrix obtained above...
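Here's a hedged sketch of that transpose-and-dot step, again with assumed input values and a made-up 4x1 matrix standing in for the result of the previous multiplication (called `delta` here, which is my own label):

```python
import numpy as np

training_inputs = np.array([[0, 0, 1],
                            [1, 1, 1],
                            [1, 0, 1],
                            [0, 1, 1]])               # 4x3, assumed values
delta = np.array([[-0.1], [0.2], [0.3], [-0.4]])      # 4x1, illustrative

# Transposing turns the 4x3 input into 3x4, so (3x4) . (4x1) -> (3x1).
# Each weight gets the sum of the deltas from the examples where its input was switched on.
adjustment = np.dot(training_inputs.T, delta)

print(adjustment.ravel())
print(adjustment.shape)   # (3, 1)
```

The result has the same 3x1 shape as the weight matrix, which is exactly what we need for the adjustment.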


The adjustments are made to the weight matrix as follows...
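A minimal sketch of the update, assuming the adjustment is simply added to the current weights with no learning rate (that's an assumption on my part, and the numbers are made up):

```python
import numpy as np

synaptic_weights = np.array([[0.5], [-0.5], [0.25]])   # 3x1, assumed current weights
adjustment = np.array([[0.5], [-0.2], [0.0]])          # 3x1, illustrative adjustment

# Both matrices are 3x1, so the addition happens weight by weight.
synaptic_weights = synaptic_weights + adjustment

print(synaptic_weights.ravel())
```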


Woah, that was a lot of math over there 😰, blink and you'll miss it!


But hang on, there's just one more equation! This is the one that helps you make predictions, so here goes. When new input values are passed in, that matrix is combined with the adjusted weight values obtained after the training iterations...
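As a sketch, prediction is just the forward step again with a fresh input row. The weights below are hypothetical placeholders rather than the result of a real training run, and the sigmoid on top of the dot product is assumed, as before.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

new_input = np.array([[1, 0, 0]])                       # 1x3: a single new example (assumed)
trained_weights = np.array([[5.0], [0.0], [-2.5]])      # 3x1: hypothetical "trained" weights

# (1x3) . (3x1) -> (1x1): one new example in, one prediction out.
prediction = sigmoid(np.dot(new_input, trained_weights))

print(prediction.shape)   # (1, 1)
```

You can pass several new examples at once by stacking them as rows. An Nx3 input then gives an Nx1 matrix of predictions.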


That's all there is to it. Quite simple when you think about it, isn't it? It's a good idea to write down the matrix operations as you build your networks. This will not only give you an understanding of how things work inside your network, but also serve as a good way to troubleshoot errors.
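One way to "write down" the matrix operations right in your code is to assert the shape you expect after every step. The sketch below strings the whole walk-through together this way. The data, the seed, the iteration count and the variable names are all assumptions, so treat it as an illustration rather than the exact code from the previous article.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    # x is assumed to already be a sigmoid output
    return x * (1 - x)

# Assumed example data: 4 examples with 3 inputs each, and one desired value per example.
training_inputs = np.array([[0, 0, 1], [1, 1, 1], [1, 0, 1], [0, 1, 1]])
training_outputs = np.array([[0, 1, 1, 0]]).T

rng = np.random.default_rng(1)                      # seed is an assumption
synaptic_weights = 2 * rng.random((3, 1)) - 1       # 3x1

for _ in range(10000):                              # iteration count is an assumption
    output = sigmoid(np.dot(training_inputs, synaptic_weights))
    assert output.shape == (4, 1)                   # (4x3) . (3x1) -> 4x1

    error = training_outputs - output
    assert error.shape == (4, 1)                    # 4x1 - 4x1 -> 4x1

    delta = error * sigmoid_derivative(output)
    assert delta.shape == (4, 1)                    # element-wise, still 4x1

    adjustment = np.dot(training_inputs.T, delta)
    assert adjustment.shape == synaptic_weights.shape   # (3x4) . (4x1) -> 3x1

    synaptic_weights = synaptic_weights + adjustment

print(synaptic_weights.shape)   # (3, 1)
```

If you change the number of inputs or add more neurons, the first assertion that fails tells you which matrix operation no longer lines up.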

Now that you have an idea of what you're doing when you tweak something in your network, you can tell whether you're making it more computationally expensive or more complex for no good return...

So let the tweaking begin, and share your results with me 😁!


