One Liner Definition
A neural network is a mathematical function.
This function is based on inputs and parameters.
Inputs and parameters are multiplied and added up. Negative values are set to zero.
These operations are repeated until the prediction error is minimized.
Motivation
These three simple steps are the foundation of any deep learning model.
Implicitly they touch the most important parts of a NN:
1. Inputs and parameters are multiplied and added up => matrix multiplication
2. Negative values are set to zero => rectified linear function
3. Operations are repeated until the prediction error is minimized => gradient descent on a loss function
The most complex deep learning models are built on these fundamentals. Understanding them deeply will help break apart every complex model.
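The three steps can be sketched in a few lines of NumPy (a hypothetical toy "layer", separate from the implementation below; all names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4,))    # inputs
W = rng.normal(size=(3, 4))  # parameters (weights)

# 1. Multiply inputs and parameters and add them up => matrix multiplication
z = W @ x

# 2. Set negative values to zero => rectified linear function (ReLU)
a = np.maximum(z, 0)
```

Step 3 (gradient descent) needs a loss to minimize, which the implementation below builds up.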
Implementation
```python
import numpy as np
import torch

np.random.seed(42)

def f(x): return 3*x**2 + 2*x + 1
def quad(a, b, c, x): return a*x**2 + b*x + c
def noise(x, scale): return np.random.normal(scale=scale, size=x.shape)
def add_noise(x, mult, add): return x * (1+noise(x,mult)) + noise(x,add)

x = torch.linspace(-2, 2, steps=20)[:,None]
y = add_noise(f(x), 0.15, 1.5)
```
```python
from functools import partial

def mk_quad(a, b, c): return partial(quad, a, b, c)
def mae(pred, actual): return torch.abs(pred-actual).mean()

def quad_mae(params):
    f = mk_quad(*params)
    return mae(f(x), y)
```
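As a quick sanity check (a hypothetical usage sketch with its own definitions so it runs standalone; the name `f2` is illustrative), `mk_quad` fixes the coefficients via `functools.partial` and returns a one-argument quadratic:

```python
from functools import partial
import torch

def quad(a, b, c, x): return a*x**2 + b*x + c
def mk_quad(a, b, c): return partial(quad, a, b, c)

f2 = mk_quad(3, 2, 1)         # f2(x) = 3*x**2 + 2*x + 1
print(f2(torch.tensor(2.0)))  # 3*4 + 2*2 + 1 = 17
```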
```python
abc = torch.Tensor([1.1, 1.1, 1.1])
abc.requires_grad_()
loss = quad_mae(abc)

for i in range(10):
    loss = quad_mae(abc)
    loss.backward()
    with torch.no_grad(): abc -= abc.grad*0.01
    print(f'step={i}; loss={loss:.2f}')
```
step=0; loss=2.42
step=1; loss=2.40
step=2; loss=2.36
step=3; loss=2.30
step=4; loss=2.21
step=5; loss=2.11
step=6; loss=1.98
step=7; loss=1.85
step=8; loss=1.72
step=9; loss=1.58
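One detail worth flagging: `loss.backward()` accumulates into `abc.grad` on every iteration, so the loop above effectively takes growing steps. A common refinement is to reset the gradients after each update. A minimal self-contained sketch (with noise-free toy data so it runs on its own):

```python
import torch

def quad(a, b, c, x): return a*x**2 + b*x + c

# Toy data, same shapes as above but without noise
x = torch.linspace(-2, 2, steps=20)[:, None]
y = 3*x**2 + 2*x + 1

def quad_mae(params):
    a, b, c = params
    return torch.abs(quad(a, b, c, x) - y).mean()

abc = torch.tensor([1.1, 1.1, 1.1], requires_grad=True)
for i in range(10):
    loss = quad_mae(abc)
    loss.backward()
    with torch.no_grad():
        abc -= abc.grad * 0.01
        abc.grad.zero_()  # reset accumulated gradients before the next step
```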
Take Away
- Matrix multiplication is the key to quickly calculating the multiplication and addition of inputs and parameters.
- Gradient descent is the tool used to minimize the loss function, since the loss function is composed of the parameters abc.
- The Rectified Linear function, known as ReLU, is a function whose output equals the input when the input is positive; if the input is negative, the output is zero. It's defined as: \(f(x) = \max(0, x)\)
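The definition above translates directly into code (a minimal sketch using PyTorch's built-in `torch.relu`):

```python
import torch

# ReLU: f(x) = max(0, x)
t = torch.tensor([-2.0, -0.5, 0.0, 1.5, 3.0])
out = torch.relu(t)  # negative values become zero, positives pass through unchanged
print(out)           # 0., 0., 0., 1.5, 3.
```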
Further Work
- Ankify:
- Matrix multiplication
- Gradient descent
- ReLU
- End-to-end gradient descent, which every DL model is based on
- Develop a Neural Network from scratch