MLP

MLP’s Definition

A Multi-Layer Perceptron (MLP) is an artificial neural network built from multiple layers of neurons; it applies nonlinear activation functions and is trained with backpropagation for tasks such as classification and regression.

Notes on the Code

  1. Automatic Differentiation Framework (Value class):
    • Basic mathematical operations (+, -, *, /, **) and activation functions (exp, tanh)
    • Each operation defines its own _backward function for the local gradient
    • Each Value object tracks its data (data), gradient (grad), operation (_op), and its child nodes (_prev)
    • Topological sorting to support backward propagation (the order must then be reversed)
  2. Neural Network Components:
    • Neuron: represents a single neuron with randomly initialized weights and bias, using the Value class for parameters
      • Applies the tanh activation after the weighted sum
    • Layer: aggregates multiple neurons, handling forward propagation across them
    • MLP: stacks multiple Layers, feeding each layer’s output into the next
  3. More details:
    • When a node is used repeatedly, its grad must accumulate (+=) rather than be overwritten (see the sketch after this list)
    • out.data is the result of the forward pass; out.grad is the gradient computed in the backward pass
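
For the accumulation point in item 3, here is a minimal sketch (assuming the Value class defined later in this post) of why grad must be accumulated when a node feeds into the graph more than once:

# Gradient accumulation sketch: 'a' is used twice, so both paths contribute to a.grad
a = Value(3.0, label='a')
b = a + a          # the same node appears as both operands
b.backward()
print(a.grad)      # 2.0 (overwriting instead of += would wrongly give 1.0)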

Note for Forward Propagation and Backpropagation

  • Forward propagation: starting from the input, compute the value at each step and produce the exact numerical output
  • Backpropagation: determine how much each variable affects the loss
    • Tool: the chain rule (a worked example follows)
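
As a concrete instance of the chain rule: for L = (y - t)**2 with y = tanh(z), dL/dz = dL/dy * dy/dz = 2*(y - t) * (1 - y**2). A small numeric check with plain floats (the values here are illustrative):

# Hand-applied chain rule for L = (y - t)**2, y = tanh(z)
import math
z, t = 0.5, 1.0
y = math.tanh(z)          # forward pass
dL_dy = 2 * (y - t)       # local gradient of the squared error
dy_dz = 1 - y**2          # local gradient of tanh
print(dL_dy * dy_dz)      # dL/dz via the chain rule, roughly -0.846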

Example Calculation

  • out.data is the numerical result of the forward calculation.
  • out.grad is the gradient of the downstream loss with respect to the current output; see the worked example below.
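
A small worked example (again assuming the Value class implemented below; the variable names are illustrative):

# Forward value vs. backward gradient for a single tanh neuron
x = Value(0.5, label='x')
w = Value(-2.0, label='w')
b = Value(1.0, label='b')
out = (w * x + b).tanh()
print(out.data)   # forward pass: tanh(-2.0*0.5 + 1.0) = tanh(0.0) = 0.0
out.backward()
print(x.grad)     # backward pass: d(out)/dx = (1 - out.data**2) * w.data = -2.0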

Code: Value Class (with tanh)

# Simulate Automatic differentiation framework
import math
import random
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
class Value:
    def __init__(self, data, _children=(), _op='', label=''):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None 
        self._prev = set(_children)
        self._op = _op
        self.label = label
        
    def __repr__(self):
        return f"Value(data={self.data}, grad={self.grad})"
    
    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self,other),'+')
        def _backward():
            self.grad += 1.0 * out.grad
            other.grad += 1.0 * out.grad
        out._backward = _backward
        return out
        
    def __radd__(self, other):
        return self + other
    
    def __mul__(self,other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other), '*')
        def _backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out
    
    def __rmul__(self,other):
        return self*other
    
    def __neg__(self):
        return -1* self
    
    def __sub__(self,other):
        other = other if isinstance(other, Value) else Value(other)
        out = self + (-other)
        return out
    
    def __pow__(self,other):
        assert isinstance(other, (int, float)), "only supporting int/float powers for now"
        out = Value(self.data**other, (self,), f'**{other}')
        def _backward():
            self.grad += other * (self.data ** (other - 1)) * out.grad
        out._backward = _backward
        return out
    
    def __truediv__(self,other):
        other = other if isinstance(other, Value) else Value(other)
        return self * other**-1
            
    def exp(self):
        x = self.data
        out = Value(math.exp(x), (self, ), 'exp')
        def _backward():
            self.grad += out.data * out.grad  # d(e^x)/dx = e^x
        out._backward = _backward
        return out
            
    def tanh(self):
        x = self.data
        t = (math.exp(2*x) - 1)/(math.exp(2*x) + 1)
        out = Value(t, (self, ), 'tanh')
        
        def _backward():
            self.grad += (1 - t**2) * out.grad
        out._backward = _backward
        return out
    
    def backward(self):
        topo = []
        visited = set()
        def build_topo(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build_topo(child)
                topo.append(v)  # append only after all children have been processed
                
        build_topo(self)
        
        self.grad = 1.0
        for node in reversed(topo):
            node._backward()
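
A quick illustrative check of the engine before building the network (the expression and variable names are just for demonstration):

# Tiny expression to sanity-check forward values and gradients
a = Value(2.0, label='a')
b = Value(-3.0, label='b')
c = a * b + 1.0        # forward: c.data == -5.0
d = c.tanh()
d.backward()
print(d.data)          # tanh(-5.0), roughly -0.9999
print(a.grad)          # (1 - d.data**2) * b.data
print(b.grad)          # (1 - d.data**2) * a.data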
# Neural network components built on top of the Value class
class Neuron:
    def __init__(self, nin):
        self.w = [Value(random.uniform(-1,1)) for _ in range(nin)]
        self.b = Value(random.uniform(-1,1))
    def __call__(self, x):
        act = sum((wi * xi for wi, xi in zip(self.w, x)), self.b)
        out = act.tanh()
        return out
    def parameters(self):
        return self.w + [self.b]
    
class Layer:
    def __init__(self, nin, nout):
        self.neurons = [Neuron(nin) for _ in range(nout)]
    def __call__(self, x):
        outs = [n(x) for n in self.neurons]
        return outs[0] if len(outs) == 1 else outs
    def parameters(self):
        return [p for n in self.neurons for p in n.parameters()]
    
class MLP:
    def __init__(self, nin, nouts):
        sz = [nin] + nouts
        self.layers = [Layer(sz[i], sz[i+1]) for i in range(len(nouts))]
    def __call__(self,x):
        for layer in self.layers:
            x = layer(x)
        return x
    def parameters(self):
        return [p for layer in self.layers for p in layer.parameters()]
# Build the network (3 inputs, two hidden layers of 4 neurons, 1 output) and run a forward pass
x = [2.0, 3.0, -1.0]
n = MLP(3, [4, 4, 1])
n(x)
# Input
xs = [
  [2.0, 3.0, -1.0],
  [3.0, -1.0, 0.5],
  [0.5, 1.0, 1.0],
  [1.0, 1.0, -1.0],
]
ys = [1.0, -1.0, -1.0, 1.0] # desired targets

# Training Procedure
for k in range(20):
  
  # forward pass
  ypred = [n(x) for x in xs]
  loss = sum((yout - ygt)**2 for ygt, yout in zip(ys, ypred))
  
  # backward pass
  for p in n.parameters():
    p.grad = 0.0
  loss.backward()
  
  # update
  for p in n.parameters():
    p.data += -0.1 * p.grad
  
  print(k, loss.data)
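
After the loop finishes, the loss should be close to zero and the predictions should have moved toward the targets; a quick check:

# Inspect predictions after training; they should approach ys = [1.0, -1.0, -1.0, 1.0]
ypred = [n(x) for x in xs]
print([round(yp.data, 3) for yp in ypred])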
  

Tanh - Activation Function

(Figure: the tanh activation curve.)

The nonlinearity makes the decision boundary more flexible, which lets the network handle more difficult tasks.
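
A small sketch that reproduces a plot like the one above, reusing the numpy and matplotlib imports from the first code block:

# Plot the tanh activation over a small range
xs_plot = np.arange(-5, 5, 0.2)
plt.plot(xs_plot, np.tanh(xs_plot))
plt.xlabel('x'); plt.ylabel('tanh(x)')
plt.grid(True)
plt.show()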

This post is licensed under CC BY 4.0 by the author.