Now that you understand the basics of the perceptron, let's look at the elegant iterative solution Rosenblatt suggested for training it (i.e. for learning the weights):

w_(t+1) = w_t + y_it · x_it

where y_it · x_it is the correction term. It is important to note that x_it in this iterative procedure is a misclassified data point at iteration t and y_it is its corresponding true label. Also, note that the '·' in y_it · x_it is not a dot product: y_it is a scalar label (+1 or -1) multiplying the vector x_it. Let's try to understand the intuition behind this with an example.
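As a minimal sketch, the update rule can be written as a short function. The function name and the use of NumPy are illustrative choices, not part of the course material:

```python
import numpy as np

def perceptron_update(w, x, y):
    """One Rosenblatt update: w_(t+1) = w_t + y * x,
    where x is a misclassified point in homogeneous coordinates
    and y is its true label (+1 or -1).

    Note: y * x is a scalar-vector product, not a dot product."""
    return w + y * x
```

Because y is a scalar, each component of x is simply added to (or subtracted from) the corresponding component of w.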
Consider the following figure with 6 data points and the separator. The points are numbered 0 to 5. The blue points belong to class '-1' and the orange points belong to class '+1'.
The coordinates of the points and the labels are given in the following table:
| Data point | x1 | x2 | True label (y) | Homogeneous coordinates |
|---|---|---|---|---|
| 0 | 1 | 0 | 1 | (1, 0, 1) |
| 1 | 3 | 1 | 1 | (3, 1, 1) |
| 2 | 4 | 2 | 1 | (4, 2, 1) |
| 3 | 0 | 1 | -1 | (0, 1, 1) |
| 4 | 1 | 6 | -1 | (1, 6, 1) |
| 5 | 2 | 4 | -1 | (2, 4, 1) |
Please note that the last column has the homogeneous coordinates of the data points.
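Purely as an illustration, the table can be encoded as arrays, along with a helper that classifies a point by the sign of the dot product. The sign convention (points exactly on the line get +1) is an assumption, not something the course specifies:

```python
import numpy as np

# The six points from the table, in homogeneous coordinates (x1, x2, 1).
X = np.array([
    [1, 0, 1],
    [3, 1, 1],
    [4, 2, 1],
    [0, 1, 1],
    [1, 6, 1],
    [2, 4, 1],
])
# True labels: orange points are +1, blue points are -1.
y = np.array([1, 1, 1, -1, -1, -1])

def predict(w, x):
    # The sign of w . x decides the class; here the boundary gets +1.
    return 1 if np.dot(w, x) >= 0 else -1
```

With the initial classifier (3, -1, 0), `predict` flags point 5 as +1 even though its true label is -1, which is exactly the misclassification the first iteration corrects.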
The initial classifier is (3, -1, 0), which when expressed algebraically is the line 3x1 - x2 = 0.
Let’s start with the first iteration. The misclassified data point is the data point ‘5’: (2,4,1).
Hence, in the formula w_(t+1) = w_t + y_it · x_it, x_it is (2, 4, 1) and y_it is the true label '-1'.
We get:

w1 = (3, -1, 0) + (-1) · (2, 4, 1) = (1, -5, -1)

So w1 = (1, -5, -1), which is the line x1 - 5x2 = 1, shown in the figure below.
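The arithmetic of this first update can be checked directly. This is just the component-wise computation from above, written out:

```python
import numpy as np

w0 = np.array([3, -1, 0])   # initial classifier: 3x1 - x2 = 0
x5 = np.array([2, 4, 1])    # misclassified point 5, homogeneous coordinates
y5 = -1                     # its true label

# Rosenblatt update: subtract the point because its label is -1.
w1 = w0 + y5 * x5           # (3-2, -1-4, 0-1) = (1, -5, -1)
```

Subtracting the point rotates the separator away from it, which is why the line moves in the right direction for point 5.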
You have seen how we performed the first iteration to get w1. Notice that the line moves in the right direction, though it misclassifies two orange points now (and passes through one).
Now answer the following questions to get w2.
The w2 we get (by updating with the misclassified point 2, (4, 2, 1), whose true label is +1) is (5, -3, 0), algebraically expressed as the line 5x1 - 3x2 = 0, shown in the following image. Note that it classifies all the data points correctly.
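The second iteration, and the claim that w2 separates all six points, can be verified in a few lines. The choice of point 2 as the misclassified point follows the worked example; the `np.sign` check is an illustrative way to compare predictions against labels:

```python
import numpy as np

w1 = np.array([1, -5, -1])
# Point 2, (4, 2, 1), has true label +1 but w1 . x = 4 - 10 - 1 = -7 < 0,
# so it is misclassified and drives the second update.
x_pt, y_pt = np.array([4, 2, 1]), 1
w2 = w1 + y_pt * x_pt       # (1+4, -5+2, -1+1) = (5, -3, 0)

# Check that w2 classifies every point from the table correctly.
X = np.array([[1, 0, 1], [3, 1, 1], [4, 2, 1],
              [0, 1, 1], [1, 6, 1], [2, 4, 1]])
y = np.array([1, 1, 1, -1, -1, -1])
all_correct = bool(np.all(np.sign(X @ w2) == y))
```

Since no point lies exactly on the line 5x1 - 3x2 = 0, the signs are unambiguous and the algorithm has converged.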
This is a simple way to understand the intuition behind the algorithm. You can go through the mathematics of the proof in the additional reading section.
You have seen how a perceptron performs binary classification but wouldn’t it be amazing if these simple devices could do something more complex? Let’s see how a group of perceptrons can do multiclass classification in the next segment.
Additional Readings:
- Please find the proof of the perceptron learning algorithm here.