I’ve been thinking about writing a blog post on what I have been doing these days. I thought, is there anything I can write about? Not just something I learned, but something I can contribute to the public? Then I realized that a few days ago I searched the web for a problem but couldn’t find the correct answer right away, so I had to solve it myself. The problem: how to find the margin lines from the decision boundary obtained from a linear SVM (Support Vector Machine) classifier.
This is not a hard problem to solve, but it needs a little bit of understanding of how SVM works. First, I will go through the basic formulation, and then derive the margin lines for the two-feature case.
If we have two continuous features, $x_1$ and $x_2$, and a binary class, $y \in \{-1, +1\}$, SVM finds the decision boundary using the following formulation. For a training set of $m$ points $\{(x^{(i)}, y^{(i)})\}$, where $x^{(i)} = (x_1^{(i)}, x_2^{(i)})$ and $y^{(i)} \in \{-1, +1\}$ for $i = 1, \dots, m$, we try to maximize $\gamma$ subject to $y^{(i)}(w^T x^{(i)} + b) \ge \gamma$ for $i = 1, \dots, m$, with $\|w\| = 1$. Here $\gamma$ is interpreted as the geometric margin, the shortest distance from the decision boundary to a data point in the training set. The parameter space we are exploring is actually two-dimensional (the direction of $w$, and the intercept term $b$) because $w$ is assumed to be a unit vector. This optimization problem is not easy to solve in general. Hence we change the problem by setting the constraint on the functional margin, $y^{(i)}(w^T x^{(i)} + b)$, to be at least $1$ instead. Now the length of $w$ is not fixed; rather, it changes as the direction of $w$ and the value of $b$ change, depending on the given data set. The parameter space we are exploring is the same as before; only the constraint on the length of $w$ has changed. Now the formulation becomes simpler: we need to minimize $\|w\|$ subject to $y^{(i)}(w^T x^{(i)} + b) \ge 1$ for $i = 1, \dots, m$. Minimizing $\|w\|$ is actually the same as maximizing $1/\|w\|$, and the problem is now represented in a form that is easier to handle. We are not going into details on how to solve this optimization problem, but we will keep in mind that $\gamma = 1/\|w\|$, where $\gamma$ is the geometric margin (at the optimum the closest points have functional margin exactly $1$, so the geometric margin is $1/\|w\|$).
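To make the relation $\gamma = 1/\|w\|$ concrete, here is a minimal sketch using `sklearn.svm.SVC` with a large `C` to approximate the hard-margin problem above (the toy data set is my own assumption, not from the original post): it checks that the smallest functional margin over the training set comes out to about $1$, so the geometric margin is $1/\|w\|$.

```python
import numpy as np
from sklearn.svm import SVC

# Toy separable data (an assumption for illustration; any separable set works).
X = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 3.0], [4.0, 4.0]])
y = np.array([-1, -1, 1, 1])

# A large C approximates the hard-margin problem described above.
clf = SVC(kernel="linear", C=1e6).fit(X, y)

w = clf.coef_[0]
b = clf.intercept_[0]

# The closest points (the support vectors) satisfy y_i (w . x_i + b) = 1,
# so the smallest functional margin over the training set should be ~1 ...
functional_margins = y * (X @ w + b)
print(functional_margins.min())

# ... and the geometric margin is 1 / ||w||.
gamma = 1.0 / np.linalg.norm(w)
print(gamma)
```

For this data set the closest points of the two classes are $(1,1)$ and $(3,3)$, so the geometric margin printed at the end should be close to $\sqrt{2}$.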
Let’s say we are using a Python library, `sklearn.svm.SVC`. If we fit the training data with a linear kernel, the package gives us the optimal parameter values after solving the above formulation: `coef_[0][0]` as $w_1$, `coef_[0][1]` as $w_2$, and `intercept_[0]` as $b$. The decision boundary can be found easily from these parameters: $w_1 x_1 + w_2 x_2 + b = 0$, which is $x_2 = -\frac{w_1}{w_2} x_1 - \frac{b}{w_2}$. We know the margin is $1/\|w\|$, but how much should we add to (or subtract from) the intercept term of the decision boundary to get a margin line? If we call that value $c$, it can be found using trigonometry (see the figure).
Then, $c$ is just $\frac{\gamma}{\cos\theta} = \frac{1}{\|w\|} \cdot \frac{\|w\|}{w_2} = \frac{1}{w_2}$, where $\theta$ is the angle between the vertical axis and the direction of $w$ (it can be either positive or negative, but it doesn’t matter because we need both margin lines).
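We can check the trigonometry numerically. The values of $w$ and $b$ below are hypothetical stand-ins (my own assumption, not output from a real fit): the vertical offset $\gamma \sqrt{1 + \text{slope}^2}$ should equal $1/w_2$.

```python
import numpy as np

# Hypothetical fitted parameters (these numbers are assumptions for illustration).
w = np.array([0.5, 0.5])
b = -2.0

gamma = 1.0 / np.linalg.norm(w)   # geometric margin, 1 / ||w||
slope = -w[0] / w[1]              # slope of the decision boundary

# Vertical offset between a margin line and the boundary:
# c = gamma / cos(theta) = gamma * sqrt(1 + slope^2),
# which simplifies to 1 / |w2| since sqrt(w1^2 + w2^2) = ||w||.
c_trig = gamma * np.sqrt(1.0 + slope**2)
c_direct = 1.0 / abs(w[1])
print(c_trig, c_direct)  # the two values agree
```

With these numbers, both expressions give $c = 2$.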
Then a part of the Python code for plotting these lines looks like this.

```python
# xx holds the x1 values; w = clf.coef_[0], b = clf.intercept_[0]
yy = -(w[0] / w[1]) * xx - (b / w[1])  # decision boundary
yy_margin1 = yy + 1 / w[1]             # margin line: w . x + b = +1
yy_margin2 = yy - 1 / w[1]             # margin line: w . x + b = -1
```
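For completeness, here is a self-contained sketch of the same computation (the toy data set is my own assumption) that fits the classifier and verifies that points on the three lines satisfy $w^T x + b = 0, +1, -1$ respectively.

```python
import numpy as np
from sklearn.svm import SVC

# Toy separable data (an assumption for illustration).
X = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 3.0], [4.0, 4.0]])
y = np.array([-1, -1, 1, 1])
clf = SVC(kernel="linear", C=1e6).fit(X, y)

w, b = clf.coef_[0], clf.intercept_[0]

xx = np.linspace(-1.0, 5.0, 50)
yy = -(w[0] / w[1]) * xx - (b / w[1])  # decision boundary: w . x + b = 0
yy_margin1 = yy + 1 / w[1]             # margin line:       w . x + b = +1
yy_margin2 = yy - 1 / w[1]             # margin line:       w . x + b = -1

# Every point on each line should give back the expected value of w . x + b.
print(np.allclose(w[0] * xx + w[1] * yy + b, 0.0))          # True
print(np.allclose(w[0] * xx + w[1] * yy_margin1 + b, 1.0))  # True
```

These arrays can then be passed to `matplotlib` (e.g. `plt.plot(xx, yy)`) to draw the boundary and the two margin lines.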
I know this is not a hard problem, but the result can be handy when we want to plot margin lines. (I found that some example codes at `sklearn.svm.SVC` were not correct as of this writing, in how they draw the margin lines. One only works for separable cases, while the other one uses $1/\|w\|$ for $c$, which is a little off.)
PS: All blog posts I have written so far (actually I haven’t written any for more than five years) are in Korean. So this will be my first blog post in English.