Support vector machine (SVM), as a shallow model, has been widely applied to classification tasks. To solve the model, groups of support vectors (SVs) of the corresponding classes are extracted, so as to calculate a hyperplane as the classification border.

A brief review

Denote $X = \{x_1,~x_2,~\dots,~x_N\}$ as the samples to be classified, and $y = \{y_1,~y_2,~\dots,~y_N\}$ are the corresponding labels. Take binary classification as an example, where $y_i \in \{+1,~-1\}$, and the classification hyperplane is

$$w^T x + b = 0,$$

where $w$ are the coefficients w.r.t. the features in $x$, and $b$ is the bias.

Then the problem becomes an optimization task, where the objective is,

$$\min_{w,~b}~\frac{1}{2} \|w\|^2, \quad \mathrm{s.t.} \quad y_i \left( w^T x_i + b \right) \geq 1, \quad i = 1,~2,~\dots,~N,$$

which can be solved with the Lagrange equation,

$$L(w,~b,~\alpha) = \frac{1}{2} \|w\|^2 - \sum_{i=1}^{N} \alpha_i \left[ y_i \left( w^T x_i + b \right) - 1 \right],$$

where $\alpha_i \geq 0$ are the Lagrange multipliers.

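Setting the derivatives of $L$ with respect to $w$ and $b$ to zero and substituting them back yields the equivalent dual problem,

$$\max_{\alpha}~\sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j x_i^T x_j, \quad \mathrm{s.t.} \quad \alpha_i \geq 0, \quad \sum_{i=1}^{N} \alpha_i y_i = 0,$$

in which the samples appear only through the dot products $x_i^T x_j$.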
To save time, one usually only needs a subset of the samples, namely the support vectors (those with $\alpha_i > 0$ at the optimum), to solve the above equation, instead of all of the samples. Those SVs are the samples that stand close to the classification hyperplane, i.e., the borders between different classes, and they act on behalf of the classes they belong to.

By solving the Lagrange equation, we obtain the multipliers $\alpha_i$, as well as $w$ and $b$,

$$w = \sum_{i=1}^{N} \alpha_i y_i x_i, \qquad b = -\frac{1}{2} \left( w^T x_{s_1} + w^T x_{s_2} \right),$$

where $x_{s_1}$ and $x_{s_2}$ are arbitrary support vectors of class one and class two, respectively.

The decision function based on those parameters is,

$$f(x) = \mathrm{sgn} \left( w^T x + b \right) = \mathrm{sgn} \left( \sum_{i=1}^{N} \alpha_i y_i x_i^T x + b \right),$$

where $\mathrm{sgn}(\cdot)$ is the sign function.

For non-linear classification, which is more general than the linear case, the dot products between samples are replaced by a non-linear kernel mapping $\Phi(\cdot)$, i.e.,

$$f(x) = \mathrm{sgn} \left( \sum_{i=1}^{N} \alpha_i y_i \Phi(x_i)^T \Phi(x) + b \right).$$
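For example, with the widely used radial basis function (RBF) kernel, which is also libsvm's default (option `-t 2`), the product is computed as

$$\Phi(x_i)^T \Phi(x) = \exp \left( -\gamma \| x_i - x \|^2 \right),$$

so the mapping $\Phi$ itself never needs to be evaluated explicitly.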

Realization and visualization

With the help of libsvm, it is easy to realize SVM-based classification. What I want to show in this blog is how to visualize and replicate the prediction stage of a model trained by the svmtrain function. Some comments are as follows (a code sketch follows the list),

  • After training the SVM with svmtrain, a model will be generated;
  • In the model, the support vectors and parameters like the weights and bias are archived;
  • To save space, the support vectors are saved as a sparse matrix;
  • For multi-class classification, the task can be transformed into multiple binary-classification tasks.
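Based on those observations, a minimal sketch of replicating the prediction stage for the binary, linear-kernel case could look like the following; the toy data and parameter choices are made up for illustration, and only the standard fields of the libsvm MATLAB model struct are used.

```matlab
% Minimal sketch: replicate svmpredict by hand for a binary, linear-kernel SVM.
% Assumes the libsvm MATLAB interface (svmtrain/svmpredict) is on the path;
% the toy data below are made up for illustration.
X = [randn(50, 2) + 1; randn(50, 2) - 1];   % two Gaussian clusters
y = [ones(50, 1); -ones(50, 1)];            % labels +1 / -1

model = svmtrain(y, X, '-t 0 -c 1');        % -t 0: linear kernel

% The support vectors are archived as a sparse matrix in the model.
SVs = full(model.SVs);                      % densify for plain arithmetic
w   = SVs' * model.sv_coef;                 % w = sum_i alpha_i * y_i * x_i
b   = -model.rho;                           % libsvm stores rho = -b

% Replicate the decision values and labels. Since the first training label
% here is +1, model.Label(1) == +1 and the sign needs no flipping.
dec_manual  = X * w + b;
pred_manual = sign(dec_manual);

[pred_libsvm, ~, dec_libsvm] = svmpredict(y, X, model);
fprintf('max decision-value difference: %g\n', max(abs(dec_manual - dec_libsvm)));
fprintf('labels match: %d\n', isequal(pred_manual, pred_libsvm));
```

The manual decision values should match the third output of svmpredict up to floating-point error.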

Here is a naive two-dimensional, three-class classification example (code is available). I divided the three-class task into three binary classifications. The linear kernel function was used, thus the classification hyperplanes were also linear. A sketch of this decomposition is given below.
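The linked code is not reproduced here, but a sketch of the same idea, assuming a one-vs-one decomposition (which is also what libsvm uses internally for multi-class problems) and made-up cluster centers, could look like:

```matlab
% Sketch of the two-dimensional, three-class example: one-vs-one decomposition
% into three binary linear SVMs. The data are synthetic stand-ins.
centers = [0 2; -2 -1; 2 -1];
X = []; y = [];
for c = 1:3
    X = [X; 0.5 * randn(50, 2) + centers(c, :)];
    y = [y; c * ones(50, 1)];
end

figure; hold on;
scatter(X(:, 1), X(:, 2), 15, y, 'filled');
xs = linspace(min(X(:, 1)), max(X(:, 1)), 100);

pairs = nchoosek(1:3, 2);                   % (1,2), (1,3), (2,3)
votes = zeros(size(X, 1), 3);
for k = 1:size(pairs, 1)
    i = pairs(k, 1); j = pairs(k, 2);
    idx = (y == i) | (y == j);
    yk  = ones(sum(idx), 1);
    yk(y(idx) == j) = -1;                   % class i -> +1, class j -> -1
    % Rows of class i come first, so mk.Label(1) == +1 for this model.
    mk  = svmtrain(yk, X(idx, :), '-t 0 -c 1');
    w   = full(mk.SVs)' * mk.sv_coef;
    b   = -mk.rho;
    plot(xs, -(w(1) * xs + b) / w(2), 'k--');   % linear class border
    % Each binary classifier votes for one of its two classes.
    dec = X * w + b;
    votes(dec >= 0, i) = votes(dec >= 0, i) + 1;
    votes(dec <  0, j) = votes(dec <  0, j) + 1;
end
[~, pred] = max(votes, [], 2);
fprintf('one-vs-one training accuracy: %.1f%%\n', 100 * mean(pred == y));
```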


In the right figure, only the support vector points are plotted. It can be found that the SVs are the points that stand at the border between different categories.
