Hyperparameters of an SVM with an RBF kernel

A support vector machine (SVM) is a popular classifier, and radial basis function (RBF) kernels are commonly used to apply SVMs to problems that are not linearly separable. In this setting, there are two hyperparameters. First, the margin is maximized by minimizing the function

\begin{equation*} \varphi(\fvec{w}, \delta) = \frac{\left\| \fvec{w} \right\|_2^2}{2} + C \sum_{i=1}^{N} \delta_i \end{equation*}

with the weight vector \(\fvec{w}\) and the slack variables \(\delta_i \geq 0\). Here, we have to tune the regularization parameter \(C \in \mathbb{R}^+\). Second, the RBF kernel
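To make the role of \(C\) concrete, the objective can be evaluated directly for fixed values. The following is only a minimal sketch: the function name `soft_margin_objective` and the numbers for the weight vector and the slack variables are hypothetical and chosen for easy arithmetic, not taken from a trained model.

```python
def soft_margin_objective(w, slacks, C):
    """Evaluate phi(w, delta) = ||w||_2^2 / 2 + C * sum(delta_i)."""
    if any(d < 0 for d in slacks):
        raise ValueError("slack variables must be non-negative")
    squared_norm = sum(w_k * w_k for w_k in w)
    return squared_norm / 2 + C * sum(slacks)


# Hypothetical values, for illustration only.
w = [3.0, 4.0]            # squared norm: 9 + 16 = 25
slacks = [0.0, 0.5, 1.5]  # sum of slack variables: 2

small_C = soft_margin_objective(w, slacks, C=0.5)  # 12.5 + 0.5 * 2
large_C = soft_margin_objective(w, slacks, C=8.0)  # 12.5 + 8.0 * 2
print(small_C, large_C)
```

For the same \(\fvec{w}\) and \(\delta_i\), a larger \(C\) makes the slack term dominate the objective, so the optimizer prefers solutions with fewer margin violations. A smaller \(C\) tolerates more violations in exchange for a smaller \(\left\| \fvec{w} \right\|_2\), i.e. a wider margin.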

\begin{equation*} k(\fvec{x}_i, \fvec{x}_j) = e^{-\gamma \left\| \fvec{x}_i - \fvec{x}_j \right\|_2^2} \end{equation*}

which measures the similarity of two data points \(\fvec{x}_i\) and \(\fvec{x}_j\) based on their squared Euclidean distance, introduces the tunable scaling parameter \(\gamma \in \mathbb{R}^+\).
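The effect of \(\gamma\) can be seen by evaluating the kernel for a fixed pair of points. This is a minimal sketch in plain Python; the function name `rbf_kernel`, the two points and the chosen values of \(\log_2(\gamma)\) are assumptions made only for illustration.

```python
import math


def rbf_kernel(x_i, x_j, gamma):
    """k(x_i, x_j) = exp(-gamma * ||x_i - x_j||_2^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * squared_distance)


# Two hypothetical points with squared Euclidean distance 2.
x_i = [0.0, 0.0]
x_j = [1.0, 1.0]

# Vary gamma on a logarithmic (base 2) scale.
for log2_gamma in (-3, 0, 3):
    gamma = 2.0 ** log2_gamma
    print(log2_gamma, rbf_kernel(x_i, x_j, gamma))
```

The kernel value is 1 for identical points and decays towards 0 as the points move apart. A larger \(\gamma\) makes this decay faster, so each data point only influences its close neighbourhood.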

In the following animation, you can control both parameters and switch between a linear and an RBF kernel. It uses data points from the Iris flower dataset showing two features and two classes (selected to be non-separable). The idea is inspired by this sklearn example.

Figure 1: SVM with an RBF kernel applied to a non-separable problem. Data points from two classes are shown. The heatmap shows the unnormalized distance \(d(\fvec{\phi}(\fvec{x}), H) \cdot \left\| \fvec{w} \right\|_2\) to the hyperplane \(H\) in the projection space (defined by \(\fvec{\phi}(\fvec{x})\)). New points are assigned to a class based on the sign of this distance: points with a positive distance are assigned to the Virginica class and points with a negative distance to the Versicolor class. When the linear kernel is selected, the decision boundary is also shown together with two equidistant parallel lines (the additional lines correspond to distance values of +1.0 and -1.0). Both hyperparameters can be adjusted on a logarithmic scale (the resulting values are also shown at the top of the figure). The \(\gamma\) hyperparameter has no effect when the linear kernel is used.
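The setting described in the caption can be approximated with scikit-learn. The following is a sketch under stated assumptions and not necessarily what the attached notebook does: it assumes the first two Iris features (sepal length and sepal width), and the values of `log2_C` and `log2_gamma` are hypothetical settings.

```python
import numpy as np
from sklearn import datasets
from sklearn.svm import SVC

iris = datasets.load_iris()
mask = iris.target > 0          # keep Versicolor (1) and Virginica (2)
X = iris.data[mask][:, :2]      # assumption: use the first two features
y = iris.target[mask]

# Hypothetical settings on the logarithmic (base 2) scale.
log2_C, log2_gamma = 0, 0
clf = SVC(kernel="rbf", C=2.0 ** log2_C, gamma=2.0 ** log2_gamma)
clf.fit(X, y)

# Signed, unnormalized distance to the hyperplane for each point.
scores = clf.decision_function(X)

# Positive scores map to the second class (Virginica), negative ones
# to the first class (Versicolor).
predicted = np.where(scores > 0, clf.classes_[1], clf.classes_[0])
print("training accuracy:", (predicted == y).mean())
```

Passing `kernel="linear"` instead would give the linear case, in which `gamma` is ignored.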

List of attached files:

SVMParametersRBF.ipynb (Jupyter notebook used to create the visualization)
