Support Vector Machines

Support Vector Machines model data points as vectors in a multi-dimensional space. During model training, 'support vectors' that separate clusters of data are calculated; they are then used to predict the cluster into which new input data falls. In the illustration below, the support vector data points are marked Sc and the support vector lines are marked Vs.
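
For a linear SVM, prediction amounts to checking which side of the separating hyperplane a new point lies on. In the standard formulation (stated here for reference, with w the hyperplane's normal vector and b its offset, both learned during training), the predicted class is:

  f(x) = sign(w · x + b)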

Kernel Methods

Kernel methods:

  • are a class of algorithms for pattern analysis, such as in Support Vector Machines

  • are used to find relations within datasets

  • use vector inner products (also known as dot products) as similarity measures between data points

  • include the so-called “kernel trick,” which enables the use of higher-dimensional spaces without explicitly calculating the coordinates of points within those dimensions (see the sketch after this list)
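
As a concrete illustration of the kernel trick, the short sketch below (an illustrative example, not part of the original text) shows that a squared polynomial kernel K(x, y) = (x · y)² computed in the original two-dimensional space equals the inner product of the points after they are explicitly mapped into a three-dimensional feature space:

"""
kernel_trick_sketch.py
illustrates the kernel trick with a squared polynomial kernel
"""

# Import needed libraries.
import numpy as np

def explicit_feature_map(point):
    # Map a 2-D point into the 3-D feature space implied by the kernel
    # K(x, y) = (x . y)^2, namely (x1^2, sqrt(2)*x1*x2, x2^2).
    x1, x2 = point
    return np.array([x1 * x1, np.sqrt(2) * x1 * x2, x2 * x2])

x = np.array([1.0, 2.0])
y = np.array([3.0, 4.0])

# Inner product computed explicitly in the 3-D feature space.
explicit_inner_product = explicit_feature_map(x) @ explicit_feature_map(y)

# The kernel trick: the same value from a dot product in the original
# 2-D space, without ever computing 3-D coordinates.
kernel_value = np.dot(x, y) ** 2

# Both values are 121 (up to floating-point rounding).
print(explicit_inner_product, kernel_value)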

Training Process

Training an SVM is the process of finding the hyperplane with the optimal margin for a given configuration of class data points.
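
For linearly separable data, this optimal-margin search is commonly stated as a constrained optimization problem (a standard textbook formulation, included for reference rather than derived from the steps below), where each training point xᵢ has class label yᵢ ∈ {−1, +1}:

  minimize   (1/2) · ‖w‖²
  subject to yᵢ · (w · xᵢ + b) ≥ 1 for every training point

Minimizing ‖w‖² maximizes the margin width 2 / ‖w‖.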

Determining Initial Support Vector Points for a Hyperplane

To avoid a brute-force approach of testing all possible hyperplanes and associated data points, the approach below uses a few directly computable measurements to greatly reduce the testing needed to find the optimal hyperplane (a code sketch follows the list).

  1. Calculate three centroids: one for each of the two classes a, b and a double class centroid C for all of the data points combined. Each centroid is the point that minimizes the total squared distance of its data points from it.

  2. Create a best-fit line L through the three centroids.

  3. Create a hyperplane line H perpendicular to line L at the double class centroid.

  4. Slide the hyperplane line H along line L until it lies between at least one point from each of the two classes a, b.

  5. Find the pair of data points P1, P2 from the two classes a, b that is closest to line H, using Euclidean distance.
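
The sketch below (an illustrative implementation, not part of the original text; point values and helper names are assumptions) shows the measurements behind steps 1, 2, 3, and 5, with the sliding in step 4 omitted. The two classes are equally sized, so the double class centroid lies on the line through the two class centroids:

"""
initial_support_vector_points.py
sketches the measurements used to find initial support vector points
"""

# Import needed libraries.
import numpy as np

def centroid(points):
    # The mean of the points minimizes the total squared distance
    # of the points from a single location (step 1).
    return points.mean(axis=0)

def distance_to_line(point, line_point, line_direction):
    # Euclidean distance from a point to the line through line_point
    # with the given direction (used in step 5).
    unit = line_direction / np.linalg.norm(line_direction)
    offset = point - line_point
    return abs(unit[0] * offset[1] - unit[1] * offset[0])

class_a = np.array([[1.0, 2.0], [2.0, 3.0], [1.5, 1.0]])
class_b = np.array([[6.0, 7.0], [7.0, 8.0], [5.5, 6.0]])

# Step 1: class centroids and the double class centroid.
centroid_a = centroid(class_a)
centroid_b = centroid(class_b)
double_class_centroid = centroid(np.vstack([class_a, class_b]))

# Steps 2 and 3: line L through the centroids, and hyperplane line H
# perpendicular to L at the double class centroid.
line_l_direction = centroid_b - centroid_a
line_h_direction = np.array([-line_l_direction[1], line_l_direction[0]])

# Step 5: the closest point in each class to line H.
closest_a = min(class_a, key=lambda p: distance_to_line(
    p, double_class_centroid, line_h_direction))
closest_b = min(class_b, key=lambda p: distance_to_line(
    p, double_class_centroid, line_h_direction))
print(closest_a, closest_b)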

Determining the Optimal Hyperplane

The following example applies the process to two linearly separable classes of data points, as illustrated in the graphic below (a code sketch of the key measurements follows the list):

  1. Find the Euclidean distance d between the pair of data points P1, P2.

  2. Find the mid-point m between the data points P1, P2.

  3. Create a hyperplane center line through the mid-point m toward the double class centroid C.

  4. Create hyperplane margin lines through P1, P2 parallel to the hyperplane center line in the direction of the double class centroid C.

  5. Rotate the hyperplane center and parallel margin lines clockwise or counter-clockwise around the mid-point m until either:

    1. one of the margin lines rests on another data point, as shown in the illustration at P3, or

    2. the hyperplane center line is perpendicular to the distance line d, which results in the best possible margins.
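
The sketch below (an illustrative example with assumed point values, not part of the original text) computes the quantities in steps 1 and 2 and the perpendicular orientation from step 5.2:

"""
optimal_hyperplane_measurements.py
sketches the distance, mid-point, and perpendicular direction
"""

# Import needed libraries.
import numpy as np

# Assumed closest pair of points P1, P2 from the two classes.
p1 = np.array([2.0, 3.0])
p2 = np.array([6.0, 5.0])

# Step 1: Euclidean distance d between the pair.
d = np.linalg.norm(p2 - p1)

# Step 2: mid-point m between the pair.
m = (p1 + p2) / 2.0

# Step 5.2: a hyperplane center line through m that is perpendicular
# to the distance line has a direction given by rotating (P2 - P1)
# by 90 degrees.
distance_line_direction = p2 - p1
hyperplane_direction = np.array([-distance_line_direction[1],
                                 distance_line_direction[0]])

print(d, m, hyperplane_direction)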

Python Example

"""
support_vector_machine.py
creates and tests a support vector machine
"""

# Import needed libraries.
import matplotlib.pyplot as plotlib
from sklearn import svm
from sklearn.datasets import make_blobs

# Define parameters.
number_of_data_points = 60
number_of_classes = 2
x_coordinate_index = 0
y_coordinate_index = 1
random_state = 6
kernel_type = 'linear'
regularization_parameter = 1000
data_points_marker_size = number_of_data_points / 2
support_vectors_marker_size = 100
scatter_plot_color_map = plotlib.cm.Paired
support_vector_data_point_outline_color = 'k'
support_vector_data_point_outline_width = 1
support_vector_data_point_outline_fill = 'none'
prediction_input_test = [[8, -10]]

# Create data points and associated classes.
data_points, classes = make_blobs(
    n_samples=number_of_data_points,
    centers=number_of_classes,
    random_state=random_state)

# Instantiate a support vector machine model.
model = svm.SVC(
    kernel=kernel_type,
    C=regularization_parameter)

# Train the model.
model.fit(data_points, classes)

# Predict a class.
result = model.predict(prediction_input_test)
print('prediction for data point ' + str(prediction_input_test) + ' :')
print(result)

# Create a scatter plot of data points.
plotlib.scatter(
    data_points[:, x_coordinate_index],
    data_points[:, y_coordinate_index],
    c=classes,
    s=data_points_marker_size,
    cmap=scatter_plot_color_map)

# Plot the support vectors.
axes = plotlib.gca()
axes.scatter(
    model.support_vectors_[:, x_coordinate_index],
    model.support_vectors_[:, y_coordinate_index],
    s=support_vectors_marker_size,
    linewidth=support_vector_data_point_outline_width,
    facecolors=support_vector_data_point_outline_fill,
    edgecolors=support_vector_data_point_outline_color)
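
# Optional extension (an assumption, not part of the original
# listing): with a linear kernel the separating line satisfies
# w[0]*x + w[1]*y + b = 0, so it can be drawn by solving for y
# from the fitted coefficients.
import numpy as np
hyperplane_weights = model.coef_[0]
hyperplane_offset = model.intercept_[0]
x_values = np.linspace(
    data_points[:, x_coordinate_index].min(),
    data_points[:, x_coordinate_index].max(),
    100)
y_values = -(hyperplane_weights[0] * x_values
             + hyperplane_offset) / hyperplane_weights[1]
axes.plot(x_values, y_values, 'k-')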

# Display the plot.
plotlib.show()

The output is below:

prediction for data point [[8, -10]] :
[1]