Support Vector Machine (SVM)

Keywords: support vectors, maximum geometric margin, Lagrange multiplier method

I. How support vector machines work

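Support Vector Machine (SVM): the name means a classifier that operates through support vectors. The word "machine" here simply means a classifier. So what is a support vector? While solving the problem, it turns out that only a subset of the data is needed to determine the classifier; these data points are called support vectors. In the accompanying figure, which shows a two-dimensional case, the points R, S, G and the other points close to the black line in the middle can be regarded as support vectors: they determine the classifier, that is, the specific parameters of the black line.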
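The claim that the support vectors alone determine the classifier can be checked with a minimal sketch. The toy points below are hypothetical and chosen so that the result is easy to verify by hand; the check is that refitting on only the support vectors yields (approximately) the same line.

```python
import numpy as np
from sklearn.svm import SVC

# Toy data (hypothetical): two linearly separable groups in 2-D.
X = np.array([[0, 0], [-1, 0], [0, -1], [2, 2], [3, 2], [2, 3]], dtype=float)
y = np.array([-1, -1, -1, 1, 1, 1])

full = SVC(kernel='linear').fit(X, y)

# Refit using only the samples that the first model marked as support vectors.
sv_idx = full.support_
reduced = SVC(kernel='linear').fit(X[sv_idx], y[sv_idx])

print(full.support_vectors_)
print(full.coef_, full.intercept_)
print(reduced.coef_, reduced.intercept_)
```

Both models should report nearly identical coefficients, because the points far from the boundary do not influence the solution.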

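Problems it solves:

Linear classification

In the training data, each sample has n attributes and a binary class label, so we can treat the samples as points in an n-dimensional space. Our goal is to find an (n-1)-dimensional hyperplane that splits the data into two parts, with every sample in a part belonging to the same class. Many such hyperplanes usually exist, and we want the best one. We therefore add a constraint: the distance from the hyperplane to the nearest data point on each side must be as large as possible. This hyperplane is called the maximum-margin hyperplane, and the resulting classifier is called the maximum-margin classifier. A support vector machine is a binary classifier.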

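Nonlinear classification

One advantage of SVM is that it supports nonlinear classification. Combining the Lagrange multiplier method and the KKT conditions with a kernel function yields a nonlinear classifier.

The goal of SVM is to find the best separating hyperplane f(x) = xw + b = 0 for linear classification, that is, to solve for w and b.

First, use the closest points of the two classes to derive the constraints on f(x).

With the constraints in place, the problem can be solved with the Lagrange multiplier method and the KKT conditions; it then becomes a matter of finding the Lagrange multipliers αi and b.

Outliers are handled by introducing slack variables ξ.

For nonlinear classification: map the data to a higher-dimensional space and use a kernel function.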
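The link between "mapping to a higher dimension" and "using a kernel function" can be illustrated with a small sketch. It assumes the degree-2 homogeneous polynomial kernel K(u, v) = (u·v)^2 and its explicit 3-D feature map; the function names `phi` and `poly2_kernel` are made up for this example.

```python
import numpy as np

def phi(v):
    """Explicit feature map for the degree-2 homogeneous polynomial kernel in 2-D."""
    x1, x2 = v
    return np.array([x1 * x1, np.sqrt(2) * x1 * x2, x2 * x2])

def poly2_kernel(u, v):
    """Kernel value computed directly in the original 2-D space."""
    return np.dot(u, v) ** 2

u = np.array([1.0, 2.0])
v = np.array([3.0, -1.0])

lhs = np.dot(phi(u), phi(v))   # inner product in the 3-D mapped space
rhs = poly2_kernel(u, v)       # same quantity without ever building phi
print(lhs, rhs)
```

The two numbers agree, which is why a kernel lets the classifier work in the mapped space without computing the mapping explicitly.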

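Linear classification and its constraints

SVM's approach is to find the points closest to the hyperplane and use the constraints they impose to obtain the optimal solution.

Maximum geometric margin (geometrical margin)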
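As a rough sketch, the geometric margin of a hyperplane f(x) = xw + b can be computed as the smallest value of y_i (x_i w + b) / ||w|| over the samples, assuming labels in {-1, +1}. The helper name `geometric_margin` and the sample values are hypothetical.

```python
import numpy as np

def geometric_margin(w, b, X, y):
    """Smallest signed distance y_i * (x_i . w + b) / ||w|| over all samples."""
    w = np.asarray(w, dtype=float)
    return np.min(y * (X @ w + b)) / np.linalg.norm(w)

X = np.array([[0, 0], [-1, 0], [2, 2], [3, 2]], dtype=float)
y = np.array([-1, -1, 1, 1])
w, b = np.array([0.5, 0.5]), -1.0

margin = geometric_margin(w, b, X, y)
scaled = geometric_margin(10 * w, 10 * b, X, y)   # rescaling w and b changes nothing
print(margin, scaled)
```

Dividing by ||w|| is what makes the margin "geometric": rescaling w and b leaves it unchanged, so maximizing it is a well-posed problem.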

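Solving for w, b

We use the Lagrange multiplier method (http://blog.csdn.net/on2way/article/details/47729419) to solve for w and b. One important reason is that, once the problem is expressed with Lagrange multipliers, it can also handle data that is not linearly separable. The Lagrange multiplier method can solve the following problem:

After eliminating w, it becomes: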
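For reference, the standard hard-margin formulation can be sketched as follows. It assumes labels y_i ∈ {-1, +1} and the hyperplane f(x) = xw + b used above; the notation is an assumption and may differ from the source's own formulas.

```latex
% Primal problem: maximize the margin 2/||w||
\min_{w,\,b}\ \tfrac{1}{2}\lVert w\rVert^{2}
\quad\text{s.t.}\quad y_i\,(x_i w + b) \ge 1,\qquad i = 1,\dots,m

% Lagrangian with multipliers \alpha_i \ge 0
L(w, b, \alpha) = \tfrac{1}{2}\lVert w\rVert^{2}
  - \sum_{i=1}^{m} \alpha_i \bigl[\, y_i\,(x_i w + b) - 1 \,\bigr]

% Setting the derivatives with respect to w and b to zero
w = \sum_{i=1}^{m} \alpha_i y_i x_i, \qquad \sum_{i=1}^{m} \alpha_i y_i = 0

% Dual problem after eliminating w
\max_{\alpha}\ \sum_{i=1}^{m} \alpha_i
  - \tfrac{1}{2} \sum_{i=1}^{m}\sum_{j=1}^{m}
    \alpha_i \alpha_j\, y_i y_j\, \langle x_i, x_j \rangle
\quad\text{s.t.}\quad \alpha_i \ge 0,\quad \sum_{i=1}^{m} \alpha_i y_i = 0
```

The data enters the dual only through the inner products ⟨x_i, x_j⟩, which is the place where a kernel function can later be substituted.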

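So with the Lagrange multiplier method, solving for w and b turns into solving for the Lagrange multipliers αi and b. It gets more interesting later: w no longer needs to be computed at all, because αi can be used directly in the classifier, and αi also makes it possible to handle the nonlinear case.

II. Hands-on practice

1. Drawing the decision boundary

Import sklearn.svm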
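How αi is used directly in the classifier can be sketched with scikit-learn, whose fitted `SVC` exposes the products y_i·α_i as `dual_coef_`. The random data and the fixed seed below are assumptions made only for this illustration.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)   # fixed seed, chosen only for repeatability
X = np.r_[rng.randn(20, 2) - [2, 2], rng.randn(20, 2) + [2, 2]]
y = np.array([-1] * 20 + [1] * 20)

svc = SVC(kernel='linear').fit(X, y)

# dual_coef_ holds y_i * alpha_i for each support vector (shape: 1 x n_SV).
alpha_y = svc.dual_coef_[0]
sv = svc.support_vectors_

# f(x) = sum_i alpha_i * y_i * <x_i, x> + b, summed over the support vectors only.
manual = (X @ sv.T) @ alpha_y + svc.intercept_[0]

# For a linear kernel, w can still be recovered as sum_i alpha_i * y_i * x_i.
w_from_alpha = alpha_y @ sv

print(np.allclose(manual, svc.decision_function(X)))
print(w_from_alpha, svc.coef_[0])
```

With a nonlinear kernel the inner product `X @ sv.T` would be replaced by the kernel matrix, and w would no longer be available as an explicit vector.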

In [1]:

import numpy as np

from sklearn.svm import SVC

import matplotlib.pyplot as plt
%matplotlib inline

Generate random data for training, using np.r_[] to stack the arrays

In [12]:

a = [[1,1],[2,2],[3,3]]
b = [[-1,0],[-2,2],[-3,1]]

# np.r_ stacks a and b row-wise; np.c_ (below) would join them column-wise
np.r_[a,b]
# np.c_[a,b]

Out[12]:

array([[ 1,  1],
       [ 2,  2],
       [ 3,  3],
       [-1,  0],
       [-2,  2],
       [-3,  1]])

In [7]:

X_train = np.array([np.random.randn(20,2)-[2,2],np.random.randn(20,2)+[2,2]])
X_train = X_train.reshape((40,2))

In [83]:

X_train = np.r_[np.random.randn(20,2)-[2,2],np.random.randn(20,2)+[2,2]]

In [17]:

X_train.shape

Out[17]:

(40, 2)

In [77]:

plt.scatter(X_train[:,0],X_train[:,1])

Out[77]:

<matplotlib.collections.PathCollection at 0x7f0481c3cc88>

In [79]:

y_train = ['r']*20+['b']*20
y_train

Out[79]:

['r',
 'r',
 'r',
 'r',
 'r',
 'r',
 'r',
 'r',
 'r',
 'r',
 'r',
 'r',
 'r',
 'r',
 'r',
 'r',
 'r',
 'r',
 'r',
 'r',
 'b',
 'b',
 'b',
 'b',
 'b',
 'b',
 'b',
 'b',
 'b',
 'b',
 'b',
 'b',
 'b',
 'b',
 'b',
 'b',
 'b',
 'b',
 'b',
 'b']

In [84]:

plt.scatter(X_train[:,0],X_train[:,1],c = y_train)

Out[84]:

<matplotlib.collections.PathCollection at 0x7f0481b4dfd0>

Create the model and train it

In [87]:

svc = SVC(kernel='linear')

In [88]:

# X_train: the points of the two classes; y_train: their labels
svc.fit(X_train,y_train)

Out[88]:

SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape=None, degree=3, gamma='auto', kernel='linear',
  max_iter=-1, probability=False, random_state=None, shrinking=True,
  tol=0.001, verbose=False)

Extract the coefficients to get the slope

In [89]:

coef_ = svc.coef_
coef_

Out[89]:

array([[-0.50516578, -0.66315088]])

In [90]:

# w = (y1 - y2)/(x1 - x2)

# boundary: coef_[0,0]*x + coef_[0,1]*y + intercept = 0, so the slope is -coef_[0,0]/coef_[0,1]

w = -coef_[0,0]/ coef_[0,1]

In [91]:

x = np.linspace(-4,4,100)

plt.scatter(X_train[:,0],X_train[:,1],c = y_train)

plt.plot(x,w*x)

Out[91]:

[<matplotlib.lines.Line2D at 0x7f0481b67b70>]

In [35]:

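# solve for the intercept
# bias
# once the model is fitted, the slope is already determined
#
# intercept_ is the intercept fitted by the SVM, in the standard form coef·x + intercept = 0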
intercept_ = svc.intercept_

b = -intercept_[0]/coef_[0,1]
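The conversion from the fitted coefficients to a plottable line y = w*x + b can be wrapped in a small helper. This is a sketch with a hypothetical function name and hypothetical coefficients; it assumes the second coefficient is nonzero, i.e. the boundary is not vertical.

```python
import numpy as np

def boundary_line(coef, intercept):
    """Turn coef[0]*x + coef[1]*y + intercept = 0 into y = slope*x + offset.

    Assumes coef[1] != 0, i.e. the boundary is not a vertical line.
    """
    slope = -coef[0] / coef[1]
    offset = -intercept / coef[1]
    return slope, offset

# Hypothetical coefficients, chosen only to make the arithmetic easy to follow:
# x + 2y - 4 = 0 rearranges to y = -0.5x + 2.
slope, offset = boundary_line(np.array([1.0, 2.0]), -4.0)
print(slope, offset)
```

With a fitted model, the call would be `boundary_line(svc.coef_[0], svc.intercept_[0])`.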

In [38]:

# coefficient of the second feature (the y-axis direction)
coef_[0,1]

Out[38]:

-0.62420659626979602

In [37]:

coef_
# (y1 - y2)/(x1 - x2)

Out[37]:

array([[-1.25933227, -0.6242066 ]])

In [36]:

plt.scatter(X_train[:,0],X_train[:,1],c = y_train)

#y = w*x + b

plt.plot(x,w*x + b)

Out[36]:

[<matplotlib.lines.Line2D at 0x7f0484660668>]

In [39]:

# get the support vectors
vectors_ = svc.support_vectors_
vectors_

Out[39]:

array([[-0.26902411,  0.69545925],
       [-1.54394121,  0.0635278 ]])

In [41]:

plt.scatter(X_train[:,0],X_train[:,1],c = y_train)

#y = w*x + b

plt.plot(x,w*x + b)

plt.scatter(vectors_[:,0],vectors_[:,1],s = 300,alpha=0.3)

Out[41]:

<matplotlib.collections.PathCollection at 0x7f04845702e8>

Upper and lower margin boundaries
support_vectors_

In [42]:

# the upper and lower boundaries have the same slope as the decision line

#y = w*x + b
#b = y - w*x

# take one support vector per boundary: the first and last rows of support_vectors_
upper = vectors_[0]

down = vectors_[-1]

upper_intercept = upper[1] - w*upper[0]
down_intercept = down[1] - w*down[0]
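An alternative that does not depend on which support vector is picked: for a linear SVC the margin boundaries are the lines where the decision function equals +1 and -1. The sketch below uses hypothetical toy points so that the result can be checked by hand.

```python
import numpy as np
from sklearn.svm import SVC

# Toy data (hypothetical): the closest pair of opposite-class points is (0,0) and (2,2).
X = np.array([[0, 0], [-1, 0], [0, -1], [2, 2], [3, 2], [2, 3]], dtype=float)
y = np.array([-1, -1, -1, 1, 1, 1])
svc = SVC(kernel='linear').fit(X, y)

c0, c1 = svc.coef_[0]
b0 = svc.intercept_[0]

slope = -c0 / c1
center = -b0 / c1             # intercept of the decision line f(x) = 0
margin_pos = (1 - b0) / c1    # intercept of the line f(x) = +1
margin_neg = (-1 - b0) / c1   # intercept of the line f(x) = -1

print(slope, center, margin_pos, margin_neg)
```

The three lines `y = slope*x + center`, `y = slope*x + margin_pos` and `y = slope*x + margin_neg` can then be plotted the same way as above.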

Plot the figure

In [43]:

plt.scatter(X_train[:,0],X_train[:,1],c = y_train)

#y = w*x + b

plt.plot(x,w*x + b)

plt.scatter(vectors_[:,0],vectors_[:,1],s = 300,alpha=0.3)

# draw the upper boundary
plt.plot(x,w*x + upper_intercept)

# draw the lower boundary
plt.plot(x,w*x + down_intercept)

Out[43]:

[<matplotlib.lines.Line2D at 0x7f048454de48>]

2. Separating coordinate points with SVM

Imports

In [45]:

from sklearn.svm import SVC
import numpy as np

import matplotlib.pyplot as plt
%matplotlib inline

Create points in the range -3 to 3, plus a meshgrid

In [95]:

X_train = np.random.randn(200,2)

# plot the raw points; the target values are created below
plt.scatter(X_train[:,0],X_train[:,1])

Out[95]:

<matplotlib.collections.PathCollection at 0x7f04819ce128>

In [57]:

a = [True,False,False,True]
b = [True,True,False,False]
np.logical_and(a,b)

Out[57]:

array([ True, False, False, False], dtype=bool)

In [54]:

np.logical_or(a,b)

Out[54]:

array([ True,  True, False,  True], dtype=bool)

In [55]:

# logical NOT
np.logical_not(a)

Out[55]:

array([False,  True,  True, False], dtype=bool)

In [58]:

# logical XOR (exclusive or)
np.logical_xor(a,b)

Out[58]:

array([False,  True, False,  True], dtype=bool)

In [97]:

# label with a NumPy logical operator: True when exactly one coordinate is positive
y_train = np.logical_xor(X_train[:,0]>0,X_train[:,1]>0)

In [98]:

plt.scatter(X_train[:,0],X_train[:,1],c = y_train)

Out[98]:

<matplotlib.collections.PathCollection at 0x7f04819609e8>

Create the model (rbf kernel) and train it

In [99]:

# rbf: radial basis function, a kernel based on the distance between points
# similar in spirit to KNN
svc = SVC(kernel='rbf')

svc.fit(X_train,y_train)

Out[99]:

SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape=None, degree=3, gamma='auto', kernel='rbf',
  max_iter=-1, probability=False, random_state=None, shrinking=True,
  tol=0.001, verbose=False)

In [ ]:

# after training, the SVC can tell that points in quadrants I and III form one class and points in quadrants II and IV the other
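Why the rbf kernel is needed here can be seen by comparing it with a linear kernel on the same kind of XOR-labelled data. This is a minimal sketch; the fixed seed and the use of training accuracy as the measure are assumptions made for the illustration.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)   # fixed seed (an assumption, for repeatability)
X = rng.randn(200, 2)
y = np.logical_xor(X[:, 0] > 0, X[:, 1] > 0)

# score() returns the accuracy on the data passed in (here, the training data).
linear_acc = SVC(kernel='linear').fit(X, y).score(X, y)
rbf_acc = SVC(kernel='rbf').fit(X, y).score(X, y)
print(linear_acc, rbf_acc)
```

No single straight line separates the four quadrants into the two XOR classes, so the linear kernel is expected to score noticeably lower than the rbf kernel.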

In [66]:

# create the data to predict on
xx,yy = np.meshgrid(np.linspace(-3,3,500),np.linspace(-3,3,500))

# all grid points in the xy plane
xy = np.c_[xx.ravel(),yy.ravel()]

In [68]:

xy.shape

Out[68]:

(250000, 2)

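Plot the figure
Plot the distance from the test points to the separating hyperplane (decision_function)
Draw the contour lines
Plot the training points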

In [62]:

X_train.shape

Out[62]:

(200, 2)

In [63]:

xy.shape

Out[63]:

(250000, 2)

In [100]:

# signed, distance-like score of each test point relative to the separating hyperplane
y_ = svc.decision_function(xy)
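What `decision_function` returns can be checked against `predict`. In this sketch (seeded random data and a coarser 50x50 grid are assumptions) the sign of the score decides the predicted class for a binary SVC.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)
X = rng.randn(200, 2)
y = np.logical_xor(X[:, 0] > 0, X[:, 1] > 0)
svc = SVC(kernel='rbf').fit(X, y)

xx, yy = np.meshgrid(np.linspace(-3, 3, 50), np.linspace(-3, 3, 50))
grid = np.c_[xx.ravel(), yy.ravel()]

scores = svc.decision_function(grid)   # one signed score per grid point
labels = svc.predict(grid)

# Positive score -> second class in svc.classes_ (here True), negative -> first.
agree = np.array_equal(labels, svc.classes_[(scores > 0).astype(int)])
print(scores.shape, agree)
```

Note that the score is measured in the kernel's feature space and is not divided by the norm of w, so it is proportional to a distance rather than a Euclidean distance in the plane.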

In [107]:

plt.figure(figsize=(6,6))
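# draw the scores of the test points as an image
# origin='lower' puts row 0 (y = -3) at the bottom so the image matches the meshgrid
plt.imshow(y_.reshape(xx.shape),extent=[-3,3,-3,3],cmap=plt.cm.PuOr_r,origin='lower')

# draw filled contours and contour lines; test points on the same contour have the same score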
plt.contourf(xx,yy,y_.reshape(xx.shape))

plt.contour(xx,yy,y_.reshape(xx.shape))

plt.scatter(X_train[:,0],X_train[:,1],c = y_train)

plt.axis([-3,3,-3,3])

Out[107]:

[-3, 3, -3, 3]

3. Classifying the iris dataset with several kernels

Imports

In [128]:

import sklearn.datasets as datasets

iris = datasets.load_iris()

X_train = iris.data
y_train = iris.target

Extract only two features so the data is easy to plot
Create the SVM models: 'linear', 'poly' (polynomial), 'rbf' (Radial Basis Function)

In [129]:

X_train = X_train[:,[2,3]]

In [110]:

X_train.shape

Out[110]:

(150, 2)

In [131]:

plt.scatter(X_train[:,0],X_train[:,1],c = y_train)

Out[131]:

<matplotlib.collections.PathCollection at 0x7f0480de47b8>

In [150]:

linear_svc = SVC(kernel='linear')
poly_svc = SVC(kernel='poly')
rbf_svc = SVC(kernel='rbf')
# The implementation is based on libsvm
sigmoid_svc = SVC(kernel='sigmoid')
'''It must be one of 'linear', 'poly', 'rbf', 'sigmoid', 'precomputed' or
     a callable'''

# precomputed_svc = SVC(kernel='precomputed')

# Similar to SVC with parameter kernel='linear', 
# but implemented in terms of liblinear rather than libsvm
from sklearn.svm import LinearSVC
lsvc = LinearSVC()

Train the models

In [152]:

estimators = {'linear_svc':linear_svc,'poly_svc':poly_svc,
              'rbf_svc':rbf_svc,'sigmoid_svc':sigmoid_svc,
             'lsvc':lsvc}

In [137]:

for key,estimator in estimators.items():
    estimator.fit(X_train,y_train)

Background points for the figure

In [138]:

# build the background points that the models will predict on

xmin,xmax = X_train[:,0].min(),X_train[:,0].max()

ymin,ymax = X_train[:,1].min(),X_train[:,1].max()

xx,yy = np.meshgrid(np.linspace(xmin,xmax,700),np.linspace(ymin,ymax,300))

# pair every x coordinate with every y coordinate
xy = np.c_[xx.ravel(),yy.ravel()]

In [140]:

xx.shape

Out[140]:

(300, 700)

In [139]:

xy.shape

Out[139]:

(210000, 2)

Predict and plot, drawing one subplot per estimator in a for loop

In [154]:

for i,key in enumerate(estimators):
    print(i,key)

0 sigmoid_svc
1 lsvc
2 poly_svc
3 linear_svc
4 rbf_svc

In [153]:

plt.figure(figsize=(12,9))

for i,key in enumerate(estimators):
    estimator = estimators[key]
    
    # train: a model must be fitted before it can predict
    estimator.fit(X_train,y_train)
    
    # predict the class of every background point
    y_ = estimator.predict(xy)
    
    axes = plt.subplot(2,3,i+1)
    
    axes.pcolormesh(xx,yy,y_.reshape((300,700)),cmap = 'cool')
    
    axes.scatter(X_train[:,0],X_train[:,1],c = y_train,cmap = 'rainbow')
    
    axes.set_title(key)
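Alongside the plots, the kernels can be compared numerically. The sketch below reports accuracy on the training data itself, which is an assumption made for simplicity; it is not a substitute for evaluation on held-out data.

```python
from sklearn import datasets
from sklearn.svm import SVC

iris = datasets.load_iris()
X = iris.data[:, [2, 3]]   # petal length and petal width, the same two columns as above
y = iris.target

scores = {}
for kernel in ['linear', 'poly', 'rbf', 'sigmoid']:
    model = SVC(kernel=kernel).fit(X, y)
    scores[kernel] = model.score(X, y)   # accuracy on the training data itself

for kernel, acc in scores.items():
    print(kernel, round(acc, 3))
```

A kernel whose coloured regions match the scatter points poorly in the figure should also show a lower number here.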

4. Regression with several SVM kernels

Imports

In [155]:

from sklearn.svm import SVR

Generate sample points with rand and compute their sine values

In [159]:

X_train = 8*np.random.rand(80,1)

In [160]:

y_train = np.sin(X_train)

In [162]:

y_train.shape

Out[162]:

(80, 1)

In [161]:

plt.scatter(X_train,y_train)

Out[161]:

<matplotlib.collections.PathCollection at 0x7f04819810b8>

Add noise to the data

In [163]:

y_train[::5] += np.random.randn(16,1)*0.3

In [164]:

plt.scatter(X_train,y_train)

Out[164]:

<matplotlib.collections.PathCollection at 0x7f0481a88fd0>

Build the models, train them, and predict; predicting over the range of the training data is sufficient

In [168]:

esitimators = {'linear_svr':SVR(kernel='linear'),
              'poly_svr':SVR(kernel='poly'),
              'rbf_svr':SVR(kernel='rbf'),
              'sigmoid_svr':SVR(kernel='sigmoid')}

Plot the results and compare the four SVM kernels

In [171]:

# the test inputs are the same for every kernel, so build them once
x_test = np.linspace(0,8,200).reshape((200,1))

for i,key in enumerate(esitimators):
    
    estimator = esitimators[key]
    
    # train; ravel() turns the (80,1) column vector into the 1-D target SVR expects
    estimator.fit(X_train,y_train.ravel())
    
    # predict
    y_ = estimator.predict(x_test)
    
    plt.plot(x_test,y_,label = key)

# draw the training points and the legend once, after all curves are plotted
plt.scatter(X_train,y_train)
plt.axis([0,8,-2,2])
plt.legend()
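The visual comparison can be backed by a number. The sketch below regenerates similar data with a fixed seed (an assumption) and measures each kernel's mean squared error against the noise-free sine curve.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.RandomState(0)      # fixed seed, chosen only for repeatability
X = 8 * rng.rand(80, 1)
y = np.sin(X).ravel()               # 1-D target, as SVR expects
y[::5] += rng.randn(16) * 0.3       # add noise to every fifth sample

x_test = np.linspace(0, 8, 200).reshape(-1, 1)
y_true = np.sin(x_test).ravel()

errors = {}
for kernel in ['linear', 'poly', 'rbf', 'sigmoid']:
    y_pred = SVR(kernel=kernel).fit(X, y).predict(x_test)
    errors[kernel] = np.mean((y_pred - y_true) ** 2)   # mean squared error

print(errors)
```

A straight line cannot follow a sine wave, so the linear kernel is expected to show a clearly larger error than the rbf kernel.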

III. Exercises

1. Classify the following three datasets with different kernels and draw the decision boundaries

ex6data1.mat
ex6data2.mat
ex6data3.mat

2. Analyze cars.txt with SVC

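This is a car evaluation dataset. The class variable is the evaluation of the car: (unacc, ACC, good, vgood) stand for (unacceptable, acceptable, good, very good). The six attribute variables are buying price, maintenance cost, number of doors, passenger capacity, luggage boot size, and safety. Note that all six attributes are ordered categorical variables; for example, passenger capacity takes the values "2, 4, more" and safety takes the values "low, med, high".

price, maint, doors, persons, lug_boot, safty, recommend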
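Because the attributes are ordered categories, one possible starting point is to map each category to an integer that preserves its order before passing the data to SVC. The sketch below covers only the two attributes whose values are quoted above; the integer codes and the sample rows are hypothetical and are not taken from cars.txt.

```python
import numpy as np

# Ordered category values quoted above; the integer codes are an assumption.
persons_order = {'2': 0, '4': 1, 'more': 2}
safety_order = {'low': 0, 'med': 1, 'high': 2}

# Hypothetical rows (persons, safety), not taken from cars.txt.
rows = [('2', 'low'), ('4', 'high'), ('more', 'med')]

# Replace each category with its rank so that larger numbers mean "more".
encoded = np.array([[persons_order[p], safety_order[s]] for p, s in rows])
print(encoded)
```

The remaining attributes would need their own mappings, built from the values that actually appear in the file.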
