ESF project of the University of West Bohemia in Pilsen, reg. no. CZ.02.2.69/0.0/0.0/16 015/0002287

# Image Classification - classical approaches

Basic types of classifiers:

• K-means
• k-Nearest Neighbour
• Bayesian classifier
• Support Vector Machine

Types of learning:

• Supervised learning
• Unsupervised learning
• Reinforcement learning
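
The difference between the first two types of learning can be sketched on a toy example. The six 2-D points and their labels below are made up for illustration and are not part of the course data: the k-NN classifier is fitted with labels (supervised), while k-means only receives the points and assigns its own cluster ids (unsupervised).

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

# toy data (assumption, illustration only): two well-separated groups of points
X = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.3],
              [5.0, 5.1], [5.2, 4.9], [4.9, 5.3]])
y = np.array([0, 0, 0, 1, 1, 1])  # labels, used only by the supervised model

# supervised: the classifier is fitted on samples together with their labels
knn_toy = KNeighborsClassifier(n_neighbors=3).fit(X, y)
supervised_pred = knn_toy.predict([[0.1, 0.1], [5.1, 5.0]])

# unsupervised: k-means sees only the samples and assigns arbitrary cluster ids
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
cluster_ids = kmeans.labels_

print(supervised_pred)
print(cluster_ids)
```

The cluster ids returned by k-means carry no class meaning: the two groups are separated, but which group gets id 0 is arbitrary.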
In [1]:
# scikit-learn
%pylab inline --no-import-all
from sklearn import datasets
import numpy as np
import sklearn.model_selection

Populating the interactive namespace from numpy and matplotlib


### Iris dataset

• Loading the training data. The data describe the iris flower and three of its species: Iris setosa, Iris versicolor and Iris virginica. Four features are measured: sepal length, sepal width, petal length and petal width.
In [2]:
iris = datasets.load_iris()
# data dimensions and the last ten samples
print("data ", iris.data.shape)
print(iris.data[-10:,:])

print("")
# target classes: shape, distinct labels and the last ten labels
print("target", iris.target.shape)
print(np.unique(iris.target))
print(iris.target[-10:])

data  (150, 4)
[[6.7 3.1 5.6 2.4]
[6.9 3.1 5.1 2.3]
[5.8 2.7 5.1 1.9]
[6.8 3.2 5.9 2.3]
[6.7 3.3 5.7 2.5]
[6.7 3.  5.2 2.3]
[6.3 2.5 5.  1.9]
[6.5 3.  5.2 2. ]
[6.2 3.4 5.4 2.3]
[5.9 3.  5.1 1.8]]

target (150,)
[0 1 2]
[2 2 2 2 2 2 2 2 2 2]


### k-Nearest Neighbour classifier

In [3]:
from sklearn import neighbors
# k-NN with the default settings (n_neighbors=5)
knn = neighbors.KNeighborsClassifier()
knn.fit(iris.data, iris.target)
# classify one new sample given as four feature values
predikce = knn.predict([[0.1, 0.2, 0.3, 0.4]])
print(predikce)

[0]

In [4]:
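# shuffle the samples; the dataset is ordered by class
perm = np.random.permutation(iris.target.size)
iris.data = iris.data[perm]
iris.target = iris.target[perm]

# first 100 shuffled samples for training
train_data = iris.data[:100]
train_target = iris.target[:100]

# remaining 50 samples for testing
test_data = iris.data[100:]
test_target = iris.target[100:]

knn.fit(train_data, train_target)

# mean accuracy on the held-out test samples
knn.score(test_data, test_target)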

Out[4]:
0.94
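
The score above depends on the random permutation, so it changes from run to run. As a hedged alternative, the same 100/50 split can be produced with `train_test_split` from `sklearn.model_selection` (imported in the first cell). The fixed `random_state=0` and the `stratify` option are assumptions added here for reproducibility and class balance; the notebook itself does not use them.

```python
from sklearn import datasets, neighbors
from sklearn.model_selection import train_test_split

X, y = datasets.load_iris(return_X_y=True)

# 100 training and 50 test samples; stratify keeps the class proportions,
# random_state makes the split repeatable (both are assumptions of this sketch)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=50, stratify=y, random_state=0)

knn_split = neighbors.KNeighborsClassifier()
knn_split.fit(X_train, y_train)
acc = knn_split.score(X_test, y_test)
print(X_train.shape, X_test.shape, acc)
```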

### Bayesian classifier

In [5]:
import sklearn.naive_bayes
gnb = sklearn.naive_bayes.GaussianNB()
gnb.fit(train_data, train_target)
y_pred = gnb.predict(test_data)
print("Number of mislabeled points : %d" % (test_target != y_pred).sum())

Number of mislabeled points : 5
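
A raw error count is easier to read next to the test-set size: 5 errors on 50 test samples correspond to an accuracy of 0.90. A minimal sketch of how the count, the accuracy and a per-class confusion matrix relate is given below. It builds its own seeded split (an assumption, so the numbers need not match the run above) and uses `accuracy_score` and `confusion_matrix` from `sklearn.metrics`.

```python
from sklearn import datasets
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = datasets.load_iris(return_X_y=True)
# seeded 100/50 split (assumption of this sketch, not the notebook's permutation)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=50, stratify=y, random_state=0)

y_hat = GaussianNB().fit(X_train, y_train).predict(X_test)

mislabeled = int((y_test != y_hat).sum())
accuracy = accuracy_score(y_test, y_hat)   # equals 1 - mislabeled / 50
cm = confusion_matrix(y_test, y_hat)       # rows: true class, columns: predicted class
print(mislabeled, accuracy)
print(cm)
```

Off-diagonal entries of the matrix show which pairs of species are confused with each other.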


### SVM classifier

In [6]:
from sklearn import svm
svc = svm.SVC()
svc.fit(train_data, train_target)
y_pred = svc.predict(test_data)
print("Number of mislabeled points : %d" % (test_target != y_pred).sum())

Number of mislabeled points : 3
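
The three results above come from a single random split of 150 samples, so small differences between the classifiers may be due to chance. One way to get a steadier comparison, sketched here under the assumption that default hyperparameters and 5-fold cross-validation are acceptable, is `cross_val_score`:

```python
from sklearn import datasets, naive_bayes, neighbors, svm
from sklearn.model_selection import cross_val_score

X, y = datasets.load_iris(return_X_y=True)

# the three classifiers used above, all with default settings
models = {
    "kNN": neighbors.KNeighborsClassifier(),
    "GaussianNB": naive_bayes.GaussianNB(),
    "SVC": svm.SVC(),
}

mean_accuracy = {}
for name, model in models.items():
    # each sample is used for testing exactly once across the 5 folds
    scores = cross_val_score(model, X, y, cv=5)
    mean_accuracy[name] = scores.mean()
    print("%-10s %.3f" % (name, mean_accuracy[name]))
```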


### Testing data

In [8]:
import scipy
import urllib
import skimage
import skimage.color
import skimage.measure
import skimage.io
from sklearn import svm

# URL = "http://uc452cam01-kky.fav.zcu.cz/snapshot.jpg"
URL = "https://raw.githubusercontent.com/mjirik/ZDO/master/objekty/ctverce_hvezdy_kolecka.jpg"

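
The cell above only prepares the imports and the image URL; the feature-extraction step is not part of this excerpt. The sketch below shows one possible way to turn objects in a binary image into feature vectors for a classifier. Everything in it is an assumption: the images are synthetic squares and disks generated with NumPy instead of the image at `URL`, the helper names `make_image` and `shape_features` are hypothetical, and the features (`extent`, `solidity` from `skimage.measure.regionprops`) are just one plausible choice.

```python
import numpy as np
import skimage.measure
from sklearn import svm

def make_image(shape_name, size):
    """Synthetic 64x64 binary image with one centred square or disk."""
    if shape_name == "square":
        img = np.zeros((64, 64), dtype=bool)
        img[32 - size:32 + size, 32 - size:32 + size] = True
    else:
        yy, xx = np.ogrid[:64, :64]
        img = (yy - 32) ** 2 + (xx - 32) ** 2 <= size ** 2
    return img

def shape_features(img):
    """Return [extent, solidity] of the first labelled object in a binary image."""
    labels = skimage.measure.label(img)
    region = skimage.measure.regionprops(labels)[0]
    # extent = area / bounding-box area, solidity = area / convex-hull area
    return [region.extent, region.solidity]

# training set: five squares and five disks of different sizes
sizes = [8, 10, 12, 14, 16]
train_X = [shape_features(make_image(name, s))
           for name in ("square", "disk") for s in sizes]
train_y = [name for name in ("square", "disk") for _ in sizes]

clf = svm.SVC()
clf.fit(train_X, train_y)

# objects of sizes not seen during training
test_X = [shape_features(make_image("square", 9)),
          shape_features(make_image("disk", 13))]
pred = clf.predict(test_X)
print(pred)
```

Both features are independent of object size, which is why objects of unseen sizes can still be classified. A filled axis-aligned square fills its bounding box completely (extent 1.0), while a disk covers roughly π/4 of it.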