fashion-mnist

fashion-mnist#

Notebook by:

Royi Avital RoyiAvital@fixelalgorithms.com

Revision History#

Version	Date	User	Content / Changes
1.0.000	17/03/2024	Royi Avital	First version

# Import Packages

# General Tools
import numpy as np
import scipy as sp
import pandas as pd

# Machine Learning
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Image Processing

# Machine Learning

# Miscellaneous
import os
from platform import python_version
import random

# Typing
from typing import Callable, Dict, List, Optional, Set, Tuple, Union

# Visualization
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns

# Jupyter
from IPython import get_ipython

Notations#

(?) Question to answer interactively.
(!) Simple task to add code for the notebook.
(@) Optional / Extra self practice.
(#) Note / Useful resource / Food for thought.

Code Notations:

someVar    = 2; #<! Notation for a variable
vVector    = np.random.rand(4) #<! Notation for 1D array
mMatrix    = np.random.rand(4, 3) #<! Notation for 2D array
tTensor    = np.random.rand(4, 3, 2, 3) #<! Notation for nD array (Tensor)
tuTuple    = (1, 2, 3) #<! Notation for a tuple
lList      = [1, 2, 3] #<! Notation for a list
dDict      = {1: 3, 2: 2, 3: 1} #<! Notation for a dictionary
oObj       = MyClass() #<! Notation for an object
dfData     = pd.DataFrame() #<! Notation for a data frame
dsData     = pd.Series() #<! Notation for a series
hObj       = plt.Axes() #<! Notation for an object / handler / function handler

Code Exercise#

Single line fill

vallToFill = ???

Multi Line to Fill (At least one)

# You need to start writing
????

Section to Fill

#===========================Fill This===========================#
# 1. Explanation about what to do.
# !! Remarks to follow / take under consideration.
mX = ???

???
#===============================================================#

# Configuration
# %matplotlib inline

seedNum = 512
np.random.seed(seedNum)
random.seed(seedNum)

# Matplotlib default color palette
lMatPltLibclr = ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd', '#8c564b', '#e377c2', '#7f7f7f', '#bcbd22', '#17becf']
# sns.set_theme() #>! Apply SeaBorn theme

runInGoogleColab = 'google.colab' in str(get_ipython())

# Constants

FIG_SIZE_DEF    = (8, 8)
ELM_SIZE_DEF    = 50
CLASS_COLOR     = ('b', 'r')
EDGE_COLOR      = 'k'
MARKER_SIZE_DEF = 10
LINE_WIDTH_DEF  = 2

# Fashion MNIST
TRAIN_DATA_SET_IMG_URL = r'https://github.com/zalandoresearch/fashion-mnist/raw/master/data/fashion/train-images-idx3-ubyte.gz'
TRAIN_DATA_SET_LBL_URL = r'https://github.com/zalandoresearch/fashion-mnist/raw/master/data/fashion/train-labels-idx1-ubyte.gz'
TEST_DATA_SET_IMG_URL  = r'https://github.com/zalandoresearch/fashion-mnist/raw/master/data/fashion/t10k-images-idx3-ubyte.gz'
TEST_DATA_SET_LBL_URL  = r'https://github.com/zalandoresearch/fashion-mnist/raw/master/data/fashion/t10k-labels-idx1-ubyte.gz'

TRAIN_DATA_IMG_FILE_NAME = 'TrainImgFile'
TRAIN_DATA_LBL_FILE_NAME = 'TrainLblFile'
TEST_DATA_IMG_FILE_NAME  = 'TestImgFile'
TEST_DATA_LBL_FILE_NAME  = 'TestLblFile'

TRAIN_DATA_SET_FILE_NAME = 'FashionMnistTrainDataSet.npz'
TEST_DATA_SET_FILE_NAME  = 'FashionMnistTestDataSet.npz'

TRAIN_DATA_NUM_IMG  = 60_000
TEST_DATA_NUM_IMG   = 10_000

D_CLASSES = {0: 'T-Shirt', 1: 'Trouser', 2: 'Pullover', 3: 'Dress', 4: 'Coat', 5: 'Sandal', 6: 'Shirt', 7: 'Sneaker', 8: 'Bag', 9: 'Boots'}
L_CLASSES = ['T-Shirt', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Boots']

# Courses Packages
import sys,os
sys.path.append('../')
sys.path.append('../../')
sys.path.append('../../../')
from utils.DataVisualization import PlotConfusionMatrix, PlotLabelsHistogram, PlotMnistImages
from utils.DataManipulation import DownloadDecompressGzip, ConvertMnistDataDf

# General Auxiliary Functions

def IsStrFloat(inStr: any) -> bool:
    #Support None input
    if inStr is None: 
        return False
    try:
        float(inStr)
        return True
    except ValueError:
        return False

Model Parameter Optimization with Kernel SVM#

In this exercise we’ll apply the Cross Validation automatically to find the optimal hyper parameters for the Kernel SVM Model.
In order to achieve this we’ll do a Grid Search for Hyper Parameters Optimization.

Load the Fashion MNIST Data Set manually (Done by the notebook).
Train a baseline Linear SVM model.
Find the optimal Kernel SVM model using Grid Search.
Extract the optimal model.
Plot the Confusion Matrix of the best model on the training data.

(#) You may and should use the functions in the Auxiliary Functions section.

# Parameters

numSamplesTrain = 4_000
numSamplesTest  = 1_000
numImg = 3

# Linear SVM (Baseline Model)
paramC      = 1
kernelType  = 'linear'

#===========================Fill This===========================#
# 1. Think of the parameters to optimize.
# 2. Select the set to optimize over.
# 3. Set the number of folds in the cross validation.
numFold = 5
#===============================================================#

Generate / Load Data#

Load the Fashion MNIST Data Set.

# Load Data 

if os.path.isfile(TRAIN_DATA_SET_FILE_NAME):
    dData = np.load(TRAIN_DATA_SET_FILE_NAME)
    mXTrain, vYTrain = dData['mXTrain'], dData['vYTrain']
else:
    if not os.path.isfile(TRAIN_DATA_IMG_FILE_NAME):
        DownloadDecompressGzip(TRAIN_DATA_SET_IMG_URL, TRAIN_DATA_IMG_FILE_NAME) #<! Download Data (GZip File)
    if not os.path.isfile(TRAIN_DATA_LBL_FILE_NAME):
        DownloadDecompressGzip(TRAIN_DATA_SET_LBL_URL, TRAIN_DATA_LBL_FILE_NAME) #<! Download Data (GZip File)
    mXTrain, vYTrain = ConvertMnistDataDf(TRAIN_DATA_IMG_FILE_NAME, TRAIN_DATA_LBL_FILE_NAME)
    np.savez_compressed(TRAIN_DATA_SET_FILE_NAME, mXTrain  = mXTrain, vYTrain = vYTrain)
    if os.path.isfile(TRAIN_DATA_IMG_FILE_NAME):
        os.remove(TRAIN_DATA_IMG_FILE_NAME)
    if os.path.isfile(TRAIN_DATA_LBL_FILE_NAME):
        os.remove(TRAIN_DATA_LBL_FILE_NAME)

if os.path.isfile(TEST_DATA_SET_FILE_NAME):
    dData = np.load(TEST_DATA_SET_FILE_NAME)
    mXTest, vYTest = dData['mXTest'], dData['vYTest']
else:
    if not os.path.isfile(TEST_DATA_IMG_FILE_NAME):
        DownloadDecompressGzip(TEST_DATA_SET_IMG_URL, TEST_DATA_IMG_FILE_NAME) #<! Download Data (GZip File)
    if not os.path.isfile(TEST_DATA_LBL_FILE_NAME):
        DownloadDecompressGzip(TEST_DATA_SET_LBL_URL, TEST_DATA_LBL_FILE_NAME) #<! Download Data (GZip File)
    mXTest, vYTest = ConvertMnistDataDf(TEST_DATA_IMG_FILE_NAME, TEST_DATA_LBL_FILE_NAME)
    np.savez_compressed(TEST_DATA_SET_FILE_NAME, mXTest = mXTest, vYTest = vYTest)
    if os.path.isfile(TEST_DATA_IMG_FILE_NAME):
        os.remove(TEST_DATA_IMG_FILE_NAME)
    if os.path.isfile(TEST_DATA_LBL_FILE_NAME):
        os.remove(TEST_DATA_LBL_FILE_NAME)


vSampleIdx = np.random.choice(mXTrain.shape[0], numSamplesTrain)
mXTrain = mXTrain[vSampleIdx, :]
vYTrain = vYTrain[vSampleIdx]

vSampleIdx = np.random.choice(mXTest.shape[0], numSamplesTest)
mXTest = mXTest[vSampleIdx, :]
vYTest = vYTest[vSampleIdx]


print(f'The number of train data samples: {mXTrain.shape[0]}')
print(f'The number of train features per sample: {mXTrain.shape[1]}') 
print(f'The unique values of the train labels: {np.unique(vYTrain)}')
print(f'The number of test data samples: {mXTest.shape[0]}')
print(f'The number of test features per sample: {mXTest.shape[1]}') 
print(f'The unique values of the test labels: {np.unique(vYTest)}')

The number of train data samples: 4000
The number of train features per sample: 784
The unique values of the train labels: [0 1 2 3 4 5 6 7 8 9]
The number of test data samples: 1000
The number of test features per sample: 784
The unique values of the test labels: [0 1 2 3 4 5 6 7 8 9]

Pre Process Data#

The image data is in the UInt8 data form with values in {0, 1, 2, ..., 255}.
The pre process step scales it into [0, 1] range.

# Pre Process Data

#===========================Fill This===========================#
# 1. Scale data into [0, 1] range.
mXTrain = mXTrain / 255
mXTest  = mXTest / 255
#===============================================================#

### TODO - check if will be the same for each pixel to be 0 or 1 !!!!

Plot Data#

# Plot the Data

hF = PlotMnistImages(mXTrain, vYTrain, numImg, lClasses = L_CLASSES)

../../../../_images/459f8d6973b75d622e80340ec329f842b64d425aebb8406e1d9167ef61de278e.png

# Plot the Histogram of Classes

hA = PlotLabelsHistogram(vYTrain)
hA.set_xticks(range(len(L_CLASSES)))
hA.set_xticklabels(L_CLASSES)
plt.show()

../../../../_images/33775ca70c1210ec7f5caa198ab71ff583a31cfaf8b9bd4a88ed2f2833fa3d7a.png

Train Linear SVM Classifier#

The Linear SVM will function as the baseline classifier.

# SVM Linear Model
#===========================Fill This===========================#
# 1. Construct a baseline model (Linear SVM).
# 2. Train the model.
# 3. Score the model (Accuracy). Keep result in a variable named `modelScore`.
svm_linear = SVC(C = paramC, kernel = kernelType).fit(mXTrain, vYTrain)
modelScore = svm_linear.score(mXTest, vYTest)
#===============================================================#

print(f'The model score (Accuracy) on the data: {modelScore:0.2%}') #<! Accuracy

The model score (Accuracy) on the data: 79.30%

Train Kernel SVM#

In this section we’ll train a Kernel SVM. We’ll find the optimal kernel and other hyper parameters by cross validation.
In order to optimize on the following parameters: C, kernel and gamma we’ll use GridSearchCV().
The idea is iterating over the grid of parameters of the model to find the optimal one.
Each parameterized model is evaluated by a Cross Validation.

In order to use it we need to define:

The Model (estimator) - Which model is used.
The Parameters Grid (param_grid) - The set of parameter to try.
The Scoring (scoring) - The score used to define the best model.
The Cross Validation Iterator (cv) - The iteration to validate the model.

(#) Pay attention to the expected run time. Using verbose is useful.
(#) This is a classic grid search which is not the most efficient policy. There are more advanced policies.
(#) The GridSearchCV() is limited to one instance of an estimator.
Yet using Pipelines we may test different types of estimators.

# Construct the Grid Search object 

#===========================Fill This===========================#
# 1. Set the parameters to iterate over and their values.
dParams = {
            'C' : [1,2,3,4],
            'kernel' : ['linear', 'poly', 'rbf', 'sigmoid'],
            'gamma' : ["scale", "auto"],
            }
#===============================================================#

oGsSvc = GridSearchCV(estimator = SVC(), param_grid = dParams, scoring = None, cv = numFold, verbose = 4,n_jobs=4)

TODO

run#

# Hyper Parameter Optimization
# Training the model with each combination of hyper parameters.

#===========================Fill This===========================#
# 1. The model trains on the train data using Stratified K Fold cross validation.
oGsSvc = oGsSvc.fit(mXTrain, vYTrain) #<! It may take few minutes
#===============================================================#

Fitting 5 folds for each of 32 candidates, totalling 160 fits

[CV 1/5] END ...C=1, gamma=scale, kernel=linear;, score=0.820 total time=   5.4s
[CV 3/5] END ...C=1, gamma=scale, kernel=linear;, score=0.816 total time=   5.4s
[CV 4/5] END ...C=1, gamma=scale, kernel=linear;, score=0.823 total time=   5.6s
[CV 2/5] END ...C=1, gamma=scale, kernel=linear;, score=0.841 total time=   5.7s
[CV 5/5] END ...C=1, gamma=scale, kernel=linear;, score=0.828 total time=   2.5s
[CV 1/5] END .....C=1, gamma=scale, kernel=poly;, score=0.787 total time=   2.8s
[CV 2/5] END .....C=1, gamma=scale, kernel=poly;, score=0.794 total time=   2.6s
[CV 3/5] END .....C=1, gamma=scale, kernel=poly;, score=0.779 total time=   4.1s
[CV 5/5] END .....C=1, gamma=scale, kernel=poly;, score=0.807 total time=   2.2s
[CV 4/5] END .....C=1, gamma=scale, kernel=poly;, score=0.786 total time=   3.2s
[CV 1/5] END ......C=1, gamma=scale, kernel=rbf;, score=0.841 total time=   3.1s
[CV 2/5] END ......C=1, gamma=scale, kernel=rbf;, score=0.849 total time=   3.7s
[CV 3/5] END ......C=1, gamma=scale, kernel=rbf;, score=0.816 total time=   3.3s
[CV 4/5] END ......C=1, gamma=scale, kernel=rbf;, score=0.840 total time=   3.3s
[CV 5/5] END ......C=1, gamma=scale, kernel=rbf;, score=0.849 total time=   3.5s
[CV 2/5] END ..C=1, gamma=scale, kernel=sigmoid;, score=0.417 total time=   3.6s
[CV 4/5] END ..C=1, gamma=scale, kernel=sigmoid;, score=0.451 total time=   4.1s
[CV 1/5] END ..C=1, gamma=scale, kernel=sigmoid;, score=0.411 total time=   6.0s
[CV 3/5] END ..C=1, gamma=scale, kernel=sigmoid;, score=0.425 total time=   5.1s
[CV 5/5] END ..C=1, gamma=scale, kernel=sigmoid;, score=0.401 total time=   3.7s
[CV 3/5] END ....C=1, gamma=auto, kernel=linear;, score=0.816 total time=   1.8s
[CV 1/5] END ....C=1, gamma=auto, kernel=linear;, score=0.820 total time=   2.8s
[CV 2/5] END ....C=1, gamma=auto, kernel=linear;, score=0.841 total time=   2.8s
[CV 4/5] END ....C=1, gamma=auto, kernel=linear;, score=0.823 total time=   2.5s
[CV 5/5] END ....C=1, gamma=auto, kernel=linear;, score=0.828 total time=   2.2s
[CV 2/5] END ......C=1, gamma=auto, kernel=poly;, score=0.439 total time=   7.6s
[CV 4/5] END ......C=1, gamma=auto, kernel=poly;, score=0.431 total time=   7.5s
[CV 1/5] END ......C=1, gamma=auto, kernel=poly;, score=0.430 total time=  11.7s
[CV 3/5] END ......C=1, gamma=auto, kernel=poly;, score=0.432 total time=  10.6s
[CV 1/5] END .......C=1, gamma=auto, kernel=rbf;, score=0.805 total time=   5.1s
[CV 2/5] END .......C=1, gamma=auto, kernel=rbf;, score=0.792 total time=  10.3s
[CV 3/5] END .......C=1, gamma=auto, kernel=rbf;, score=0.772 total time=  10.4s
[CV 5/5] END ......C=1, gamma=auto, kernel=poly;, score=0.427 total time=  14.7s
[CV 4/5] END .......C=1, gamma=auto, kernel=rbf;, score=0.796 total time=  12.9s
[CV 5/5] END .......C=1, gamma=auto, kernel=rbf;, score=0.774 total time=  14.2s
[CV 1/5] END ...C=1, gamma=auto, kernel=sigmoid;, score=0.770 total time=  14.1s
[CV 2/5] END ...C=1, gamma=auto, kernel=sigmoid;, score=0.782 total time=  16.1s
[CV 3/5] END ...C=1, gamma=auto, kernel=sigmoid;, score=0.752 total time=  13.1s
[CV 1/5] END ...C=2, gamma=scale, kernel=linear;, score=0.812 total time=   6.5s
[CV 2/5] END ...C=2, gamma=scale, kernel=linear;, score=0.836 total time=   7.5s
[CV 4/5] END ...C=1, gamma=auto, kernel=sigmoid;, score=0.764 total time=  12.2s
[CV 5/5] END ...C=1, gamma=auto, kernel=sigmoid;, score=0.759 total time=  15.0s
[CV 3/5] END ...C=2, gamma=scale, kernel=linear;, score=0.814 total time=   7.0s
[CV 5/5] END ...C=2, gamma=scale, kernel=linear;, score=0.823 total time=   5.8s
[CV 4/5] END ...C=2, gamma=scale, kernel=linear;, score=0.829 total time=   6.9s
[CV 1/5] END .....C=2, gamma=scale, kernel=poly;, score=0.805 total time=   7.6s
[CV 2/5] END .....C=2, gamma=scale, kernel=poly;, score=0.811 total time=   8.2s
[CV 3/5] END .....C=2, gamma=scale, kernel=poly;, score=0.797 total time=   7.0s
[CV 4/5] END .....C=2, gamma=scale, kernel=poly;, score=0.799 total time=   6.6s
[CV 5/5] END .....C=2, gamma=scale, kernel=poly;, score=0.809 total time=   2.9s
[CV 3/5] END ......C=2, gamma=scale, kernel=rbf;, score=0.844 total time=   2.9s
[CV 1/5] END ......C=2, gamma=scale, kernel=rbf;, score=0.856 total time=   4.5s
[CV 4/5] END ......C=2, gamma=scale, kernel=rbf;, score=0.860 total time=   2.9s
[CV 2/5] END ......C=2, gamma=scale, kernel=rbf;, score=0.864 total time=   4.6s
[CV 5/5] END ......C=2, gamma=scale, kernel=rbf;, score=0.861 total time=   2.7s
[CV 2/5] END ..C=2, gamma=scale, kernel=sigmoid;, score=0.394 total time=   3.0s
[CV 4/5] END ..C=2, gamma=scale, kernel=sigmoid;, score=0.449 total time=   3.3s
[CV 1/5] END ..C=2, gamma=scale, kernel=sigmoid;, score=0.396 total time=   5.5s
[CV 3/5] END ..C=2, gamma=scale, kernel=sigmoid;, score=0.409 total time=   5.5s
[CV 5/5] END ..C=2, gamma=scale, kernel=sigmoid;, score=0.376 total time=   3.2s
[CV 1/5] END ....C=2, gamma=auto, kernel=linear;, score=0.812 total time=   1.6s
[CV 2/5] END ....C=2, gamma=auto, kernel=linear;, score=0.836 total time=   2.4s
[CV 4/5] END ....C=2, gamma=auto, kernel=linear;, score=0.829 total time=   1.8s
[CV 3/5] END ....C=2, gamma=auto, kernel=linear;, score=0.814 total time=   2.6s
[CV 5/5] END ....C=2, gamma=auto, kernel=linear;, score=0.823 total time=   3.0s
[CV 1/5] END ......C=2, gamma=auto, kernel=poly;, score=0.510 total time=   7.2s
[CV 2/5] END ......C=2, gamma=auto, kernel=poly;, score=0.562 total time=   7.7s
[CV 4/5] END ......C=2, gamma=auto, kernel=poly;, score=0.514 total time=   7.6s
[CV 3/5] END ......C=2, gamma=auto, kernel=poly;, score=0.544 total time=  10.0s
[CV 1/5] END .......C=2, gamma=auto, kernel=rbf;, score=0.820 total time=   3.6s
[CV 2/5] END .......C=2, gamma=auto, kernel=rbf;, score=0.809 total time=   4.0s
[CV 5/5] END ......C=2, gamma=auto, kernel=poly;, score=0.534 total time=   6.9s
[CV 3/5] END .......C=2, gamma=auto, kernel=rbf;, score=0.796 total time=   3.9s
[CV 4/5] END .......C=2, gamma=auto, kernel=rbf;, score=0.806 total time=   3.2s
[CV 5/5] END .......C=2, gamma=auto, kernel=rbf;, score=0.787 total time=   4.5s
[CV 3/5] END ...C=2, gamma=auto, kernel=sigmoid;, score=0.776 total time=   6.1s
[CV 1/5] END ...C=2, gamma=auto, kernel=sigmoid;, score=0.801 total time=   7.0s
[CV 2/5] END ...C=2, gamma=auto, kernel=sigmoid;, score=0.792 total time=   8.2s
[CV 4/5] END ...C=2, gamma=auto, kernel=sigmoid;, score=0.794 total time=   9.3s
[CV 1/5] END ...C=3, gamma=scale, kernel=linear;, score=0.814 total time=   6.5s
[CV 2/5] END ...C=3, gamma=scale, kernel=linear;, score=0.838 total time=   5.2s
[CV 5/5] END ...C=2, gamma=auto, kernel=sigmoid;, score=0.775 total time=   9.4s
[CV 3/5] END ...C=3, gamma=scale, kernel=linear;, score=0.810 total time=   4.8s
[CV 4/5] END ...C=3, gamma=scale, kernel=linear;, score=0.828 total time=   5.8s
[CV 5/5] END ...C=3, gamma=scale, kernel=linear;, score=0.820 total time=   6.2s
[CV 1/5] END .....C=3, gamma=scale, kernel=poly;, score=0.815 total time=   4.8s
[CV 2/5] END .....C=3, gamma=scale, kernel=poly;, score=0.820 total time=   6.8s
[CV 3/5] END .....C=3, gamma=scale, kernel=poly;, score=0.806 total time=   5.6s
[CV 4/5] END .....C=3, gamma=scale, kernel=poly;, score=0.805 total time=   5.5s
[CV 5/5] END .....C=3, gamma=scale, kernel=poly;, score=0.816 total time=   5.8s
[CV 1/5] END ......C=3, gamma=scale, kernel=rbf;, score=0.859 total time=   8.2s
[CV 2/5] END ......C=3, gamma=scale, kernel=rbf;, score=0.864 total time=   8.6s
[CV 3/5] END ......C=3, gamma=scale, kernel=rbf;, score=0.845 total time=   8.6s
[CV 4/5] END ......C=3, gamma=scale, kernel=rbf;, score=0.860 total time=   7.8s
[CV 1/5] END ..C=3, gamma=scale, kernel=sigmoid;, score=0.386 total time=   8.5s
[CV 3/5] END ..C=3, gamma=scale, kernel=sigmoid;, score=0.401 total time=   7.6s
[CV 5/5] END ......C=3, gamma=scale, kernel=rbf;, score=0.869 total time=   9.8s
[CV 2/5] END ..C=3, gamma=scale, kernel=sigmoid;, score=0.388 total time=  10.0s
[CV 1/5] END ....C=3, gamma=auto, kernel=linear;, score=0.814 total time=   5.7s
[CV 2/5] END ....C=3, gamma=auto, kernel=linear;, score=0.838 total time=   6.3s
[CV 5/5] END ..C=3, gamma=scale, kernel=sigmoid;, score=0.379 total time=   9.2s
[CV 4/5] END ..C=3, gamma=scale, kernel=sigmoid;, score=0.450 total time=   9.3s
[CV 3/5] END ....C=3, gamma=auto, kernel=linear;, score=0.810 total time=   5.5s
[CV 4/5] END ....C=3, gamma=auto, kernel=linear;, score=0.828 total time=   6.7s
[CV 5/5] END ....C=3, gamma=auto, kernel=linear;, score=0.820 total time=   6.2s
[CV 1/5] END ......C=3, gamma=auto, kernel=poly;, score=0.562 total time=  11.1s
[CV 2/5] END ......C=3, gamma=auto, kernel=poly;, score=0.546 total time=   9.6s
[CV 3/5] END ......C=3, gamma=auto, kernel=poly;, score=0.557 total time=   8.5s
[CV 4/5] END ......C=3, gamma=auto, kernel=poly;, score=0.532 total time=   8.4s
[CV 1/5] END .......C=3, gamma=auto, kernel=rbf;, score=0.831 total time=   5.4s
[CV 3/5] END .......C=3, gamma=auto, kernel=rbf;, score=0.804 total time=   3.5s
[CV 5/5] END ......C=3, gamma=auto, kernel=poly;, score=0.542 total time=   7.0s
[CV 2/5] END .......C=3, gamma=auto, kernel=rbf;, score=0.825 total time=   4.3s
[CV 5/5] END .......C=3, gamma=auto, kernel=rbf;, score=0.810 total time=   3.0s
[CV 1/5] END ...C=3, gamma=auto, kernel=sigmoid;, score=0.810 total time=   3.0s
[CV 4/5] END .......C=3, gamma=auto, kernel=rbf;, score=0.823 total time=   4.8s
[CV 2/5] END ...C=3, gamma=auto, kernel=sigmoid;, score=0.801 total time=   4.3s
[CV 3/5] END ...C=3, gamma=auto, kernel=sigmoid;, score=0.784 total time=   3.2s
[CV 4/5] END ...C=3, gamma=auto, kernel=sigmoid;, score=0.805 total time=   3.2s
[CV 1/5] END ...C=4, gamma=scale, kernel=linear;, score=0.814 total time=   2.9s
[CV 2/5] END ...C=4, gamma=scale, kernel=linear;, score=0.833 total time=   1.9s
[CV 3/5] END ...C=4, gamma=scale, kernel=linear;, score=0.810 total time=   1.9s
[CV 5/5] END ...C=3, gamma=auto, kernel=sigmoid;, score=0.784 total time=   4.2s
[CV 1/5] END .....C=4, gamma=scale, kernel=poly;, score=0.820 total time=   2.0s
[CV 4/5] END ...C=4, gamma=scale, kernel=linear;, score=0.821 total time=   2.4s
[CV 5/5] END ...C=4, gamma=scale, kernel=linear;, score=0.819 total time=   2.1s
[CV 2/5] END .....C=4, gamma=scale, kernel=poly;, score=0.830 total time=   3.5s
[CV 3/5] END .....C=4, gamma=scale, kernel=poly;, score=0.812 total time=   2.1s
[CV 5/5] END .....C=4, gamma=scale, kernel=poly;, score=0.828 total time=   2.1s
[CV 4/5] END .....C=4, gamma=scale, kernel=poly;, score=0.801 total time=   3.1s
[CV 1/5] END ......C=4, gamma=scale, kernel=rbf;, score=0.861 total time=   3.4s
[CV 3/5] END ......C=4, gamma=scale, kernel=rbf;, score=0.849 total time=   3.3s
[CV 2/5] END ......C=4, gamma=scale, kernel=rbf;, score=0.868 total time=   3.5s
[CV 4/5] END ......C=4, gamma=scale, kernel=rbf;, score=0.864 total time=   3.3s
[CV 1/5] END ..C=4, gamma=scale, kernel=sigmoid;, score=0.384 total time=   3.5s
[CV 2/5] END ..C=4, gamma=scale, kernel=sigmoid;, score=0.400 total time=   4.0s
[CV 5/5] END ......C=4, gamma=scale, kernel=rbf;, score=0.870 total time=   4.2s
[CV 3/5] END ..C=4, gamma=scale, kernel=sigmoid;, score=0.396 total time=   4.2s
[CV 1/5] END ....C=4, gamma=auto, kernel=linear;, score=0.814 total time=   5.8s
[CV 2/5] END ....C=4, gamma=auto, kernel=linear;, score=0.833 total time=   6.9s
[CV 4/5] END ..C=4, gamma=scale, kernel=sigmoid;, score=0.444 total time=   8.8s
[CV 5/5] END ..C=4, gamma=scale, kernel=sigmoid;, score=0.376 total time=  10.3s
[CV 3/5] END ....C=4, gamma=auto, kernel=linear;, score=0.810 total time=   6.0s
[CV 5/5] END ....C=4, gamma=auto, kernel=linear;, score=0.819 total time=   5.2s
[CV 4/5] END ....C=4, gamma=auto, kernel=linear;, score=0.821 total time=   6.0s
[CV 1/5] END ......C=4, gamma=auto, kernel=poly;, score=0.574 total time=  16.4s
[CV 2/5] END ......C=4, gamma=auto, kernel=poly;, score=0.568 total time=  16.7s
[CV 3/5] END ......C=4, gamma=auto, kernel=poly;, score=0.580 total time=  15.8s
[CV 4/5] END ......C=4, gamma=auto, kernel=poly;, score=0.554 total time=  19.0s
[CV 2/5] END .......C=4, gamma=auto, kernel=rbf;, score=0.830 total time=   7.4s
[CV 1/5] END .......C=4, gamma=auto, kernel=rbf;, score=0.831 total time=   9.8s
[CV 3/5] END .......C=4, gamma=auto, kernel=rbf;, score=0.807 total time=   9.3s
[CV 5/5] END ......C=4, gamma=auto, kernel=poly;, score=0.576 total time=  16.8s
[CV 4/5] END .......C=4, gamma=auto, kernel=rbf;, score=0.826 total time=   8.4s
[CV 5/5] END .......C=4, gamma=auto, kernel=rbf;, score=0.821 total time=   8.9s
[CV 1/5] END ...C=4, gamma=auto, kernel=sigmoid;, score=0.818 total time=   7.2s
[CV 2/5] END ...C=4, gamma=auto, kernel=sigmoid;, score=0.809 total time=   8.5s
[CV 3/5] END ...C=4, gamma=auto, kernel=sigmoid;, score=0.789 total time=   7.2s
[CV 4/5] END ...C=4, gamma=auto, kernel=sigmoid;, score=0.810 total time=   6.7s
[CV 5/5] END ...C=4, gamma=auto, kernel=sigmoid;, score=0.790 total time=   5.0s

# Best Model
# Extract the attributes of the best model.

#===========================Fill This===========================#
# 1. Extract the best score.
# 2. Extract a dictionary of the parameters.
# !! Use the attributes of the `oGsSvc` object.
bestScore   = oGsSvc.best_score_
dBestParams = oGsSvc.best_params_
#===============================================================#

print(f'The best model had the following parameters: {dBestParams} with the CV score: {bestScore:0.2%}')

The best model had the following parameters: {'C': 4, 'gamma': 'scale', 'kernel': 'rbf'} with the CV score: 86.23%

(#) In production one would visualize the effect of each parameter on the model result. Then use it to fine tune farther the parameters.

oGsSvc.score(mXTest, vYTest) ## will be the same like extracting best model

0.85

# The Best Model

#===========================Fill This===========================#
# 1. Extract the best model.
# 2. Score the best model on the test data set.
bestModel = oGsSvc.best_estimator_
modelScore = bestModel.score(mXTest, vYTest)
#===============================================================#

print(f'The model score (Accuracy) on the data: {modelScore:0.2%}') #<! Accuracy

The model score (Accuracy) on the data: 85.00%

(#) With proper tuning one can improve the baseline model by ~5%.

Train the Best Model on the Train Data Set#

In production we take the optimal Hyper Parameters and then retrain the model on the whole training data set.
This is the model we’ll use in production.

# The Model with Optimal Parameters

#===========================Fill This===========================#
# 1. Construct the model.
# 2. Train the model.
oSvmCls = bestModel.fit(mXTrain, vYTrain)
#===============================================================#

modelScore = oSvmCls.score(mXTest, vYTest)

print(f'The model score (Accuracy) on the data: {modelScore:0.2%}') #<! Accuracy

The model score (Accuracy) on the data: 85.00%

(?) Is the value above exactly as the value from the best model of the grid search? If so, look at the refit parameter of GridSearchCV.

Performance Metrics / Scores#

In this section we’ll analyze the model using the confusion matrix.

Plot the Confusion Matrix#

# Plot the Confusion Matrix
hF, hA = plt.subplots(figsize = (10, 10))

#===========================Fill This===========================#
# 1. Plot the confusion matrix for the best model.
# 2. Use the data labels (`L_CLASSES`).
hA, mConfMat = PlotConfusionMatrix(vYTest, oSvmCls.predict(mXTest),lLabels= L_CLASSES, hA = hA)
#===============================================================#

plt.show()

../../../../_images/99219a233fa53637e8f741036176be93cbd81b31eca92425481ca856e5f1fd62.png

(?) Which class has the best accuracy?
(?) Which class has a dominant false prediction? Does it make sense?
(?) What’s the difference between \(p \left( \hat{y}_{i} = \text{coat} \mid {x}_{i} = \text{coat} \right)\) to \(p \left( {y}_{i} = \text{coat} \mid \hat{y}_{i} = \text{coat} \right)\)?

recall and presicion

(!) Make the proper calculations on mConfMat or the function PlotConfusionMatrix to answer the questions above.

# Plot the Confusion Matrix
hF, hA = plt.subplots(figsize = (10, 10))

#===========================Fill This===========================#
# 1. Plot the confusion matrix for the best model.
# 2. Use the data labels (`L_CLASSES`).
hA, mConfMat = PlotConfusionMatrix(vYTest, oSvmCls.predict(mXTest),lLabels= L_CLASSES,normMethod= 'true' , hA = hA)
#===============================================================#

plt.show()

../../../../_images/9dd25d755f37965bd19bbbd1156fe2461a0c419fa9ac28f754b7b430cebc7f22.png