Deep Learning Tutorial: Image Classification using CNN

Table of Contents

  1. Introduction
  2. Building a Convolutional Neural Network
  3. Loading and Preprocessing the Dataset
  4. Observing the Dataset
  5. Normalizing the Data
  6. Building the Model
  7. Compiling and Training the Model
  8. Making Predictions
  9. Evaluating the Model
    • Classification Report
    • Confusion Matrix
  10. Conclusion

🧠 Introduction

In this tutorial, we will build a simple convolutional neural network (CNN) using the Keras and TensorFlow libraries. We will use the CIFAR-10 dataset, which consists of 60,000 color images of 32x32 pixels, divided into 10 classes. The goal is to build a model that classifies each image into its correct class. Before we dive into the code, it helps to have a basic understanding of neural networks and CNNs.

If you are new to neural networks, I have another article that introduces the topic. You can find the link to that article in the description below. It covers key concepts such as padding, stride, multi-channel images, and more.

🏗️ Building a Convolutional Neural Network

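First, we need to install and import the necessary libraries. We will be using TensorFlow and Keras for this project. Keras is available as tf.keras once TensorFlow is installed, so it does not need a separate install. The evaluation steps later also use scikit-learn and seaborn. To install everything, run the following command:

pip install tensorflow numpy matplotlib scikit-learn seaborn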

Next, let's import the required libraries:

import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt
import os

Now that we have our libraries set up, we can proceed to load and preprocess the dataset.

📚 Loading and Preprocessing the Dataset

To load the CIFAR-10 dataset, we will use the tf.keras.datasets module. The dataset consists of 50,000 training images and 10,000 test images, each belonging to one of the 10 classes. The load_data function returns the data already split into training and test sets, so we only need to normalize the pixel values from the 0 to 255 range down to between 0 and 1.

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

# Normalize the pixel values
x_train = x_train / 255.0
x_test = x_test / 255.0
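
Dividing a uint8 array by 255.0 produces float64 values in NumPy, at eight bytes per pixel value. The following is a minimal sketch, assuming a small random stand-in array instead of the real dataset, of casting to float32 first to halve the memory use. The helper name normalize_images is hypothetical and not part of Keras.

```python
import numpy as np

def normalize_images(images):
    """Scale uint8 pixel values from [0, 255] to float32 values in [0, 1]."""
    return images.astype("float32") / 255.0

# Stand-in for a batch of CIFAR-10 images: 8 random 32x32 RGB images.
rng = np.random.default_rng(0)
fake_batch = rng.integers(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)

scaled = normalize_images(fake_batch)
print(scaled.dtype, scaled.min() >= 0.0, scaled.max() <= 1.0)
```

The same helper could be applied to x_train and x_test in place of the plain division above.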

Let's take a look at the shape of our training and test data:

print("Training data shape:", x_train.shape)
print("Training labels shape:", y_train.shape)
print("Test data shape:", x_test.shape)
print("Test labels shape:", y_test.shape)

The output would be:

Training data shape: (50000, 32, 32, 3)
Training labels shape: (50000, 1)
Test data shape: (10000, 32, 32, 3)
Test labels shape: (10000, 1)

As you can see, our training data consists of 50,000 images of 32x32 pixels with 3 color channels (RGB). The corresponding labels are stored in a separate array of shape (50000, 1), with one integer class index from 0 to 9 per image.
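
Because the labels come as a column of integers, it can help to flatten them and map them to readable names. The sketch below uses a hand-written stand-in for the first few labels rather than the real y_train, and the CLASS_NAMES list is written out by hand in the standard CIFAR-10 label order since the Keras loader does not return the names.

```python
import numpy as np

# CIFAR-10 class names in label order (index 0 through 9).
CLASS_NAMES = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]

# Stand-in for a slice of y_train, which has shape (N, 1).
labels_2d = np.array([[6], [9], [9], [4]], dtype=np.uint8)

labels_1d = labels_2d.ravel()  # shape (4,), easier to index and compare
names = [CLASS_NAMES[i] for i in labels_1d]
print(labels_1d.shape, names)
```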

Now that we have our data loaded and preprocessed, let's move on to building the model.

🏭 Building the Model

We will build a simple CNN architecture consisting of multiple convolutional layers, followed by max pooling layers, and finally fully connected (dense) layers. We will use the Sequential API provided by Keras to build our model.

model = keras.Sequential()

# Add convolutional layers
model.add(keras.layers.Conv2D(filters=32, kernel_size=(3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))

model.add(keras.layers.Conv2D(filters=64, kernel_size=(4, 4), activation='relu'))
model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))

# Flatten the output and add dense layers
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(units=84, activation='relu'))
model.add(keras.layers.Dense(units=10, activation='softmax'))

In our model, we have added two convolutional layers with ReLU activation, each followed by a max pooling layer. We then flatten the output and add one dense layer with 84 units and ReLU activation, followed by the final output layer with 10 units and softmax activation to classify the images into the 10 different classes.
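
To see how the image shrinks as it passes through these layers, the shapes and parameter counts can be worked out by hand. This is a rough sketch in plain Python, assuming the Keras defaults of 'valid' padding and stride 1 for Conv2D, and a pooling stride equal to the pool size. The helper conv_output_size is a hypothetical name used for illustration.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

size, channels = 32, 3
total_params = 0

# Conv2D with 32 filters of 3x3: weights plus one bias per filter.
size = conv_output_size(size, kernel=3)
total_params += 3 * 3 * channels * 32 + 32
channels = 32
# MaxPooling2D 2x2 has no trainable parameters.
size = conv_output_size(size, kernel=2, stride=2)

# Conv2D with 64 filters of 4x4.
size = conv_output_size(size, kernel=4)
total_params += 4 * 4 * channels * 64 + 64
channels = 64
size = conv_output_size(size, kernel=2, stride=2)

# Flatten, then Dense(84) and Dense(10).
flat = size * size * channels
total_params += flat * 84 + 84
total_params += 84 * 10 + 10

print(size, flat, total_params)
```

Calling model.summary() on the model above should report the same layer shapes and total, which makes it a handy cross-check.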

Now that our model is built, we need to compile it before training.

🏋️ Compiling and Training the Model

Before training the model, we need to compile it by specifying the loss function, optimizer, and metrics to evaluate the model's performance.

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
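
The sparse variant of categorical cross-entropy is used because the labels are integer class indices rather than one-hot vectors. As an illustration of what this loss measures, here is a simplified NumPy sketch. It is not the Keras implementation, and the function and the example probabilities are made up for demonstration.

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, probs, eps=1e-7):
    """Mean negative log-probability assigned to the true class."""
    y_true = np.asarray(y_true).ravel()
    # Clip to avoid log(0) when a probability is exactly zero.
    probs = np.clip(np.asarray(probs, dtype=np.float64), eps, 1.0)
    picked = probs[np.arange(len(y_true)), y_true]
    return float(-np.log(picked).mean())

# Two samples, three classes; each row sums to 1 like a softmax output.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])

loss_good = sparse_categorical_crossentropy([0, 1], probs)  # confident and right
loss_bad = sparse_categorical_crossentropy([2, 0], probs)   # confident and wrong
print(loss_good, loss_bad)
```

The loss is small when the model puts high probability on the true class and grows quickly when it does not.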

Next, we can train the model using the fit method. We pass in the training data and labels, the number of epochs, and a validation set that is evaluated at the end of each epoch. For simplicity, the test set doubles as the validation data here.

history = model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=5)

The fit method will iterate over the training data for the specified number of epochs and update the model's weights to minimize the loss. You can adjust the number of epochs based on your needs.
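
One way to choose the number of epochs is to look at where the validation loss stops improving. The object returned by fit exposes a history attribute, a dictionary that maps metric names to one value per epoch, with validation metrics prefixed by val_. The sketch below uses made-up numbers in place of a real training run, and best_epoch is a hypothetical helper.

```python
def best_epoch(history_dict, metric="val_loss"):
    """Return (1-based epoch, value) where the metric was lowest."""
    values = history_dict[metric]
    index = min(range(len(values)), key=values.__getitem__)
    return index + 1, values[index]

# Made-up numbers for illustration only, not real training results.
example_history = {
    "loss": [1.60, 1.25, 1.05, 0.90, 0.78],
    "val_loss": [1.35, 1.15, 1.08, 1.10, 1.16],
}

epoch, value = best_epoch(example_history)
print(epoch, value)
```

With a real run, the call would be best_epoch(history.history). A training loss that keeps falling while the validation loss rises, as in the invented numbers above, is a typical sign of overfitting.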

After training, we can make predictions using the trained model.

🔮 Making Predictions

To make predictions using our trained model, we can use the predict method and pass in the test data.

y_pred = model.predict(x_test)

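The predict method returns an array with one row per test image and one column per class, where each row holds the predicted probabilities for the 10 classes. To get the predicted class labels, we take the index of the largest probability in each row using NumPy's argmax function.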

predicted_labels = np.argmax(y_pred, axis=1)
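
To make the role of axis=1 concrete, here is a small standalone sketch with a hand-written probability array for three samples and three classes. The numbers are invented for illustration.

```python
import numpy as np

# Each row is one sample; each column is one class.
probs = np.array([[0.05, 0.90, 0.05],
                  [0.60, 0.30, 0.10],
                  [0.20, 0.25, 0.55]])

labels = np.argmax(probs, axis=1)   # index of the largest value in each row
confidence = probs.max(axis=1)      # the probability of the chosen class
print(labels, confidence)
```

Using axis=0 instead would compare values down each column, which is not what we want here.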

Now that we have an array of predicted labels, let's evaluate the model's performance.

📊 Evaluating the Model

There are several ways to evaluate the performance of a model. Two common tools are the classification report and the confusion matrix.

Classification Report

The Classification Report provides metrics such as precision, recall, and F1-score for each class. We can generate a classification report using the classification_report function from the sklearn.metrics module.

from sklearn.metrics import classification_report

print(classification_report(y_test, predicted_labels))

The output will show precision, recall, F1-score, and support for each class.
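
To show where those numbers come from, the metrics for a single class can be computed by hand from counts of true positives, false positives, and false negatives. This is a simplified sketch on a tiny invented label set, not a replacement for scikit-learn, and per_class_metrics is a hypothetical helper.

```python
import numpy as np

def per_class_metrics(y_true, y_pred, label):
    """Return (precision, recall, f1, support) for one class."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    tp = int(np.sum((y_pred == label) & (y_true == label)))
    fp = int(np.sum((y_pred == label) & (y_true != label)))
    fn = int(np.sum((y_pred != label) & (y_true == label)))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1, tp + fn  # support = true samples of the class

# Invented labels for illustration.
y_true_demo = [0, 0, 1, 1, 1, 2]
y_pred_demo = [0, 1, 1, 1, 2, 2]

precision, recall, f1, support = per_class_metrics(y_true_demo, y_pred_demo, label=1)
print(precision, recall, f1, support)
```

Precision answers how many predictions of a class were right, while recall answers how many true members of the class were found.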

Confusion Matrix

A confusion matrix visualizes the performance of a classification model by showing the number of correct and incorrect predictions for each class. We can generate a confusion matrix using the confusion_matrix function from the sklearn.metrics module.

from sklearn.metrics import confusion_matrix
import seaborn as sns

cm = confusion_matrix(y_test, predicted_labels)

plt.figure(figsize=(14, 7))
sns.heatmap(cm, annot=True, fmt='d')
plt.ylabel('True Labels')
plt.xlabel('Predicted Labels')
plt.title('Confusion Matrix')
plt.show()

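The confusion matrix will be displayed as a heatmap, where the x-axis represents the predicted labels and the y-axis represents the true labels. Each cell holds the number of test images with that combination of true and predicted label, so correct predictions lie on the diagonal and every off-diagonal cell is a specific kind of mistake.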
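
For intuition about how the matrix is filled in, the counts can be built directly with NumPy. The sketch below uses a tiny invented label set and a hypothetical helper named confusion_counts. It follows the same layout as the heatmap, with rows as true labels and columns as predicted labels.

```python
import numpy as np

def confusion_counts(y_true, y_pred, num_classes):
    """Count (true label, predicted label) pairs into a square matrix."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    cm = np.zeros((num_classes, num_classes), dtype=int)
    # add.at accumulates correctly even when the same cell appears twice.
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

# Invented labels for illustration.
y_true_small = [0, 0, 1, 1, 1, 2]
y_pred_small = [0, 1, 1, 1, 2, 2]

cm_small = confusion_counts(y_true_small, y_pred_small, num_classes=3)
# Dividing each row by its total gives per-class recall on the diagonal.
row_normalized = cm_small / cm_small.sum(axis=1, keepdims=True)
print(cm_small)
print(row_normalized.diagonal())
```

Row-normalizing is often easier to read than raw counts when the classes have different numbers of samples.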

🎉 Conclusion

In this tutorial, we have learned how to build a simple convolutional neural network using Keras and TensorFlow. We loaded and preprocessed the CIFAR-10 dataset, built the model architecture, compiled and trained the model, made predictions, and evaluated the model's performance using the Classification Report and Confusion Matrix. Building and training a CNN is an iterative process, and you can experiment with different architectures, hyperparameters, and optimization techniques to improve the model's performance.

Remember to keep the model's complexity in check to avoid overfitting and ensure you have enough data to train on. Additionally, consider using data augmentation techniques and exploring transfer learning for better results.

Thank you for following along, and happy coding!
