Image classification is one of the most exciting fields in Artificial Intelligence. While deep learning models like Convolutional Neural Networks (CNNs) are the standard for professional computer vision, you can actually build a functional image classifier using traditional machine learning techniques. In this tutorial, we will learn how to build a simple AI model to distinguish between cats and dogs using Python and the popular Scikit-Learn (sklearn) library.
Why Use Scikit-Learn for Image Classification?
Scikit-Learn is the gold standard for traditional machine learning. While it isn't specifically designed for high-end computer vision, using it to classify images is a fantastic way to understand the fundamentals of data preprocessing, feature extraction, and model evaluation without getting bogged down in the complexity of neural network architectures.
What You Will Learn:
- How to convert images into numerical data.
- How to resize and normalize images for machine learning.
- How to train a Support Vector Machine (SVM) model.
- How to evaluate the performance of your classifier.
Prerequisites and Environment Setup
Before we start coding, you need to have Python installed on your machine. We will also need a few specific libraries. You can install them using pip:
pip install numpy scikit-learn opencv-python matplotlib
- NumPy: For handling numerical arrays.
- OpenCV (cv2): For image processing and loading.
- Scikit-Learn: For the machine learning model and data splitting.
- Matplotlib: For visualizing our results.
Step 1: Preparing the Dataset
For this project, you need a folder of cat images and a folder of dog images. You can download the "Cats vs Dogs" dataset from Kaggle. To keep things simple for this tutorial, we will resize all images to a standard 64x64 pixel resolution.
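The loading code in the next step expects each class in its own subfolder under the dataset directory. A typical layout looks something like this (filenames are illustrative):

```text
dataset/
├── cat/
│   ├── cat001.jpg
│   └── ...
└── dog/
    ├── dog001.jpg
    └── ...
```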
The core idea is to read every image, convert it to grayscale (to reduce complexity), resize it, and flatten it into a 1D array that the machine learning model can understand.
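Before the full loader, here is a tiny sketch of what that flattening step actually does, using a synthetic 4x4 "image" in place of a real photo (the real images here are 64x64, giving 4,096 features per image):

```python
import numpy as np

# A synthetic 4x4 grayscale "image": each value is a pixel intensity (0-255)
image = np.array([
    [  0,  50, 100, 150],
    [ 10,  60, 110, 160],
    [ 20,  70, 120, 170],
    [ 30,  80, 130, 180],
], dtype=np.uint8)

# Flattening turns the 2D grid into a single row of features
features = image.flatten()
print(features.shape)  # (16,)

# Scaling to 0-1 puts every feature on the same numeric range
scaled = features / 255.0
```

The model never "sees" a picture, only this row of numbers, which is why consistent resizing matters: every image must produce a feature vector of the same length.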
import os
import cv2
import numpy as np
from sklearn.model_selection import train_test_split

def load_data(data_path):
    categories = ['cat', 'dog']
    data = []
    labels = []
    for category in categories:
        path = os.path.join(data_path, category)
        label = categories.index(category)
        for img in os.listdir(path):
            try:
                # Load image in grayscale
                img_path = os.path.join(path, img)
                image = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
                # Resize to 64x64
                image = cv2.resize(image, (64, 64))
                # Flatten the image into a 1D array
                image = image.flatten()
                data.append(image)
                labels.append(label)
            except Exception:
                # Skip files that fail to load or resize (e.g. corrupt images)
                continue
    return np.array(data), np.array(labels)

# Usage (update 'data_path' with the path to your dataset)
# X, y = load_data('path_to_your_dataset')
Step 2: Splitting Data for Training and Testing
It is crucial to never test your model on the same data it learned from. We split our data into a "Training Set" to teach the model and a "Testing Set" to see how well it performs on unseen images.
# Splitting the data: 80% for training, 20% for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Normalize the data (scaling pixel values from 0-255 to 0-1)
X_train = X_train / 255.0
X_test = X_test / 255.0
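A quick sanity check after splitting is to print the resulting shapes. This sketch uses random arrays as a stand-in for the real flattened images, just to show what you should expect:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 100 flattened 64x64 "images" (4,096 features each)
X_demo = np.random.randint(0, 256, size=(100, 4096))
y_demo = np.random.randint(0, 2, size=100)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

print(X_tr.shape)  # (80, 4096)
print(X_te.shape)  # (20, 4096)
```

If the second dimension differs between rows, one of your images was resized incorrectly, so it is worth checking this before training.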
Step 3: Creating and Training the Model
We will use a Support Vector Machine (SVM) for this task. SVMs are powerful classifiers that work well with high-dimensional data (like flattened images).
from sklearn.svm import SVC
# Initialize the SVM model
model = SVC(kernel='poly', gamma='auto')
# Train the model
print("Training started... This may take a few minutes.")
model.fit(X_train, y_train)
print("Training complete!")
Step 4: Evaluating the Results
Once the model is trained, we need to check its accuracy. We will use the testing set we put aside earlier to see how many images it classifies correctly.
from sklearn.metrics import accuracy_score, classification_report
# Make predictions
y_pred = model.predict(X_test)
# Check accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Model Accuracy: {accuracy * 100:.2f}%")
# Detailed report
print(classification_report(y_test, y_pred, target_names=['Cat', 'Dog']))
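Accuracy alone can hide which class the model struggles with. A confusion matrix breaks the errors down per class; here is a minimal sketch using small made-up label arrays in place of the real y_test and y_pred:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Made-up stand-ins for y_test and y_pred (0 = cat, 1 = dog)
y_true_demo = np.array([0, 0, 1, 1, 1, 0, 1, 0])
y_pred_demo = np.array([0, 1, 1, 1, 0, 0, 1, 0])

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true_demo, y_pred_demo)
print(cm)  # [[3 1]
           #  [1 3]]
```

Here the top-right cell counts cats misclassified as dogs, and the bottom-left cell counts dogs misclassified as cats. Running the same call on your actual y_test and y_pred tells you whether the model is biased toward one class.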
Step 5: Visualizing Predictions
It is always helpful to see the results visually. Here is a small snippet to display one of the test images and the model's prediction.
import matplotlib.pyplot as plt
def predict_image(index):
    # Display the grayscale test image
    plt.imshow(X_test[index].reshape(64, 64), cmap='gray')
    # predict() expects a 2D array, so wrap the single sample in a list
    prediction = model.predict([X_test[index]])
    label = 'Dog' if prediction[0] == 1 else 'Cat'
    plt.title(f"Prediction: {label}")
    plt.show()

# Predict the first image in the test set
predict_image(0)
Conclusion and Next Steps
You have just built a functional AI classifier from scratch! While a simple SVM won't match the accuracy of modern deep learning models, you have successfully implemented the entire machine learning pipeline: data collection, preprocessing, training, and evaluation.
How to improve this model:
- Increase Dataset Size: The more images the model sees, the better it gets.
- Use Color: Instead of grayscale, try using RGB values (though this will require more processing power).
- Hyperparameter Tuning: Experiment with different SVM kernels like 'rbf' or 'linear'.
- Deep Learning: If you want higher accuracy, the next logical step is to explore Convolutional Neural Networks (CNNs) using TensorFlow or PyTorch.
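The hyperparameter-tuning suggestion above can be automated with scikit-learn's GridSearchCV, which tries every combination of parameters with cross-validation and keeps the best one. Here is a sketch using random data as a stand-in for the real X_train and y_train (the parameter values shown are just a starting grid, not tuned recommendations):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in for the flattened-image data (200 samples, 64 features)
rng = np.random.default_rng(42)
X_demo = rng.random((200, 64))
y_demo = rng.integers(0, 2, size=200)

# Try a few kernel / C combinations with 3-fold cross-validation
param_grid = {
    'kernel': ['linear', 'poly', 'rbf'],
    'C': [0.1, 1, 10],
}
search = GridSearchCV(SVC(gamma='auto'), param_grid, cv=3)
search.fit(X_demo, y_demo)

print("Best parameters:", search.best_params_)
print(f"Best CV accuracy: {search.best_score_:.2f}")
```

On the real dataset this can take a while, since each parameter combination trains a separate SVM, so start with a small grid and widen it once you see which region works best.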
Machine learning is a journey of constant experimentation. Keep tweaking your parameters, try different images, and most importantly, keep coding!