CNNs for Pet Breed Classification

Generated from prompt:

A 15-slide presentation titled 'Image Classification with CNNs' for a university project in computer vision. The slides include: 1) Title and intro, 2) Motivation & Goal, 3) Dataset overview (Oxford-IIIT Pets), 4) Preprocessing, 5) CNN Basics, 6) Transfer Learning, 7) Models Used, 8) Custom CNN, 9) Training Setup, 10) Training Curves, 11) Confusion Matrices, 12) Performance Comparison, 13) Key Findings, 14) Limitations & Future Work, and 15) Conclusion. Each slide includes both short bullet points (for slide text) and a speaker script for presentation delivery. The design should be clean, academic, and visual, with consistent layout and modern color palette.

University project presentation on classifying 37 pet breeds from the Oxford-IIIT Pets dataset using CNNs. Covers preprocessing, basics, transfer learning (VGG/ResNet), custom CNN, training curves, confusion matrices, performance comparison, and key findings.

December 14, 2025
15 slides

Slide 1 - Image Classification with CNNs

This is a title slide for a presentation on "Image Classification with CNNs." It features the subtitle "University Project" along with placeholders for "Your Name, Date."

University Project

Your Name, Date

Speaker Notes
Welcome! Today, we'll explore CNNs for pet image classification using the Oxford-IIIT Pets dataset, from basics to results.

Slide 2 - Motivation & Goal

Accurate pet breed identification is vital for vets and apps, motivating the goal to build and train CNNs on 37 breeds. The target is over 80% accuracy, comparing custom versus transfer learning models.

Motivation & Goal

  • Accurate pet breed ID vital for vets/apps
  • Goal: Build/train CNNs on 37 breeds
  • Target: Achieve >80% accuracy
  • Compare custom vs. transfer learning models
Speaker Notes
Motivation: Automate breed detection. Goal: Compare custom/transfer models on Pets dataset.

Slide 3 - Dataset Overview (Oxford-IIIT Pets)

The slide "Dataset Overview (Oxford-IIIT Pets)" features key statistics on the left: 7,349 images across 37 cat and dog breeds (~200 per breed) with fine head/pose annotations. The right side displays diverse sample images highlighting breed variations in appearance and poses for robust training.

Dataset Overview (Oxford-IIIT Pets)

Key Statistics
  • 7,349 images
  • 37 breeds (cats & dogs)
  • ~200 images per breed
  • Fine annotations (head/pose masks)

Sample Images
  • Diverse examples of cat and dog breeds from the dataset, highlighting variations in appearance and poses for robust training.

Source: Oxford-IIIT Pets Dataset

Speaker Notes
Oxford-IIIT Pets: 37 cat/dog breeds, head/pose masks. ~200 imgs/breed for robust training.

Slide 4 - Preprocessing

The Preprocessing workflow slide outlines steps for handling raw Oxford-IIIT Pets images. It covers resizing to 224x224, normalizing pixels to [0,1], augmenting with rotations/flips for variety and class balance, and converting to framework tensors like PyTorch.

Preprocessing

{ "headers": [ "Step", "Operation", "Details" ], "rows": [ [ "Raw", "Input", "Original Oxford-IIIT Pets images" ], [ "Resize", "Scale", "To 224x224 (standard CNN input)" ], [ "Normalize", "Scale pixels", "To [0,1] range for stability" ], [ "Augment", "Apply transformations", "Rotation/Flip to add variety & balance classes" ], [ "Tensors", "Convert", "To framework tensors (e.g., PyTorch)" ] ] }

Workflow: Raw → Resize (224x224) → Normalize → Augment (rotate/flip) → Tensors
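A minimal torchvision sketch of this pipeline; the rotation angle and split name are assumptions, and torchvision ships a loader for this dataset:

```python
# Preprocessing sketch with torchvision; augmentation parameters are assumed.
from torchvision import transforms
from torchvision.datasets import OxfordIIITPet

train_transform = transforms.Compose([
    transforms.Resize((224, 224)),        # standard CNN input size
    transforms.RandomHorizontalFlip(),    # augmentation: random flip
    transforms.RandomRotation(15),        # augmentation: rotation (+/-15 deg assumed)
    transforms.ToTensor(),                # to tensor; scales pixels to [0, 1]
])

# torchvision includes a loader for the Oxford-IIIT Pets dataset
train_set = OxfordIIITPet(root="data", split="trainval",
                          transform=train_transform, download=True)
```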

Speaker Notes
Steps: Resize to 224x224, normalize [0,1], augment for variety, convert to tensors. Handles class imbalance.

Slide 5 - CNN Basics

CNN Basics slide outlines key components: convolutional layers extract features like edges and textures, pooling layers downsample spatial dimensions, and fully connected layers classify via softmax probabilities. ReLU activation introduces non-linearity, while backpropagation trains network weights.

CNN Basics

  • Convolutional layers: Extract features (edges, textures)
  • Pooling layers: Downsample spatial dimensions
  • Fully connected layers: Classify via softmax probabilities
  • ReLU activation: Introduce non-linearity
  • Backpropagation: Train network weights

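For illustration, a minimal PyTorch sketch of these building blocks applied to one image-sized tensor; the 16-filter width is an arbitrary choice:

```python
# One pass through the basic CNN components (PyTorch); sizes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(1, 3, 224, 224)                    # a batch of one RGB image

conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)  # convolution: feature maps
feat = F.relu(conv(x))                             # ReLU: non-linearity
feat = F.max_pool2d(feat, 2)                       # pooling: 224 -> 112 per side

fc = nn.Linear(16 * 112 * 112, 37)                 # fully connected classifier
probs = F.softmax(fc(feat.flatten(1)), dim=1)      # softmax over 37 breeds
```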

Speaker Notes
CNNs: Convs detect edges/textures, pooling reduces dims, FC outputs probs via softmax. Backprop trains.

Slide 6 - Transfer Learning

This slide on Transfer Learning shows a diagram with the key steps: freeze the pretrained base and fine-tune the classifier. It emphasizes that this method needs less data and fewer epochs.

Transfer Learning

[Diagram: transfer learning with a frozen pretrained base and a fine-tuned classifier head]

  • Freeze pretrained base
  • Fine-tune classifier
  • Fewer data/epochs needed

Source: Wikipedia - Transfer learning
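A sketch of the freeze-then-fine-tune recipe in PyTorch; the weights enum follows torchvision's current API, and the deck does not specify exactly which layers were frozen:

```python
# Transfer learning sketch: freeze the pretrained base, swap in a new head.
import torch.nn as nn
from torchvision import models

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)

for param in model.parameters():      # freeze the pretrained backbone
    param.requires_grad = False

model.fc = nn.Linear(model.fc.in_features, 37)  # trainable head for 37 breeds
```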

Speaker Notes
Use ImageNet-pretrained nets like VGG16/ResNet50. Freeze early layers, train top for pets. Faster convergence.

Slide 7 - Models Used

The "Models Used" slide features a grid showcasing four key models: VGG16 Backbone with ImageNet-pretrained convolutions, ResNet50 Residuals using skip connections for deep training, EfficientNet Transfer for balanced accuracy and efficiency, and a Custom CNN optimized for pet classification. Each entry includes an icon and a concise description of its strengths.

Models Used

{ "features": [ { "icon": "🧱", "heading": "VGG16 Backbone", "description": "Deep stacked convolutions pretrained on ImageNet for features." }, { "icon": "šŸ”—", "heading": "ResNet50 Residuals", "description": "Skip connections enable very deep network training." }, { "icon": "⚔", "heading": "EfficientNet Transfer", "description": "Balances accuracy, efficiency using transfer learning." }, { "icon": "šŸ”§", "heading": "Custom CNN Design", "description": "Tailored architecture optimized for pet classification." } ] }

Speaker Notes
Compared: VGG16, ResNet50, EfficientNet (transfer), plus custom CNN. Selected for depth/performance balance.

Slide 8 - Custom CNN

The Custom CNN slide depicts a neural network architecture with three convolutional blocks (64 → 128 → 256 filters), each followed by max pooling. It concludes with two fully connected layers (512 → 37 neurons) and dropout at a rate of 0.5.

Custom CNN

[Diagram: custom CNN architecture with three conv/pool blocks and fully connected layers]

  • 3 convolutional blocks (64 → 128 → 256 filters)
  • Max pooling after each conv block
  • 2 fully connected layers (512 → 37)
  • Dropout with rate 0.5

Source: Wikipedia
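A sketch of this architecture in PyTorch; kernel sizes, padding, and the 224x224 input are assumptions consistent with the preprocessing slide:

```python
# Custom CNN sketch: 3 conv blocks, max pooling, FC(512 -> 37), dropout 0.5.
import torch.nn as nn

class CustomCNN(nn.Module):
    def __init__(self, num_classes: int = 37):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),    # 224 -> 112
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 112 -> 56
            nn.Conv2d(128, 256, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2), # 56 -> 28
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(256 * 28 * 28, 512), nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(512, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```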

Speaker Notes
Custom: 3 conv (64-128-256 filters), pools, FC(512-37), dropout. Simple yet effective baseline.

Slide 9 - Training Setup

The "Training Setup" slide outlines key hyperparameters in a table format. It specifies 50 epochs, a 1e-4 learning rate, batch size of 32, Adam optimizer, and Categorical Cross-Entropy loss.

Training Setup

{ "headers": [ "Hyperparameter", "Value" ], "rows": [ [ "Epochs", "50" ], [ "Learning Rate (LR)", "1e-4" ], [ "Batch Size", "32" ], [ "Optimizer", "Adam" ], [ "Loss", "Categorical Cross-Entropy" ] ] }

Speaker Notes
Trained 50 epochs, Adam 1e-4 LR, batch 32 on GPU. 80/20 train/val split, early stop.

Slide 10 - Training Curves

The Training Curves slide reports a peak validation accuracy of 82% for the from-scratch model and 89% for VGG transfer learning using pre-trained weights. Overfitting risk is low, mitigated by dropout layers.

Training Curves

  • 82% custom validation accuracy: peak for the from-scratch model
  • 89% VGG transfer accuracy: boost from pre-trained weights
  • Low overfitting risk: mitigated by dropout layers
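A plotting sketch for curves like these with matplotlib; the history values below are placeholders, not the project's actual numbers:

```python
import matplotlib.pyplot as plt

# Placeholder per-epoch accuracies for illustration only; the real values
# would be recorded during training.
history = {
    "train_acc": [0.35, 0.58, 0.71, 0.79, 0.84],
    "val_acc":   [0.30, 0.52, 0.66, 0.76, 0.82],
}

plt.plot(history["train_acc"], label="train")
plt.plot(history["val_acc"], label="validation")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.legend()
plt.title("Training curves")
plt.show()
```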

Speaker Notes
Curves show convergence: Custom peaks ~82%, transfer models higher ~89%. Minimal overfitting via dropout.

Slide 11 - Confusion Matrices

The slide displays confusion matrices for VGG and ResNet on cat breed classification. VGG and ResNet excel on similar breeds like Siamese and Persian, with errors in fine-grained distinctions and VGG showing the strongest diagonal performance in heatmaps.

Confusion Matrices

[Image: confusion-matrix heatmaps for VGG and ResNet]

  • VGG and ResNet excel on similar breeds like Siamese and Persian
  • Common errors occur in fine-grained cat breed distinctions
  • VGG demonstrates strongest diagonal performance in heatmaps

Source: Confusion matrix
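A scikit-learn sketch of producing such a heatmap; y_true and y_pred below are synthetic placeholders standing in for validation predictions:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay

# Synthetic labels/predictions for illustration; in the project these come
# from running the trained model over the validation set.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 37, size=1000)
y_pred = np.where(rng.random(1000) < 0.9,          # ~90% correct...
                  y_true,
                  rng.integers(0, 37, size=1000))  # ...rest random errors

ConfusionMatrixDisplay.from_predictions(y_true, y_pred,
                                        include_values=False, cmap="Blues")
plt.title("Confusion matrix (37 breeds)")
plt.show()
```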

Speaker Notes
Confusion matrices: VGG/ResNet excel on similar breeds (e.g., Siamese/Persian). Errors in fine-grained cats.

Slide 12 - Performance Comparison

The "Performance Comparison" slide features a table comparing Top-1 and Top-5 accuracies across three models. ResNet leads with 91.2% Top-1 and 99% Top-5, followed by VGG16 at 89.3% and 98%, and Custom at 82.1% and 95%.

Performance Comparison

{ "headers": [ "Model", "Top-1", "Top-5" ], "rows": [ [ "Custom", "82.1%", "95%" ], [ "VGG16", "89.3%", "98%" ], [ "ResNet", "91.2%", "99%" ] ] }

Speaker Notes
ResNet50 tops at 91.2% Top-1, custom solid baseline. Transfer learning boosts +9%.

Slide 13 - Key Findings

Transfer learning outperforms custom CNNs, deeper networks improve performance, and data augmentation prevents overfitting. The Pets dataset poses challenges for fine-grained recognition.

Key Findings

  • Transfer learning outperforms custom CNNs
  • Deeper networks improve performance
  • Data augmentation prevents overfitting
  • Pets dataset challenges fine-grained recognition
Speaker Notes
Key: Transfer learning shines, depth helps, data aug reduces overfit. Pets dataset tough due to subtle diffs.

Slide 14 - Limitations & Future Work

The slide highlights limitations such as a small dataset causing breed confusion, imbalance restricting performance, and unutilized animal poses. Future work proposes larger balanced datasets, Vision Transformers (ViT), and ensemble methods for state-of-the-art results.

Limitations & Future Work

  • Small dataset causes breed confusion
  • Dataset imbalance limits performance
  • Animal poses not utilized
  • Future: Larger, balanced datasets
  • Future: Vision Transformers (ViT)
  • Future: Ensemble methods for SOTA
Speaker Notes
Limits: Imbalance, pose ignored. Future: Larger sets, Vision Transformers, ensemble for SOTA.

Slide 15 - Conclusion

The conclusion slide emphasizes that CNNs are powerful for images, transfer learning is efficient, and 91% accuracy was achieved on the Pets dataset. It concludes with an invitation for Q&A.

Conclusion

  • CNNs powerful for images
  • Transfer learning efficient
  • Achieved 91% on Pets

Q&A?

Speaker Notes
Summary: Built strong classifiers, transfer key. Great results for uni project! Thank you for your attention. Any questions?
