Build with AI Bootcamp - Workshop Guide

Section 0

Setup & Environment

We start by setting up Google Colab and enabling a GPU. A GPU matters because it can speed up model training by roughly 10-100x compared to a CPU, depending on the model and the data.

Enable GPU

In Google Colab, go to Runtime → Change runtime type
Under Hardware accelerator, select GPU (the T4 is available on the free tier)
Click Save

GPU Check

# Check if GPU is available
import tensorflow as tf

print("TensorFlow version:", tf.__version__)
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    print(f"GPU available! Found {len(gpus)} device(s)")
else:
    print("WARNING: GPU not found! Go to Runtime → Change runtime type → GPU")

Install & Import Libraries

# Installation
!pip install gradio gdown -q

# Imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import os, zipfile, gdown

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator

from sklearn.metrics import confusion_matrix, classification_report
from sklearn.model_selection import train_test_split
from PIL import Image

import warnings
warnings.filterwarnings('ignore')

plt.style.use('seaborn-v0_8-whitegrid')
plt.rcParams['figure.figsize'] = [12, 8]
plt.rcParams['font.size'] = 12

print("All libraries loaded successfully!")

Section 1

Download & Load Dataset

We use the Fashion Product Images dataset, which contains real product photos. We work with 4 classes: T-shirt, Jeans, Sneakers, Jacket.

Download

FILE_ID = '1YpceXZvh-8idz-nKDJjUh3-nrlVvhyYY'
print("Downloading Fashion dataset...")

url = f'https://drive.google.com/uc?id={FILE_ID}'
output = 'fashion_dataset.zip'
gdown.download(url, output, quiet=False)

with zipfile.ZipFile(output, 'r') as zip_ref:
    zip_ref.extractall('.')
os.remove(output)
print("Dataset downloaded and extracted!")

Locate & Load

def pronadji_dataset():
    for folder in ['fashion_subset gdg', 'fashion_subset', 'fashion_dataset']:
        if os.path.exists(folder):
            if os.path.exists(os.path.join(folder, 'styles.csv')):
                return folder
    for item in os.listdir('.'):
        if os.path.isdir(item) and os.path.exists(os.path.join(item, 'styles.csv')):
            return item
    return None

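DATASET_PATH = pronadji_dataset()  # pronadji_dataset = "find dataset"
if DATASET_PATH is None:
    # Fail early with a clear message instead of a TypeError in os.path.join
    raise FileNotFoundError("Could not find a folder containing styles.csv")
STYLES_PATH = os.path.join(DATASET_PATH, 'styles.csv')
IMAGES_PATH = os.path.join(DATASET_PATH, 'images')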

df = pd.read_csv(STYLES_PATH, on_bad_lines='skip')
print(f"Loaded {len(df):,} products")

Select Classes & Load Images

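This step uses the same code as the Serbian version of the guide, since the code does not depend on the language. Copy it from the Serbian page or from the notebook. The following sections assume this step has produced the image array X and the label array y.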

Section 2

Preprocessing & Data Splitting

Before training we need to normalize the images and split data into training, validation and test sets.

Pixel values range from 0 to 255. Neural networks train more reliably on small inputs (0.0 to 1.0). What number should we divide by?

X = X.astype('float32') / _____  # <-- FILL IN: what number do we divide by?

Hint: What is the maximum pixel value? Dividing by it makes the max become 1.0.
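
As a small illustration of the scaling step, the sketch below applies it to a made-up 2x2 grayscale patch. The array `patch` is hypothetical and only stands in for real image data; the point is that the result is a float32 array whose values lie between 0.0 and 1.0.

```python
import numpy as np

# Hypothetical 2x2 patch with raw 8-bit pixel values (not from the dataset).
patch = np.array([[0, 64], [128, 255]], dtype=np.uint8)

# Convert to float32 first, then divide by the largest possible pixel value.
patch_norm = patch.astype('float32') / 255.0

print(patch_norm.min(), patch_norm.max())  # 0.0 1.0
print(patch_norm.dtype)                    # float32
```

Converting to float32 before dividing keeps the array at half the memory of the default float64, which adds up when thousands of images are loaded.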

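Split the data into 70% training, 15% validation and 15% test. The first split below sets aside the test set; the remainder (X_temp, y_temp) is then split again into training and validation sets.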

X_temp, X_test, y_temp, y_test = train_test_split(
    X, y,
    test_size=_____,      # <-- FILL IN: what percentage for test?
    random_state=42,
    stratify=_____         # <-- FILL IN: stratify by what? (labels)
)

Hint: test_size=0.15 means 15%. stratify=y preserves class proportions.
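
Getting 70/15/15 takes two calls to train_test_split, because each call only splits in two. The sketch below shows one possible way to do it on stand-in data; `X_demo`, `y_demo` and the sizes are hypothetical, and passing absolute counts for test_size is a choice made here to avoid rounding surprises, not necessarily what the notebook does. With fractions, the second split would need 0.15 / 0.85 ≈ 0.176 of the remainder, not 0.15.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 200 tiny "images" in 4 balanced classes.
X_demo = np.zeros((200, 8, 8, 3), dtype='float32')
y_demo = np.repeat(np.arange(4), 50)

n_total = len(X_demo)
n_test = round(0.15 * n_total)  # 30 samples
n_val = round(0.15 * n_total)   # 30 samples

# Step 1: set aside the test set.
X_temp, X_test, y_temp, y_test = train_test_split(
    X_demo, y_demo, test_size=n_test, random_state=42, stratify=y_demo)

# Step 2: split the remainder into training and validation sets.
X_train, X_val, y_train, y_val = train_test_split(
    X_temp, y_temp, test_size=n_val, random_state=42, stratify=y_temp)

print(len(X_train), len(X_val), len(X_test))  # 140 30 30
```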

Section 3

CNN Architecture

A CNN (convolutional neural network) is a type of neural network specialized for images. Our model has 3 convolutional blocks with an increasing number of filters: 32 → 64 → 128.

Layer	Function
Conv2D	Detects patterns (edges, textures, shapes)
BatchNormalization	Stabilizes and speeds up training
MaxPooling2D	Reduces dimensions, keeps important info
Dropout	Prevents overfitting (randomly disables neurons)
Dense	Fully connected layer for classification
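
To make the roles of Conv2D and MaxPooling2D concrete, the sketch below traces how the feature map shape and the number of convolution parameters change across three blocks. It is a simplification under stated assumptions: one 3x3 Conv2D with padding='same' per block, 2x2 pooling, and a hypothetical 64x64 RGB input (the workshop's INPUT_SHAPE may differ). The helper names `conv_params` and `trace_blocks` are made up for this example.

```python
def conv_params(kernel_h, kernel_w, in_channels, filters):
    """Trainable parameters of a Conv2D layer: weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def trace_blocks(height, width, channels, filter_counts, kernel=3, pool=2):
    """Follow a feature map through Conv2D(padding='same') + MaxPooling2D blocks."""
    rows = []
    for filters in filter_counts:
        params = conv_params(kernel, kernel, channels, filters)
        # 'same' padding keeps height and width; pooling divides them (rounding down).
        height, width, channels = height // pool, width // pool, filters
        rows.append((filters, params, (height, width, channels)))
    return rows


# Hypothetical 64x64 RGB input.
for filters, params, shape in trace_blocks(64, 64, 3, [32, 64, 128]):
    print(filters, params, shape)
# 32 896 (32, 32, 32)
# 64 18496 (16, 16, 64)
# 128 73856 (8, 8, 128)
```

Each block halves the height and width while the number of channels grows, so later layers see a smaller but richer description of the image.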

The most important part! Fill in the blanks in Block 1 and the last layer. Blocks 2 and 3 are given as reference.

# Block 1: fill in filters, kernel, activation, pooling, dropout
layers.Conv2D(_____, (_____, _____), activation='_____', padding='same', input_shape=INPUT_SHAPE),
layers.MaxPooling2D((_____, _____)),
layers.Dropout(_____),

# Last layer: how many neurons? which activation?
layers.Dense(_____, activation='_____')

Hints: Block 1 has 32 filters, kernel 3x3, activation 'relu'. Pooling is 2x2. Dropout is 0.25. Last Dense has as many neurons as classes, with 'softmax' activation.

Section 4

Compilation & Training

Compilation tells the model how to learn: which algorithm, how to measure error, and what to track.

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=_____),  # <-- learning rate
    loss='_____',                                          # <-- loss function
    metrics=['_____']                                      # <-- what to track?
)

Hint: learning_rate=0.001 | loss='sparse_categorical_crossentropy' | metrics=['accuracy']
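
To show what the loss and the metric actually measure, here is a minimal NumPy sketch of sparse categorical crossentropy and accuracy on two made-up predictions. It is an illustration only: Keras' own implementation also clips probabilities and handles batching, and the function name and example values below are assumptions. "Sparse" means the labels are plain integers (0, 1, 2, 3) rather than one-hot vectors.

```python
import numpy as np


def sparse_categorical_crossentropy(y_true, probs):
    """Mean of -log(probability the model assigned to the true class)."""
    probs = np.asarray(probs, dtype='float64')
    y_true = np.asarray(y_true)
    picked = probs[np.arange(len(y_true)), y_true]
    return float(np.mean(-np.log(picked)))


# Made-up softmax outputs for two images and their integer labels.
probs = [[0.7, 0.1, 0.1, 0.1],   # true class 0: confident and correct
         [0.1, 0.2, 0.6, 0.1]]   # true class 1: the model prefers class 2
labels = [0, 1]

loss = sparse_categorical_crossentropy(labels, probs)
accuracy = float(np.mean(np.argmax(probs, axis=1) == np.array(labels)))

print(round(loss, 3))  # 0.983
print(accuracy)        # 0.5
```

The loss penalizes the second prediction heavily because the true class received only 0.2 probability, while accuracy just counts it as one miss.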

Choose the number of epochs and the batch size. EarlyStopping will halt training if the model stops improving.

EPOCHS = _____       # <-- number of epochs (recommended: 25)
BATCH_SIZE = _____   # <-- batch size (recommended: 32)
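
The idea behind early stopping can be sketched in plain Python. This is a simplified model of the "patience" rule, not the Keras callback itself: it watches a list of made-up validation losses and stops once the loss has failed to improve for `patience` epochs in a row. The function name, the loss values and patience=3 are all assumptions for illustration.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop."""
    best = float('inf')
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            # New best validation loss: reset the counter.
            best, waited = loss, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    # Never ran out of patience: training uses all epochs.
    return len(val_losses)


# Made-up validation losses, one per epoch.
history = [0.90, 0.70, 0.60, 0.62, 0.61, 0.63, 0.59]
print(early_stopping_epoch(history, patience=3))  # 6
```

In this example training stops at epoch 6 and never reaches the better loss at epoch 7, which is why a patience that is too small can cut training short.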

Section 5

Model Evaluation

Time to see how good our model is on data it has never seen: the test set.

The code is the same as in the Serbian version. Copy it from the notebook or the Serbian page.
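
For readers without the notebook at hand, the sketch below shows how a confusion matrix is read, using scikit-learn's confusion_matrix on made-up labels. These numbers are not workshop results, and the class order in `class_names` is an assumption. In a real run, the predicted labels would typically come from taking the argmax of the model's output on the test set.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

class_names = ['T-shirt', 'Jeans', 'Sneakers', 'Jacket']  # assumed order

# Made-up labels for illustration, not workshop results.
y_true = np.array([0, 0, 1, 1, 2, 2, 3, 3])
y_pred = np.array([0, 1, 1, 1, 2, 3, 3, 3])

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)
accuracy = np.trace(cm) / cm.sum()

print(cm)
print(f"accuracy = {accuracy:.2f}")  # accuracy = 0.75
```

Off-diagonal entries show which classes get confused: here one T-shirt was predicted as Jeans and one pair of Sneakers as a Jacket.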

Section 6

AI Limitations - the model can't say "I don't know"

Key lesson: The model ALWAYS picks one of the 4 classes, even for completely meaningless images. It cannot say "I don't know". This is a fundamental limitation of a classifier with a fixed set of output classes.
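
The reason can be seen in the softmax output itself. The sketch below is an illustration with random numbers standing in for the model's raw outputs on a meaningless image (no real model is used, and the class order is an assumption): whatever the inputs are, the probabilities sum to 1 and argmax still returns one of the 4 classes.

```python
import numpy as np


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


class_names = ['T-shirt', 'Jeans', 'Sneakers', 'Jacket']  # assumed order

# Random scores as a stand-in for the model's output on a meaningless image.
rng = np.random.default_rng(0)
noise_logits = rng.normal(size=4)

probs = softmax(noise_logits)
predicted = class_names[int(np.argmax(probs))]

print(probs)
print("Predicted:", predicted)
```

With 4 classes the highest probability can never drop below 0.25, so there is always a "winner" even when the input makes no sense.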

Section 7

Data Augmentation

Augmentation artificially creates new images from existing ones: rotates, shifts, zooms, flips. The model sees "new" images every epoch, which helps generalization.

Fill in the parameters. Be careful: overly aggressive augmentation can hurt results!

datagen = ImageDataGenerator(
    rotation_range=_____,        # <-- degrees (recommended: 10)
    width_shift_range=_____,     # <-- horizontal shift (recommended: 0.1)
    height_shift_range=_____,    # <-- vertical shift (recommended: 0.1)
    zoom_range=_____,            # <-- zoom (recommended: 0.1)
    horizontal_flip=_____,       # <-- True or False?
    fill_mode='nearest'
)

Note: Augmentation isn't always better! If it is too aggressive or the dataset is too small, it can worsen results. That's a normal part of the ML process: we learn from mistakes too.
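
To make two of these transformations tangible, the sketch below applies a horizontal flip and a one-pixel horizontal shift to a tiny made-up array using plain NumPy. It only mimics what the generator does (the real one picks random amounts and also rotates and zooms); the array and variable names are hypothetical. The shift repeats the edge column to fill the gap, which is the idea behind fill_mode='nearest'.

```python
import numpy as np

# Hypothetical 3x3 single-channel "image".
image = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])

# Horizontal flip: mirror the image left to right.
flipped = np.fliplr(image)

# Shift right by one pixel, filling the gap with the nearest edge column.
shifted = np.concatenate([image[:, :1], image[:, :-1]], axis=1)

print(flipped)
# [[3 2 1]
#  [6 5 4]
#  [9 8 7]]
print(shifted)
# [[1 1 2]
#  [4 4 5]
#  [7 7 8]]
```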

Section 8

Deploy with Gradio

Final step: let's build a web app! Gradio lets you create ML interfaces with just a few lines of code.

The function receives an image, resizes it, normalizes it, and passes it to the model. Fill in dimensions and normalization.

slika_resized = slika_pil.resize((_____, _____))  # <-- (IMG_WIDTH, IMG_HEIGHT)
slika_norm = np.array(slika_resized).astype('float32') / _____  # <-- normalization
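
Two small steps usually surround these lines: adding a batch dimension before prediction, and turning the output probabilities into something readable. The sketch below shows both with NumPy only. Everything in it is illustrative: the helper names, the 64x64 input size, the class order and the fake probabilities are assumptions, and no model or Gradio call is made. A dictionary of class name to probability is, as far as we know, a format that Gradio's Label output can display.

```python
import numpy as np

CLASS_NAMES = ['T-shirt', 'Jeans', 'Sneakers', 'Jacket']  # assumed order


def add_batch_dimension(image_array):
    """Turn one (H, W, C) image into a batch of shape (1, H, W, C)."""
    return np.expand_dims(image_array, axis=0)


def to_label_scores(probs, class_names=CLASS_NAMES):
    """Map a 1-D probability vector to {class name: probability}."""
    return {name: float(p) for name, p in zip(class_names, probs)}


# Hypothetical normalized image; the real size is (IMG_WIDTH, IMG_HEIGHT).
fake_image = np.zeros((64, 64, 3), dtype='float32')
batch = add_batch_dimension(fake_image)

# Stand-in for the first row of the model's prediction on the batch.
fake_probs = np.array([0.1, 0.2, 0.6, 0.1])
scores = to_label_scores(fake_probs)

print(batch.shape)                  # (1, 64, 64, 3)
print(max(scores, key=scores.get))  # Sneakers
```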

Resources

What we learned & where to go next

Today we covered

AI/ML/DL concepts - epoch, batch, loss, accuracy
Real dataset - Fashion Product Images (3,866 images)
CNN architecture - Conv2D, BatchNorm, Pooling, Dropout
Evaluation - Confusion Matrix, Classification Report
AI limitations - the model can't say "I don't know"
Data Augmentation - rotation, shift, zoom, flip
Deploy - Gradio web application

Resources for further learning

Resource	Description	Link
TensorFlow	Official documentation	tensorflow.org
Keras	Neural networks API	keras.io
Google ML Crash Course	Free course by Google	ML Crash Course
Gradio	ML web applications	gradio.app
Kaggle	Datasets & competitions	kaggle.com
fast.ai	Free DL course	fast.ai