Workshop Guide
GDG Belgrade & VTŠ Apps Team
One-day workshop

Build with AI Bootcamp

From zero to a working AI web app that classifies fashion products with a Convolutional Neural Network (CNN).

April 25, 2026
FON, Belgrade
10:00 AM - 7:00 PM

Setup & Environment

We start by setting up Google Colab and enabling a GPU. The GPU matters because it can speed up model training by roughly 10-100x compared to a CPU.

Enable GPU

  1. In Google Colab go to Runtime → Change runtime type
  2. In Hardware accelerator select GPU (T4 is free)
  3. Click Save

GPU Check

Python
# Check if GPU is available
import tensorflow as tf

print("TensorFlow version:", tf.__version__)
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    print(f"GPU available! Found {len(gpus)} device(s)")
else:
    print("WARNING: GPU not found! Go to Runtime → Change runtime type → GPU")

Install & Import Libraries

Python
# Installation
!pip install gradio gdown -q

# Imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import os, zipfile, gdown

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator

from sklearn.metrics import confusion_matrix, classification_report
from sklearn.model_selection import train_test_split
from PIL import Image

import warnings
warnings.filterwarnings('ignore')

plt.style.use('seaborn-v0_8-whitegrid')
plt.rcParams['figure.figsize'] = [12, 8]
plt.rcParams['font.size'] = 12

print("All libraries loaded successfully!")

Download & Load Dataset

We use the Fashion Product Images dataset, which contains real product photos. We work with 4 classes: T-shirt, Jeans, Sneakers, Jacket.

Download

Python
FILE_ID = '1YpceXZvh-8idz-nKDJjUh3-nrlVvhyYY'
print("Downloading Fashion dataset...")

url = f'https://drive.google.com/uc?id={FILE_ID}'
output = 'fashion_dataset.zip'
gdown.download(url, output, quiet=False)

with zipfile.ZipFile(output, 'r') as zip_ref:
    zip_ref.extractall('.')
os.remove(output)
print("Dataset downloaded and extracted!")

Locate & Load

Python
def pronadji_dataset():
    for folder in ['fashion_subset gdg', 'fashion_subset', 'fashion_dataset']:
        if os.path.exists(folder):
            if os.path.exists(os.path.join(folder, 'styles.csv')):
                return folder
    for item in os.listdir('.'):
        if os.path.isdir(item) and os.path.exists(os.path.join(item, 'styles.csv')):
            return item
    return None

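DATASET_PATH = pronadji_dataset()
# Fail early with a clear message instead of crashing inside os.path.join(None, ...)
if DATASET_PATH is None:
    raise FileNotFoundError("No folder containing styles.csv was found. Re-run the download cell.")
STYLES_PATH = os.path.join(DATASET_PATH, 'styles.csv')
IMAGES_PATH = os.path.join(DATASET_PATH, 'images')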

df = pd.read_csv(STYLES_PATH, on_bad_lines='skip')
print(f"Loaded {len(df):,} products")

Select Classes & Load Images

This step uses the same code as the Serbian version, since the code is language-independent. Copy it from the Serbian page or the notebook.

Preprocessing & Data Splitting

Before training we need to normalize the images and split data into training, validation and test sets.

Task 1 Image Normalization

Pixel values range from 0 to 255. Neural networks train more reliably on small inputs, so we rescale the pixels to the range 0.0-1.0. What number should we divide by?

Python - fill in
X = X.astype('float32') / _____  # <-- FILL IN: what number do we divide by?
Hint: What is the maximum pixel value? Dividing by it makes the max become 1.0.
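
To make the rescaling concrete, here is a minimal sketch on a tiny made-up array. The name `pixels` and its values are illustrative only and are not workshop data; the division mirrors the one in the task.

```python
import numpy as np

# Hypothetical 2x2 grayscale "image" with 8-bit pixel values (0-255).
pixels = np.array([[0, 64], [128, 255]], dtype=np.uint8)

# Convert to float first, then divide by the largest value a pixel can take.
scaled = pixels.astype('float32') / 255.0

# Smallest and largest value after scaling.
print(scaled.min(), scaled.max())
```

Converting to `float32` before dividing matters: the original `uint8` array can only hold whole numbers from 0 to 255.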

Task 2 Data Splitting

Split data: 70% training, 15% validation, 15% test.

Python - fill in
X_temp, X_test, y_temp, y_test = train_test_split(
    X, y,
    test_size=_____,      # <-- FILL IN: what fraction for test? (e.g. 0.15 = 15%)
    random_state=42,
    stratify=_____         # <-- FILL IN: stratify by what? (labels)
)
Hint: test_size=0.15 means 15%. stratify=y preserves class proportions.
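
The call above only separates the test set. One way to get the full 70/15/15 split is a second call on the remaining data, sketched below on made-up arrays. The second step and the names `val_fraction`, `X_train` and `X_val` are assumptions; the notebook may do this differently.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Made-up stand-in data: 200 samples, 4 balanced classes (not the workshop dataset).
X = np.arange(200).reshape(200, 1)
y = np.repeat([0, 1, 2, 3], 50)

# Step 1: hold out 15% of everything as the test set.
X_temp, X_test, y_temp, y_test = train_test_split(
    X, y, test_size=0.15, random_state=42, stratify=y
)

# Step 2: the remaining 85% must yield 15% of the ORIGINAL data for validation,
# so the fraction to take from X_temp is 0.15 / 0.85 (about 0.176), not 0.15.
val_fraction = 0.15 / 0.85
X_train, X_val, y_train, y_val = train_test_split(
    X_temp, y_temp, test_size=val_fraction, random_state=42, stratify=y_temp
)

print(len(X_train), len(X_val), len(X_test))
```

Passing `test_size=0.15` in the second call would give a validation set of only about 12.75% of the original data, which is why the fraction is rescaled.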

CNN Architecture

A CNN (Convolutional Neural Network) is a type of neural network specialized for images. Our model has 3 convolutional blocks with an increasing number of filters: 32 → 64 → 128.

Layer | Function
Conv2D | Detects patterns (edges, textures, shapes)
BatchNormalization | Stabilizes and speeds up training
MaxPooling2D | Reduces dimensions, keeps important info
Dropout | Prevents overfitting (randomly disables neurons)
Dense | Fully connected layer for classification

Task 3 Build the CNN Model

The most important part! Fill in the blanks in Block 1 and the last layer. Blocks 2 and 3 are given as reference.

Python - fill in
# Block 1: fill in filters, kernel, activation, pooling, dropout
layers.Conv2D(_____, (_____, _____), activation='_____', padding='same', input_shape=INPUT_SHAPE),
layers.MaxPooling2D((_____, _____)),
layers.Dropout(_____),

# Last layer: how many neurons? which activation?
layers.Dense(_____, activation='_____')
Hints: Block 1 has 32 filters, kernel 3x3, activation 'relu'. Pooling is 2x2. Dropout is 0.25. Last Dense has as many neurons as classes, with 'softmax' activation.
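
To see how the three blocks reshape an image, the sketch below traces feature-map sizes in plain Python without building a model. It assumes every block uses `padding='same'` and 2x2 pooling as in Block 1. The helper `trace_shapes` and the 64x64x3 input size are hypothetical; the workshop's `INPUT_SHAPE` may differ.

```python
def trace_shapes(input_shape, filters=(32, 64, 128), pool=2):
    """Return the feature-map shape after each convolutional block.

    Assumes Conv2D with padding='same' (height and width unchanged)
    followed by MaxPooling2D with a pool x pool window.
    """
    height, width, _ = input_shape
    shapes = []
    for n_filters in filters:
        # Conv2D(padding='same') keeps height/width; channels become n_filters.
        # MaxPooling2D then divides height and width by the pool factor.
        height, width = height // pool, width // pool
        shapes.append((height, width, n_filters))
    return shapes

# Hypothetical input size, chosen only for illustration.
for i, shape in enumerate(trace_shapes((64, 64, 3)), start=1):
    print(f"Block {i}: {shape}")
```

Each block halves the spatial size while the number of channels grows, so later layers see a smaller but richer description of the image.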

Compilation & Training

Task 4 Model Compilation

Compilation tells the model how to learn: which algorithm, how to measure error, and what to track.

Python - fill in
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=_____),  # <-- learning rate
    loss='_____',                                          # <-- loss function
    metrics=['_____']                                      # <-- what to track?
)
Hint: learning_rate=0.001 | loss='sparse_categorical_crossentropy' | metrics=['accuracy']
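
The loss named in the hint can be computed by hand: for each image, take the probability the model assigned to the correct class and average the negative logarithms. The function below is an illustrative NumPy re-implementation under that definition, not the Keras code, and the predictions are made up.

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, y_prob):
    """Mean of -log(probability assigned to the correct class)."""
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob, dtype=float)
    # For each row, pick the probability at the index of the true class.
    picked = y_prob[np.arange(len(y_true)), y_true]
    return float(np.mean(-np.log(picked)))

# Made-up predictions for 2 images over 4 classes (each row sums to 1).
probs = [
    [0.70, 0.10, 0.10, 0.10],  # fairly confident in class 0
    [0.25, 0.25, 0.25, 0.25],  # a pure guess
]
labels = [0, 2]  # integer class indices, not one-hot vectors

print(sparse_categorical_crossentropy(labels, probs))
```

The "sparse" in the name refers to the labels being plain integers such as `0` or `2`; one-hot encoded labels would call for `categorical_crossentropy` instead.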

Task 5 Model Training

Choose the number of epochs and the batch size. EarlyStopping will halt training if the model stops improving.

Python - fill in
EPOCHS = _____       # <-- number of epochs (recommended: 25)
BATCH_SIZE = _____   # <-- batch size (recommended: 32)
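
The text mentions EarlyStopping without showing its settings, so here is a plain-Python sketch of the idea only: stop once the validation loss has failed to improve for a set number of epochs in a row. The function `early_stopping_epoch`, the `patience` value and the loss history are all hypothetical; this is not how the notebook configures the Keras callback.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop.

    Training stops once `patience` epochs in a row fail to improve on the
    best validation loss seen so far; otherwise it runs to the last epoch.
    """
    best = float('inf')
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses)

# Made-up validation losses: improvement stalls after epoch 3.
history = [0.90, 0.70, 0.60, 0.62, 0.61, 0.63, 0.59]
print(early_stopping_epoch(history, patience=3))
```

With a small patience the run ends before the late improvement at epoch 7 is ever reached, which is why EPOCHS acts as an upper limit rather than a fixed count.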

Model Evaluation

Time to see how well our model performs on data it has never seen: the test set.

Code is the same as in the Serbian version. Copy from the notebook or the Serbian page.
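
For readers without the notebook at hand, the sketch below shows the kind of numbers an evaluation typically produces, using made-up labels and predictions rather than real model output. The class order in `class_names` is an assumption. In a real run, `y_pred` would usually come from taking the highest-probability class of the model's predictions on the test set.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Assumed class order, for illustration only.
class_names = ['T-shirt', 'Jeans', 'Sneakers', 'Jacket']

# Made-up true labels and predictions for 8 test images.
y_true = np.array([0, 0, 1, 1, 2, 2, 3, 3])
y_pred = np.array([0, 3, 1, 1, 2, 2, 3, 0])

# Accuracy: the share of images whose predicted class matches the true class.
accuracy = np.mean(y_true == y_pred)

# Confusion matrix: rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)

print(f"Accuracy: {accuracy:.2f}")
print(cm)
```

Off-diagonal cells show which classes get mixed up. In this made-up example the only errors are between T-shirt and Jacket.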

AI Limitations - the model can't say "I don't know"

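Key lesson: The model ALWAYS picks one of the 4 classes, even for completely meaningless images, because its softmax output spreads a total probability of 1 across the classes it was trained on. It cannot say "I don't know". This is a fundamental limitation of classifiers with a fixed set of output classes.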
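
The limitation is easy to reproduce without a trained model. The sketch below feeds made-up raw scores, standing in for what a network might produce on random noise, through a softmax. The class order and the scores are assumptions for illustration.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

# Assumed class order, for illustration only.
class_names = ['T-shirt', 'Jeans', 'Sneakers', 'Jacket']

# Made-up raw scores in place of a real model's output on a noise image.
rng = np.random.default_rng(0)
logits = rng.normal(size=4)

probs = softmax(logits)
winner = class_names[int(np.argmax(probs))]

print(probs.sum())  # the probabilities always add up to 1 (up to rounding)
print(winner)       # so the top class is always one of the 4 known names
```

Whatever the input, there is no fifth output for "none of these", so the highest score always maps to a known class.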

Data Augmentation

Augmentation artificially creates new images from existing ones by rotating, shifting, zooming, and flipping them. The model sees "new" images every epoch, which helps generalization.

Task 6 Define Augmentation

Fill in the parameters. Be careful: overly aggressive augmentation can hurt results!

Python - fill in
datagen = ImageDataGenerator(
    rotation_range=_____,        # <-- degrees (recommended: 10)
    width_shift_range=_____,     # <-- horizontal shift (recommended: 0.1)
    height_shift_range=_____,    # <-- vertical shift (recommended: 0.1)
    zoom_range=_____,            # <-- zoom (recommended: 0.1)
    horizontal_flip=_____,       # <-- True or False?
    fill_mode='nearest'
)
Note: Augmentation isn't always better! If it is too aggressive or the dataset is too small, it can worsen results. That's a normal part of the ML process. We learn from mistakes too.
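
Two of these transformations can be shown on a tiny made-up array with plain NumPy. This is only an illustration of what a flip and a shift do to pixels; it is not how `ImageDataGenerator` is implemented, and the 3x3 "image" is invented for readability.

```python
import numpy as np

# Tiny made-up 3x3 "image" so the effect is easy to read.
image = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])

# Horizontal flip: mirror the columns left to right.
flipped = image[:, ::-1]

# Shift right by one pixel, repeating the left edge column to fill the gap
# (similar in spirit to fill_mode='nearest').
shifted = np.concatenate([image[:, :1], image[:, :-1]], axis=1)

print(flipped)
print(shifted)
```

The label of the image stays the same in both cases. Only the pixels move, which is why a flipped T-shirt still counts as a T-shirt.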

Deploy with Gradio

Final step: let's build a web app! Gradio lets you create ML interfaces with just a few lines of code.

Task 7 Classification Function

The function receives an image, resizes it, normalizes it, and passes it to the model. Fill in dimensions and normalization.

Python - fill in
slika_resized = slika_pil.resize((_____, _____))  # <-- (IMG_WIDTH, IMG_HEIGHT)
slika_norm = np.array(slika_resized).astype('float32') / _____  # <-- normalization
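
The steps described above can be sketched end to end without a trained model. Everything named here is hypothetical: the 64x64 size, the class order, and the helpers `preprocess` and `to_label_scores`. Use the size and class order your own model was trained with. A dictionary of class names and scores is the kind of result Gradio's Label component can display.

```python
import numpy as np
from PIL import Image

# Hypothetical values, chosen only for illustration.
IMG_WIDTH, IMG_HEIGHT = 64, 64
CLASS_NAMES = ['T-shirt', 'Jeans', 'Sneakers', 'Jacket']

def preprocess(image_pil):
    """Resize, scale to 0-1, and add a batch dimension: (1, H, W, 3)."""
    resized = image_pil.convert('RGB').resize((IMG_WIDTH, IMG_HEIGHT))
    scaled = np.array(resized).astype('float32') / 255.0
    return np.expand_dims(scaled, axis=0)

def to_label_scores(probabilities):
    """Map one row of class probabilities to a {class name: score} dict."""
    return {name: float(p) for name, p in zip(CLASS_NAMES, probabilities)}

# Try it on a blank test image and made-up probabilities (no model needed).
batch = preprocess(Image.new('RGB', (300, 200), color='white'))
print(batch.shape)
print(to_label_scores([0.1, 0.2, 0.6, 0.1]))
```

The batch dimension is needed because Keras models predict on batches, even when the batch holds a single image. The call to `convert('RGB')` guards against uploads with an alpha channel or in grayscale.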

What we learned & where to go next

Today we covered

Presentation

Download workshop slides

Google Cloud Credits

Free credits for Google Cloud platform

Resources for further learning

Resource | Description | Link
TensorFlow | Official documentation | tensorflow.org
Keras | Neural networks API | keras.io
Google ML Crash Course | Free course by Google | ML Crash Course
Gradio | ML web applications | gradio.app
Kaggle | Datasets & competitions | kaggle.com
fast.ai | Free DL course | fast.ai