Build with AI Bootcamp
From zero to a working AI web app, classifying fashion products using Convolutional Neural Networks (CNN).
Setup & Environment
We start by setting up Google Colab and enabling the GPU. A GPU matters because it can speed up model training by roughly 10-100x compared to a CPU.
Enable GPU
- In Google Colab go to Runtime → Change runtime type
- In Hardware accelerator select GPU (T4 is free)
- Click Save
GPU Check
# Check if GPU is available
import tensorflow as tf
print("TensorFlow version:", tf.__version__)
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    print(f"GPU available! Found {len(gpus)} device(s)")
else:
    print("WARNING: GPU not found! Go to Runtime → Change runtime type → GPU")
Install & Import Libraries
# Installation
!pip install gradio gdown -q
# Imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import os, zipfile, gdown
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from sklearn.metrics import confusion_matrix, classification_report
from sklearn.model_selection import train_test_split
from PIL import Image
import warnings
warnings.filterwarnings('ignore')
plt.style.use('seaborn-v0_8-whitegrid')
plt.rcParams['figure.figsize'] = [12, 8]
plt.rcParams['font.size'] = 12
print("All libraries loaded successfully!")
Download & Load Dataset
We use Fashion Product Images, real product photos. We work with 4 classes: T-shirt, Jeans, Sneakers, Jacket.
Download
FILE_ID = '1YpceXZvh-8idz-nKDJjUh3-nrlVvhyYY'
print("Downloading Fashion dataset...")
url = f'https://drive.google.com/uc?id={FILE_ID}'
output = 'fashion_dataset.zip'
gdown.download(url, output, quiet=False)
with zipfile.ZipFile(output, 'r') as zip_ref:
    zip_ref.extractall('.')
os.remove(output)
print("Dataset downloaded and extracted!")
Locate & Load
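def pronadji_dataset():
    """Return the folder that contains styles.csv, or None if it is not found."""
    # Try the known folder names first
    for folder in ['fashion_subset gdg', 'fashion_subset', 'fashion_dataset']:
        if os.path.exists(os.path.join(folder, 'styles.csv')):
            return folder
    # Otherwise scan every folder in the current directory
    for item in os.listdir('.'):
        if os.path.isdir(item) and os.path.exists(os.path.join(item, 'styles.csv')):
            return item
    return None
DATASET_PATH = pronadji_dataset()
# Fail early with a clear message instead of a TypeError in os.path.join
if DATASET_PATH is None:
    raise FileNotFoundError("styles.csv not found; check that the dataset was extracted")
STYLES_PATH = os.path.join(DATASET_PATH, 'styles.csv')
IMAGES_PATH = os.path.join(DATASET_PATH, 'images')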
df = pd.read_csv(STYLES_PATH, on_bad_lines='skip')
print(f"Loaded {len(df):,} products")
Select Classes & Load Images
The code is the same as in the Serbian version, since code is language-independent. Copy it from the Serbian page or the notebook.
Preprocessing & Data Splitting
Before training we need to normalize the images and split data into training, validation and test sets.
Pixels have values 0-255. Neural networks work better with small numbers (0.0 - 1.0). What number should we divide by?
X = X.astype('float32') / _____ # <-- FILL IN: what number do we divide by?
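Split the data: 70% training, 15% validation, 15% test. The first call below holds out the test set; the validation set is then taken from the remainder (X_temp, y_temp).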
X_temp, X_test, y_temp, y_test = train_test_split(
    X, y,
    test_size=_____, # <-- FILL IN: what percentage for test?
    random_state=42,
    stratify=_____ # <-- FILL IN: stratify by what? (labels)
)
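A single train_test_split call only produces two parts, so a three-way split takes two calls. The sketch below shows one way to do it. It uses made-up arrays in place of the real X and y; the sample count, the 8x8 image size and the name X_val are assumptions for illustration, not taken from the notebook. Passing the test set's length as an integer test_size keeps the validation and test sets the same size.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Made-up stand-in data: 200 tiny "images" in 4 balanced classes (assumption)
X = np.random.rand(200, 8, 8, 3).astype('float32')
y = np.repeat(np.arange(4), 50)

# Step 1: hold out 15% of everything as the test set
X_temp, X_test, y_temp, y_test = train_test_split(
    X, y, test_size=0.15, random_state=42, stratify=y)

# Step 2: carve a validation set of the same size out of the remainder
X_train, X_val, y_train, y_val = train_test_split(
    X_temp, y_temp, test_size=len(X_test), random_state=42, stratify=y_temp)

print("train / val / test:", len(X_train), len(X_val), len(X_test))
```

Stratifying the second call by y_temp keeps the class proportions in the validation set as well.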
test_size=0.15 means 15%. stratify=y preserves class proportions.
CNN Architecture
A CNN (Convolutional Neural Network) is a type of neural network specialized for images. Our model has 3 convolutional blocks with an increasing number of filters: 32 → 64 → 128.
| Layer | Function |
|---|---|
| Conv2D | Detects patterns (edges, textures, shapes) |
| BatchNormalization | Stabilizes and speeds up training |
| MaxPooling2D | Reduces dimensions, keeps important info |
| Dropout | Prevents overfitting (randomly disables neurons) |
| Dense | Fully connected layer for classification |
The most important part! Fill in the blanks in Block 1 and the last layer. Blocks 2 and 3 are given as reference.
# Block 1: fill in filters, kernel, activation, pooling, dropout
layers.Conv2D(_____, (_____, _____), activation='_____', padding='same', input_shape=INPUT_SHAPE),
layers.MaxPooling2D((_____, _____)),
layers.Dropout(_____),
# Last layer: how many neurons? which activation?
layers.Dense(_____, activation='_____')
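To see how the pieces from the table fit together, here is a rough sketch of a three-block model with 32 → 64 → 128 filters. It is not the workshop's reference solution: the 64x64x3 input shape, 3x3 kernels, ReLU activations, 2x2 pooling, the dropout rates, the Dense(128) layer and the helper name conv_block are all assumptions chosen for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers

INPUT_SHAPE = (64, 64, 3)  # assumption: 64x64 RGB images
NUM_CLASSES = 4            # T-shirt, Jeans, Sneakers, Jacket

def conv_block(filters, dropout_rate=0.25):
    """One convolutional block: detect patterns, stabilize, shrink, regularize."""
    return [
        layers.Conv2D(filters, (3, 3), activation='relu', padding='same'),
        layers.BatchNormalization(),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(dropout_rate),
    ]

model = keras.Sequential(
    [keras.Input(shape=INPUT_SHAPE)]
    + conv_block(32) + conv_block(64) + conv_block(128)
    + [
        layers.Flatten(),
        layers.Dense(128, activation='relu'),
        layers.Dropout(0.5),
        # One output neuron per class; softmax turns scores into probabilities
        layers.Dense(NUM_CLASSES, activation='softmax'),
    ]
)
model.summary()
```

With padding='same' and 2x2 pooling, each block halves the width and height, so a 64x64 input reaches the Flatten layer as an 8x8 grid of 128 feature maps.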
Compilation & Training
Compilation tells the model how to learn: which algorithm, how to measure error, and what to track.
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=_____), # <-- learning rate
    loss='_____', # <-- loss function
    metrics=['_____'] # <-- what to track?
)
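To make "how to measure error" concrete, the NumPy sketch below computes by hand what a sparse categorical cross-entropy loss and an accuracy metric measure. The probabilities and labels are made up, and the function is a simplified illustration rather than Keras's own implementation (which, for example, also guards against taking the log of zero).

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, probs):
    """Mean of -log(probability the model assigned to the correct class)."""
    picked = probs[np.arange(len(y_true)), y_true]
    return float(np.mean(-np.log(picked)))

# Made-up model outputs for two images, 4 classes each (rows sum to 1)
probs = np.array([[0.70, 0.10, 0.10, 0.10],   # fairly confident in class 0
                  [0.25, 0.25, 0.25, 0.25]])  # completely undecided
y_true = np.array([0, 2])                     # integer labels, not one-hot

loss = sparse_categorical_crossentropy(y_true, probs)
accuracy = float(np.mean(np.argmax(probs, axis=1) == y_true))
print(f"loss={loss:.4f}, accuracy={accuracy:.2f}")
```

The confident, correct prediction contributes a small loss and the undecided one a larger loss, which is why the loss keeps giving a useful training signal even when accuracy does not change.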
learning_rate=0.001 | loss='sparse_categorical_crossentropy' | metrics=['accuracy']
Choose the number of epochs and the batch size. EarlyStopping will halt training if the model stops improving.
EPOCHS = _____ # <-- number of epochs (recommended: 25)
BATCH_SIZE = _____ # <-- batch size (recommended: 32)
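The idea behind early stopping can be sketched in plain Python: remember the best validation loss so far and stop once it has not improved for a set number of epochs. The function name and the loss values below are made up for illustration. In Keras the same rule is configured through the keras.callbacks.EarlyStopping callback and its monitor and patience arguments.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the (1-based) epoch at which training would stop."""
    best = float('inf')
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_losses)  # never triggered: all epochs ran

# Made-up validation losses: improvement stalls after epoch 3
val_losses = [0.90, 0.70, 0.60, 0.61, 0.62, 0.63, 0.50]
stopped_at = early_stopping_epoch(val_losses, patience=3)
print("Training stops after epoch", stopped_at)
```

Because of this rule, EPOCHS acts as an upper limit rather than a fixed number of passes over the data.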
Model Evaluation
Time to see how well our model performs on data it has never seen: the test set.
Code is the same as in the Serbian version. Copy from the notebook or the Serbian page.
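For orientation, here is a minimal sketch of the two evaluation tools covered in the workshop, a confusion matrix and a classification report. The labels and predictions are made up and the class order is an assumption. In the real notebook the predictions would presumably come from the trained model, for example np.argmax(model.predict(X_test), axis=1).

```python
import numpy as np
from sklearn.metrics import confusion_matrix, classification_report

CLASS_NAMES = ['T-shirt', 'Jeans', 'Sneakers', 'Jacket']  # assumed order

# Made-up true labels and predictions for 8 test images
y_test = np.array([0, 0, 1, 1, 2, 2, 3, 3])
y_pred = np.array([0, 0, 1, 2, 2, 2, 3, 0])

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Per-class precision, recall and F1 score
print(classification_report(y_test, y_pred, target_names=CLASS_NAMES))
```

Correct predictions sit on the diagonal of the matrix; every off-diagonal cell shows which class was mistaken for which.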
AI Limitations - the model can't say "I don't know"
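A softmax output always sums to 1, so the model assigns every image, even a photo of something that is not clothing, to one of the four classes. The sketch below illustrates this with made-up scores and shows one possible workaround, a confidence threshold. The threshold value, the function names and the class order are assumptions for illustration, not part of the workshop code.

```python
import numpy as np

CLASS_NAMES = ['T-shirt', 'Jeans', 'Sneakers', 'Jacket']  # assumed order

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

def predict_with_threshold(logits, threshold=0.6):
    """Return a class name only if the top probability reaches the threshold."""
    probs = softmax(np.asarray(logits, dtype='float64'))
    best = int(np.argmax(probs))
    if probs[best] < threshold:
        return "not sure", float(probs[best])
    return CLASS_NAMES[best], float(probs[best])

# Made-up raw scores: an unfamiliar image vs. a clear pair of jeans
print(predict_with_threshold([0.2, 0.1, 0.3, 0.15]))
print(predict_with_threshold([0.1, 5.0, 0.2, 0.1]))
```

A threshold is only a partial fix: a model can still be confidently wrong about images unlike anything in its training data.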
Data Augmentation
Augmentation artificially creates new images from existing ones: rotates, shifts, zooms, flips. The model sees "new" images every epoch, which helps generalization.
Fill in the parameters. Be careful: overly aggressive augmentation can hurt results!
datagen = ImageDataGenerator(
    rotation_range=_____, # <-- degrees (recommended: 10)
    width_shift_range=_____, # <-- horizontal shift (recommended: 0.1)
    height_shift_range=_____, # <-- vertical shift (recommended: 0.1)
    zoom_range=_____, # <-- zoom (recommended: 0.1)
    horizontal_flip=_____, # <-- True or False?
    fill_mode='nearest'
)
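To get a feel for what two of these transformations do to the pixels, here is a small NumPy sketch of a horizontal flip and a horizontal shift on a tiny made-up image. The helper names are hypothetical and this is not how ImageDataGenerator is implemented; it only mimics the effect, including filling the exposed edge with the nearest pixel as fill_mode='nearest' does.

```python
import numpy as np

def horizontal_flip(img):
    """Mirror an image of shape (height, width, channels) left to right."""
    return img[:, ::-1, :]

def shift_right(img, pixels):
    """Shift the image right, filling the exposed left edge with the nearest pixel."""
    padded = np.pad(img, ((0, 0), (pixels, 0), (0, 0)), mode='edge')
    return padded[:, :img.shape[1], :]

# Tiny made-up "image": 2 rows, 3 columns, 1 channel
img = np.arange(6).reshape(2, 3, 1)

print(img[:, :, 0])                   # original
print(horizontal_flip(img)[:, :, 0])  # columns reversed
print(shift_right(img, 1)[:, :, 0])   # content moved one pixel to the right
```

The label stays the same after either transformation, which is why a flipped or slightly shifted product photo counts as a "new" training example.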
Deploy with Gradio
Final step: let's build a web app! Gradio lets you create ML interfaces with just a few lines of code.
The function receives an image, resizes it, normalizes it, and passes it to the model. Fill in dimensions and normalization.
slika_resized = slika_pil.resize((_____, _____)) # <-- (IMG_WIDTH, IMG_HEIGHT)
slika_norm = np.array(slika_resized).astype('float32') / _____ # <-- normalization
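The full prediction function might look roughly like the sketch below. The 64x64 image size, the class order, the names preprocess, classify and fake_predict, and the stand-in model are all assumptions for illustration; the real notebook would use the trained model instead.

```python
import numpy as np
from PIL import Image

IMG_WIDTH, IMG_HEIGHT = 64, 64  # assumption: must match the model's input size
CLASS_NAMES = ['T-shirt', 'Jeans', 'Sneakers', 'Jacket']  # assumed order

def preprocess(slika_pil):
    """Resize, normalize and add a batch dimension."""
    # PIL's resize takes (width, height); the array comes out as (height, width, 3)
    slika_resized = slika_pil.convert('RGB').resize((IMG_WIDTH, IMG_HEIGHT))
    slika_norm = np.array(slika_resized).astype('float32') / 255.0
    return np.expand_dims(slika_norm, axis=0)  # shape (1, height, width, 3)

def classify(slika_pil, predict_fn):
    """Return a {class name: probability} dictionary for one image."""
    probs = predict_fn(preprocess(slika_pil))[0]
    return {name: float(p) for name, p in zip(CLASS_NAMES, probs)}

def fake_predict(batch):
    """Stand-in for model.predict: equal probability for every class."""
    return np.full((len(batch), len(CLASS_NAMES)), 0.25)

result = classify(Image.new('RGB', (100, 80), color='white'), fake_predict)
print(result)
```

With a trained model, predict_fn would be model.predict, and a function that returns such a dictionary can be passed to gr.Interface(fn=..., inputs=gr.Image(type="pil"), outputs=gr.Label()) to build the web app.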
What we learned & where to go next
Today we covered
- AI/ML/DL concepts - epoch, batch, loss, accuracy
- Real dataset - Fashion Product Images (3,866 images)
- CNN architecture - Conv2D, BatchNorm, Pooling, Dropout
- Evaluation - Confusion Matrix, Classification Report
- AI limitations - the model can't say "I don't know"
- Data Augmentation - rotation, shift, zoom, flip
- Deploy - Gradio web application
Presentation
Download workshop slides
Google Cloud Credits
Free credits for Google Cloud platform
Resources for further learning
| Resource | Description | Link |
|---|---|---|
| TensorFlow | Official documentation | tensorflow.org |
| Keras | Neural networks API | keras.io |
| Google ML Crash Course | Free course by Google | ML Crash Course |
| Gradio | ML web applications | gradio.app |
| Kaggle | Datasets & competitions | kaggle.com |
| fast.ai | Free DL course | fast.ai |