Module 8: Deploy Your AI – Share It With The World!

πŸ“š On This Page

Module 8: Deploy Your AI – Share It With The World!

Duration: Weeks 15-16
Difficulty: Intermediate
Prerequisites: Modules 1-7 completed

🎯 What You’ll Learn

Turn your trained models into real applications that anyone can use!

πŸ“¦ Part 1: Model Export – Making Models Portable

The Problem

You trained in PyTorch, but want to run on:

  • Phones (Android/iOS)
  • Websites (JavaScript)
  • Other frameworks

Solution: ONNX (Open Neural Network Exchange)

Think of ONNX like a universal translator:

PyTorch Model β†’ ONNX β†’ Run anywhere!

Project: Export to ONNX

```python
import torch

# Your trained PyTorch model
model = YourModel()

# An example input with the shape your model expects
dummy_input = torch.randn(1, 3, 224, 224)

# Export (one line!)
torch.onnx.export(model, dummy_input, "model.onnx")
```

Now run the exported model with ONNX Runtime — often 2-3x faster for CPU inference!

πŸ”’ Part 2: Quantization – Make Models Tiny & Fast

The Magic of Quantization

Before: 32-bit numbers (4 bytes each)
After: 8-bit numbers (1 byte each)

Result:

  • 4x smaller file size
  • 3-4x faster inference
  • Typically <1% accuracy loss
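The core trick is just arithmetic: pick a scale, map each float to the nearest int8, and multiply back when you need the float again. A pure-Python sketch of symmetric int8 quantization (the helper names and sample weights are illustrative, not a library API):

```python
# Symmetric int8 quantization of a small weight tensor (toy illustration)
def quantize(x, scale):
    q = round(x / scale)
    return max(-128, min(127, q))  # clamp to the int8 range

def dequantize(q, scale):
    return q * scale

weights = [0.42, -1.3, 0.07, 0.9]
scale = max(abs(w) for w in weights) / 127  # one scale for the whole tensor

qs = [quantize(w, scale) for w in weights]          # 1 byte each
recovered = [dequantize(q, scale) for q in qs]       # back to floats
errors = [abs(w - r) for w, r in zip(weights, recovered)]
```

Each stored value shrinks from 4 bytes to 1, and the round-trip error is bounded by half the scale — which is why accuracy barely moves for well-behaved weight distributions.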

Types of Quantization

1. Dynamic Quantization (Easiest)
- One line of code
- Instant speedup

2. Static Quantization (Better)
- Need calibration data
- Better accuracy

3. Quantization-Aware Training (Best)
- Train with quantization in mind
- Best accuracy
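Dynamic quantization really is close to one line. A sketch using PyTorch's `quantize_dynamic` on a small stand-in network (the layer sizes are arbitrary — use your own trained model):

```python
import torch
import torch.nn as nn

# A small stand-in network (assumption: substitute your trained model)
model = nn.Sequential(
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
).eval()

# Dynamic quantization: Linear weights become int8,
# activations are quantized on the fly at inference time
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 128)
with torch.no_grad():
    original_out = model(x)
    quantized_out = quantized(x)
```

Comparing `original_out` and `quantized_out` is a quick way to verify the accuracy cost before benchmarking size and speed.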

Project: Quantize Your Model

  • Take your trained classifier
  • Apply dynamic quantization
  • Benchmark: Size, speed, accuracy
  • Goal: 4x smaller, 3x faster!

---

🌐 Part 3: Building a Web API

What's an API?

A way for programs to talk to each other!

Example:

User uploads image β†’ Your API β†’ Model predicts β†’ Return "Cat!"

FastAPI - Easy Python Web Framework

```python
from fastapi import FastAPI, File, UploadFile

app = FastAPI()

@app.post("/predict")
async def predict(image: UploadFile = File(...)):
    contents = await image.read()
    # Load image, run model, return prediction
    return {"class": "cat", "confidence": 0.95}
```

Project: Build Image Classifier API

  • Create FastAPI server
  • Load quantized model
  • Accept image uploads
  • Return predictions
  • Deploy locally

---

🎨 Part 4: Creating a User Interface

Option 1: Gradio (Easiest)

```python
import gradio as gr

def classify(image):
    # Run your model here; placeholder result for the sketch
    return "Cat: 95%"

gr.Interface(fn=classify, inputs="image", outputs="text").launch()
```

A few lines → a working web interface!

Option 2: Custom HTML/JavaScript

  • Build custom website
  • Upload images
  • Display results beautifully

Project: Create Web Interface

  • Use Gradio for quick demo
  • Build custom HTML page
  • Connect to FastAPI backend
  • Style with CSS

---

🐳 Part 5: Docker - Package Everything Together

What's Docker?

Put your entire app in a "container" that runs anywhere!

Analogy: Like a shipping container

  • Pack everything inside
  • Ship anywhere
  • Works the same everywhere

Dockerfile Example

```dockerfile
FROM python:3.9
WORKDIR /app
COPY model.onnx api.py ./
RUN pip install fastapi uvicorn onnxruntime
CMD ["python", "api.py"]
```

Project: Dockerize Your App

  • Write Dockerfile
  • Build container
  • Run locally
  • Share with friends!

---

⚑ ISL Optimization - Fast Inference

1. Batch Inference

Instead of:

  Predict image 1 → 100 ms
  Predict image 2 → 100 ms
  Total: 200 ms

Do:

  Predict [image 1, image 2] together → 120 ms
  Total: 120 ms (~1.7x faster!)
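Why does batching help? Each call pays a fixed overhead (data transfer, kernel launch, framework bookkeeping) plus a per-image cost. A toy cost model — the 80/20 ms split is an assumption chosen to reproduce the numbers above:

```python
OVERHEAD_MS = 80   # fixed cost paid once per call (assumed)
PER_ITEM_MS = 20   # marginal cost per image (assumed)

def call_cost_ms(batch_size):
    # One call: fixed overhead + linear per-image work
    return OVERHEAD_MS + PER_ITEM_MS * batch_size

sequential = 2 * call_cost_ms(1)  # two separate calls
batched = call_cost_ms(2)         # one call, two images
speedup = sequential / batched
```

The fixed overhead is amortized across the batch, so the speedup grows with batch size (up to memory and latency limits).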

2. Caching

If same image uploaded β†’ Return cached result
No need to run model again!
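A minimal sketch of content-based caching: key the cache by a hash of the uploaded bytes so identical uploads skip the model entirely. `run_model` and the counter are illustrative stand-ins:

```python
import hashlib

calls = 0

def run_model(image_bytes: bytes) -> str:
    # Stand-in for expensive inference (assumption: your real model here)
    global calls
    calls += 1
    return "cat"

_cache = {}

def predict_with_cache(image_bytes: bytes) -> str:
    # Key by content hash so byte-identical uploads hit the cache
    key = hashlib.sha256(image_bytes).hexdigest()
    if key not in _cache:
        _cache[key] = run_model(image_bytes)  # expensive call happens once
    return _cache[key]

a = predict_with_cache(b"same image")
b = predict_with_cache(b"same image")  # served from cache, model not re-run
```

In production you would bound the cache (e.g. an LRU policy) so memory stays predictable.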

3. Model Pruning

  • Find weights close to zero
  • Remove them
  • Retrain briefly
  • Result: 50% smaller!
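The steps above boil down to magnitude pruning: zero out every weight whose absolute value falls below a threshold. A toy sketch on a flat list (real pruning operates on tensors, e.g. via `torch.nn.utils.prune`; the sample weights and threshold here are made up):

```python
# Toy magnitude pruning on a flat weight list
weights = [0.8, -0.02, 0.5, 0.001, -0.7, 0.03, 0.9, -0.004]
threshold = 0.05  # weights with |w| below this count as "close to zero"

pruned = [w if abs(w) >= threshold else 0.0 for w in weights]
sparsity = sum(1 for w in pruned if w == 0.0) / len(pruned)
```

With sparse storage formats, the zeroed weights need not be stored at all — that is where the size reduction comes from; a brief retraining pass then recovers most of the lost accuracy.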

4. CPU Optimizations

  • Use AVX2 vector instructions (most modern x86 CPUs support them)
  • Result: 2x faster on same hardware!

---

🌍 Real-World Deployment Options

Free Options

1. Hugging Face Spaces (Free hosting!)
- Upload Gradio app
- Get free URL

2. Google Colab (For demos)
- Run in notebook
- Free GPU!

3. Render / Railway (Free tier)
- Deploy FastAPI app
- Get public URL

---

πŸ“š Resources

  • FastAPI documentation
  • Gradio documentation
  • ONNX Runtime guides
  • Docker tutorials
  • Hugging Face Spaces

---

βœ… Learning Checklist

  • [ ] Export models to ONNX
  • [ ] Quantize models (~4x smaller, ~3x faster)
  • [ ] Build REST API with FastAPI
  • [ ] Create web interface with Gradio
  • [ ] Deploy with Docker
  • [ ] Optimize inference for production

---

πŸŽ‰ Final Project Ideas

1. Image Classifier Website: Upload photo, get classification
2. Text Generator API: Send prompt, receive story
3. Object Detection App: Upload image, see boxes
4. Chatbot Interface: Simple conversational AI
5. Style Transfer Tool: Turn photos into paintings

---

πŸ† Congratulations!

You've completed the AI Engineering Syllabus!

You now know how to:
βœ… Build AI models from scratch
βœ… Train them efficiently on 16GB RAM
βœ… Deploy them as real applications
βœ… Optimize for speed and memory

Keep learning, keep building, and share your projects with the world! πŸš€

πŸ“š References & Further Reading

Dive deeper with these carefully selected resources:

πŸ“ Related Topics

  • β†’
    Model Deployment: Best Practices
  • β†’
    Building REST APIs with FastAPI
  • β†’
    Containerizing ML Models with Docker