Module 2: Your First AI Models – Learning from Data
Duration: Weeks 3-4
Difficulty: Beginner
Prerequisites: Module 1 completed
—
🎯 What You’ll Learn
Remember learning to ride a bike? You fell a few times, adjusted, and got better. That’s exactly how these AI models learn! They make predictions, see their mistakes, and improve.
In this module, you’ll build:
- Linear Regression: Predict numbers (house prices, temperatures)
- Logistic Regression: Make yes/no decisions (spam or not spam?)
- Decision Trees: Ask smart questions to classify things
- K-Means Clustering: Find natural groups in data
—
📚 Core Concepts
1. Linear Regression – Drawing the Best Line
Problem: Predict house prices based on size
How it works: Find the line that best fits the data points
The Math (Simple!):
y = mx + b

Where:
- y = price (what we predict)
- x = size (what we know)
- m = slope (how much price changes per square foot)
- b = intercept (base price)
Visual:

Price
  |        *
  |
  |   *
  |___________
         Size
You’ll code it: Using only basic math, no magic!
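To see the formula in action, here’s a tiny worked example; the slope and intercept values are made up for illustration, not fitted from real data:

m = 150      # slope: dollars per square foot (assumed)
b = 50_000   # intercept: base price in dollars (assumed)

size = 2000                 # square feet
price = m * size + b        # y = mx + b
print(price)                # 350000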
—
2. Logistic Regression – Yes or No Decisions
Problem: Is this email spam or not?
How it works: Calculate probability (0% to 100% chance)
The Sigmoid Function:
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

Output is always between 0 and 1!
- 0.5 = 50% chance
- 0.9 = 90% chance it's spam
Real use: Gmail’s spam filter uses this!
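A quick sanity check, reusing the sigmoid function defined above (the input scores are invented for illustration):

print(sigmoid(0))    # 0.5   -> 50% chance
print(sigmoid(2.2))  # ~0.90 -> 90% chance it's spam
print(sigmoid(-3))   # ~0.05 -> almost certainly not spam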
—
3. Decision Trees – Playing 20 Questions
Problem: Classify animals
How it works: Ask yes/no questions to narrow down
Example Tree:
Is it a mammal?
├── Yes: Does it fly?
│   ├── Yes: Bat
│   └── No: Is it big?
│       ├── Yes: Elephant
│       └── No: Cat
└── No: Does it fly?
    ├── Yes: Bird
    └── No: Fish
Visual: You’ll draw actual tree diagrams!
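The same tree written as plain code; a minimal sketch where the boolean inputs are assumed to be already known:

def classify(is_mammal, can_fly, is_big):
    # Each if/else mirrors one question node in the tree above
    if is_mammal:
        if can_fly:
            return "Bat"
        return "Elephant" if is_big else "Cat"
    return "Bird" if can_fly else "Fish"

print(classify(is_mammal=True, can_fly=False, is_big=True))  # Elephant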
—
4. K-Means Clustering – Grouping Similar Things
Problem: Group customers by shopping habits
How it works:
1. Pick K random centers
2. Assign each point to nearest center
3. Move centers to average of their points
4. Repeat until centers stop moving
Like: Organizing your music into playlists automatically
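A minimal NumPy sketch of those four steps (toy 2-D data; for simplicity it runs a fixed number of iterations instead of checking whether the centers stopped moving, and it assumes no cluster ever ends up empty):

import numpy as np

def kmeans(points, k, iters=10, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: pick K random points as the starting centers
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Step 2: assign each point to its nearest center
        dists = np.linalg.norm(points[:, None] - centers[None, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: move each center to the average of its points
        centers = np.array([points[labels == j].mean(axis=0) for j in range(k)])
    return centers, labels

pts = np.array([[0, 0], [0, 1], [5, 5], [6, 5]], dtype=float)
centers, labels = kmeans(pts, k=2)  # two tight groups -> two centers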
—
🛠️ Hands-on Projects
Project 1: House Price Predictor
Dataset: Real house data (size, bedrooms → price)
Steps:
1. Load data (CSV file)
2. Visualize (scatter plot)
3. Implement gradient descent
4. Train model
5. Make predictions!
Code Skeleton:
import numpy as np
import matplotlib.pyplot as plt

class LinearRegression:
    def __init__(self):
        self.m = 0  # slope
        self.b = 0  # intercept

    def fit(self, X, y, learning_rate=0.01, epochs=1000):
        n = len(X)
        for _ in range(epochs):
            # Predict
            y_pred = self.m * X + self.b
            # Calculate error
            error = y_pred - y
            # Update parameters (gradient descent!)
            self.m -= learning_rate * (2 / n) * np.sum(error * X)
            self.b -= learning_rate * (2 / n) * np.sum(error)

    def predict(self, X):
        return self.m * X + self.b
Use it! (house_sizes and house_prices are the arrays loaded from the CSV in step 1)

model = LinearRegression()
model.fit(house_sizes, house_prices)
predicted_price = model.predict(2000)  # Predict for 2000 sq ft
Expected Result: See your model’s predictions get better over time!
—
Project 2: Spam Email Detector
Dataset: 1000s of emails labeled spam/not spam
Approach:
1. Convert emails to numbers (word counts)
2. Train logistic regression
3. Test on new emails
Features You’ll Extract:
- Number of exclamation marks!!!
- Presence of words like “FREE”, “WIN”, “CLICK”
- Email length
- Number of links
Accuracy Goal: 85%+
Test: On your own emails!
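A minimal sketch of the feature extraction step; the helper name, trigger-word list, and sample email are all made up for illustration:

import re

TRIGGER_WORDS = {"free", "win", "click"}  # assumed spammy words

def extract_features(email_text):
    words = re.findall(r"[a-z]+", email_text.lower())
    return [
        email_text.count("!"),                   # number of exclamation marks
        sum(w in TRIGGER_WORDS for w in words),  # trigger-word hits
        len(email_text),                         # email length
        email_text.lower().count("http"),        # rough link count
    ]

print(extract_features("WIN a FREE prize!!! Click http://spam.example"))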
—
Project 3: Handwritten Digit Recognizer
Dataset: MNIST (70,000 images of digits 0-9)
Your First Image AI!
Approach:
- Each image is 28×28 pixels = 784 numbers
- Use decision tree to classify
- Visualize which pixels matter most
Cool Factor: Draw a digit in Paint, your AI guesses it!
Code to Load MNIST:
from sklearn.datasets import fetch_openml

# Load data
mnist = fetch_openml('mnist_784', version=1)
X, y = mnist.data, mnist.target

# X shape: (70000, 784) - 70k images, 784 pixels each
# y shape: (70000,) - labels 0-9
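From there, a minimal sketch of the decision-tree approach; the test-set size and tree depth are arbitrary starting points, not tuned values:

from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=10_000, random_state=0)

tree = DecisionTreeClassifier(max_depth=12, random_state=0)
tree.fit(X_train, y_train)
print(tree.score(X_test, y_test))  # fraction of test digits classified correctly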
—
📐 The Math Behind It
Gradient Descent – How AI “Learns”
Think of it like finding the lowest point in a valley while blindfolded:
Steps:
1. Check which direction is downhill (calculate gradient)
2. Take a small step that direction (update weights)
3. Repeat until you reach the bottom (minimum error)
Visual:

Error
  | *
  |  \          /
  |   \        /
  |    \______/   ← We want to get here!
  |________________
         Weights
The Formula:
new_weight = old_weight - learning_rate * gradient

- learning_rate: How big are our steps? (usually 0.01)
- gradient: Which direction to go?
We’ll visualize this with animations so you SEE the learning happen!
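Here's the whole loop on a toy error curve, error(w) = (w - 3)**2, whose gradient is 2 * (w - 3); the starting point and learning rate are arbitrary choices:

w = 0.0              # start somewhere random
learning_rate = 0.1  # step size

for step in range(50):
    gradient = 2 * (w - 3)            # which way is downhill?
    w = w - learning_rate * gradient  # take a small step that way

print(w)  # ~3.0 - the bottom of the valley, where error is smallest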
—
⚡ ISL Optimization – Working Smart
Problem: Dataset has 1 million rows, won’t fit in RAM
Solution: Mini-batch learning (process 100 rows at a time)
Code:
batch_size = 100

for epoch in range(num_epochs):
    for i in range(0, len(X), batch_size):
        # Slice out the next 100 rows
        X_batch = X[i:i + batch_size]
        y_batch = y[i:i + batch_size]
        # Train on this batch only, then move on
        model.update(X_batch, y_batch)
Result: Same accuracy, 10x less memory!
—
Other Tricks You’ll Learn
1. Streaming Data (process one piece at a time)

for row in data_stream:
    model.partial_fit(row)  # Update incrementally
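Scikit-learn's SGD models support this style through partial_fit; a minimal sketch, assuming data_stream is some iterable of (features, label) pairs:

import numpy as np
from sklearn.linear_model import SGDClassifier

model = SGDClassifier()
classes = np.array([0, 1])  # partial_fit needs the full set of labels up front

for features, label in data_stream:  # data_stream is assumed, not defined here
    model.partial_fit([features], [label], classes=classes)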
2. Sparse Matrices (Only store non-zero numbers)
from scipy.sparse import csr_matrix

# Dense:  [0, 0, 5, 0, 0, 3, 0, 0] - stores 8 numbers
# Sparse: {2: 5, 5: 3}             - stores only the 2 non-zeros!
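The same row as an actual csr_matrix, just to show the API:

import numpy as np
from scipy.sparse import csr_matrix

dense = np.array([[0, 0, 5, 0, 0, 3, 0, 0]])
sparse = csr_matrix(dense)

print(sparse.nnz)        # 2 - only the non-zero values are stored
print(sparse.toarray())  # converts back to the full dense row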
3. Memory-Mapped Files (Use hard drive like RAM)
import numpy as np

# Open a huge file without loading it into RAM; chunks are read on demand
data = np.memmap('huge_data.npy', dtype='float32', mode='r')
—
📊 Free Datasets You’ll Use
- UCI Machine Learning Repository: 100+ datasets
- Kaggle: competitions with real data
  - Titanic Survival
  - House Prices
- MNIST Digits: classic beginner dataset
  - 70,000 handwritten digits
  - Perfect for learning!
—
🎓 By the End, You’ll Understand

- ✅ Why AI needs lots of data (more examples = better learning)
- ✅ How to measure if your model is good (accuracy, precision, recall)
- ✅ Why simple models often work better than complex ones
- ✅ How to make models work on your laptop, not just supercomputers
- ✅ The math behind gradient descent (and why it works!)
—
📚 Resources
Reading
- Scikit-learn Documentation (for reference)
- Machine Learning Crash Course
Practice
- Kaggle Learn – Free courses
- UCI ML Repository – Datasets
—
✅ Learning Checklist
- [ ] Implement linear regression from scratch
- [ ] Understand gradient descent visually
- [ ] Build logistic regression for classification
- [ ] Create decision tree classifier
- [ ] Implement K-means clustering
- [ ] Use mini-batch learning for large datasets
- [ ] Evaluate models with accuracy/precision metrics (see the sketch after this list)
- [ ] Visualize model predictions
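For the evaluation item, a minimal sketch using scikit-learn's metric helpers on made-up labels and predictions:

from sklearn.metrics import accuracy_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1]  # made-up labels: 1 = spam, 0 = not spam
y_pred = [1, 0, 1, 0, 0, 1]  # made-up model predictions

print(accuracy_score(y_true, y_pred))   # fraction of all predictions that are right
print(precision_score(y_true, y_pred))  # of the emails flagged spam, how many really were?
print(recall_score(y_true, y_pred))     # of the real spam, how much did we catch?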
—
🎯 Quiz Yourself
1. What’s the difference between regression and classification?
2. How does gradient descent find the minimum?
3. Why do we use mini-batches instead of full dataset?
4. What does the sigmoid function do?
5. How does a decision tree make decisions?
—
🚀 Next Steps
Ready for neural networks? Move on to:
Module 3: Neural Networks – Teaching Computers to Think
You’ll build your own neural network library from scratch!
📖 References & Further Reading

Dive deeper with these carefully selected resources:

- Scikit-learn Documentation, by the Scikit-learn Team
- Introduction to Statistical Learning, by James, Witten, Hastie, and Tibshirani
- Machine Learning Course, by Andrew Ng
🔗 Related Topics

- Linear Regression: The Foundation of ML
- Decision Trees vs Random Forests
- K-Means Clustering Explained