Module 2: Your First AI Models – Learning from Data
Duration: Weeks 3-4
Difficulty: Beginner
Prerequisites: Module 1 completed
—
🎯 What You’ll Learn
Remember learning to ride a bike? You fell a few times, adjusted, and got better. That’s exactly how these AI models learn! They make predictions, see their mistakes, and improve.
In this module, you’ll build:
- Linear Regression: Predict numbers (house prices, temperatures)
- Logistic Regression: Make yes/no decisions (spam or not spam?)
- Decision Trees: Ask smart questions to classify things
- K-Means Clustering: Find natural groups in data
—
📚 Core Concepts
1. Linear Regression – Drawing the Best Line
Problem: Predict house prices based on size
How it works: Find the line that best fits the data points
The Math (Simple!):
y = mx + b

Where:
- y = price (what we predict)
- x = size (what we know)
- m = slope (how much price changes per square foot)
- b = intercept (base price)
Visual:

Price
  |        *
  |
  |   *
  |___________
         Size
You’ll code it: Using only basic math, no magic!
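To see the formula in action, here’s a tiny worked example; the slope and intercept values are made up for illustration, not fitted from real data:

m = 150      # slope: dollars per square foot (assumed)
b = 50_000   # intercept: base price in dollars (assumed)

size = 2000                 # square feet
price = m * size + b        # y = mx + b
print(price)                # 350000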
—
2. Logistic Regression – Yes or No Decisions
Problem: Is this email spam or not?
How it works: Calculate probability (0% to 100% chance)
The Sigmoid Function:
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

Output is always between 0 and 1!
- 0.5 = 50% chance
- 0.9 = 90% chance it's spam
Real use: Gmail’s spam filter uses this!
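A quick sanity check, reusing the sigmoid function defined above (the input scores are invented for illustration):

print(sigmoid(0))    # 0.5   -> 50% chance
print(sigmoid(2.2))  # ~0.90 -> 90% chance it's spam
print(sigmoid(-3))   # ~0.05 -> almost certainly not spam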
—
3. Decision Trees – Playing 20 Questions
Problem: Classify animals
How it works: Ask yes/no questions to narrow down
Example Tree:
Is it a mammal?
├── Yes: Does it fly?
│   ├── Yes: Bat
│   └── No: Is it big?
│       ├── Yes: Elephant
│       └── No: Cat
└── No: Does it fly?
    ├── Yes: Bird
    └── No: Fish
Visual: You’ll draw actual tree diagrams!
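The same tree written as plain code; a minimal sketch where the boolean inputs are assumed to be already known:

def classify(is_mammal, can_fly, is_big):
    # Each if/else mirrors one question node in the tree above
    if is_mammal:
        if can_fly:
            return "Bat"
        return "Elephant" if is_big else "Cat"
    return "Bird" if can_fly else "Fish"

print(classify(is_mammal=True, can_fly=False, is_big=True))  # Elephant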
—
4. K-Means Clustering – Grouping Similar Things
Problem: Group customers by shopping habits
How it works:
1. Pick K random centers
2. Assign each point to nearest center
3. Move centers to average of their points
4. Repeat until centers stop moving
Like: Organizing your music into playlists automatically
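A minimal NumPy sketch of those four steps (toy 2-D data; for simplicity it runs a fixed number of iterations instead of checking whether the centers stopped moving, and it assumes no cluster ever ends up empty):

import numpy as np

def kmeans(points, k, iters=10, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: pick K random points as the starting centers
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Step 2: assign each point to its nearest center
        dists = np.linalg.norm(points[:, None] - centers[None, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: move each center to the average of its points
        centers = np.array([points[labels == j].mean(axis=0) for j in range(k)])
    return centers, labels

pts = np.array([[0, 0], [0, 1], [5, 5], [6, 5]], dtype=float)
centers, labels = kmeans(pts, k=2)  # two tight groups -> two centers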
—
🛠️ Hands-on Projects
Project 1: House Price Predictor
Dataset: Real house data (size, bedrooms → price)
Steps:
1. Load data (CSV file)
2. Visualize (scatter plot)
3. Implement gradient descent
4. Train model
5. Make predictions!
Code Skeleton:
import numpy as np
import matplotlib.pyplot as plt

class LinearRegression:
    def __init__(self):
        self.m = 0  # slope
        self.b = 0  # intercept

    def fit(self, X, y, learning_rate=0.01, epochs=1000):
        n = len(X)
        for _ in range(epochs):
            # Predict
            y_pred = self.m * X + self.b
            # Calculate error
            error = y_pred - y
            # Update parameters (gradient descent!)
            self.m -= learning_rate * (2 / n) * np.sum(error * X)
            self.b -= learning_rate * (2 / n) * np.sum(error)

    def predict(self, X):
        return self.m * X + self.b
Use it! (house_sizes and house_prices are the arrays loaded from the CSV in step 1)

model = LinearRegression()
model.fit(house_sizes, house_prices)
predicted_price = model.predict(2000)  # Predict for 2000 sq ft
Expected Result: See your model’s predictions get better over time!
—
Project 2: Spam Email Detector
Dataset: 1000s of emails labeled spam/not spam
Approach:
1. Convert emails to numbers (word counts)
2. Train logistic regression
3. Test on new emails
Features You’ll Extract:
- Number of exclamation marks!!!
- Presence of words like “FREE”, “WIN”, “CLICK”
- Email length
- Number of links
Accuracy Goal: 85%+
Test: On your own emails!
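A minimal sketch of the feature extraction step; the helper name, trigger-word list, and sample email are all made up for illustration:

import re

TRIGGER_WORDS = {"free", "win", "click"}  # assumed spammy words

def extract_features(email_text):
    words = re.findall(r"[a-z]+", email_text.lower())
    return [
        email_text.count("!"),                   # number of exclamation marks
        sum(w in TRIGGER_WORDS for w in words),  # trigger-word hits
        len(email_text),                         # email length
        email_text.lower().count("http"),        # rough link count
    ]

print(extract_features("WIN a FREE prize!!! Click http://spam.example"))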
—
Project 3: Handwritten Digit Recognizer
Dataset: MNIST (70,000 images of digits 0-9)
Your First Image AI!
Approach:
- Each image is 28×28 pixels = 784 numbers
- Use decision tree to classify
- Visualize which pixels matter most
Cool Factor: Draw a digit in Paint, your AI guesses it!
Code to Load MNIST:
from sklearn.datasets import fetch_openml

# Load data
mnist = fetch_openml('mnist_784', version=1)
X, y = mnist.data, mnist.target

# X shape: (70000, 784) - 70k images, 784 pixels each
# y shape: (70000,) - labels 0-9
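From there, a minimal sketch of the decision-tree approach; the test-set size and tree depth are arbitrary starting points, not tuned values:

from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=10_000, random_state=0)

tree = DecisionTreeClassifier(max_depth=12, random_state=0)
tree.fit(X_train, y_train)
print(tree.score(X_test, y_test))  # fraction of test digits classified correctly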
—
📐 The Math Behind It
Gradient Descent – How AI “Learns”
Think of it like finding the lowest point in a valley while blindfolded:
Steps:
1. Check which direction is downhill (calculate gradient)
2. Take a small step that direction (update weights)
3. Repeat until you reach the bottom (minimum error)
Visual:

Error
  | *
  |  \          /
  |   \        /
  |    \______/   ← We want to get here!
  |________________
         Weights
The Formula:
new_weight = old_weight - learning_rate * gradient

- learning_rate: How big are our steps? (usually 0.01)
- gradient: Which direction to go?
We’ll visualize this with animations so you SEE the learning happen!
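Here's the whole loop on a toy error curve, error(w) = (w - 3)**2, whose gradient is 2 * (w - 3); the starting point and learning rate are arbitrary choices:

w = 0.0              # start somewhere random
learning_rate = 0.1  # step size

for step in range(50):
    gradient = 2 * (w - 3)            # which way is downhill?
    w = w - learning_rate * gradient  # take a small step that way

print(w)  # ~3.0 - the bottom of the valley, where error is smallest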
—
⚡ ISL Optimization – Working Smart
Problem: Dataset has 1 million rows, won’t fit in RAM
Solution: Mini-batch learning (process 100 rows at a time)
Code:
batch_size = 100

for epoch in range(num_epochs):
    for i in range(0, len(X), batch_size):
        # Slice out the next 100 rows
        X_batch = X[i:i + batch_size]
        y_batch = y[i:i + batch_size]
        # Train on this batch only, then move on
        model.update(X_batch, y_batch)
Result: Same accuracy, 10x less memory!
—
Other Tricks You’ll Learn
1. Streaming Data (process one piece at a time)

for row in data_stream:
    model.partial_fit(row)  # Update incrementally
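Scikit-learn's SGD models support this style through partial_fit; a minimal sketch, assuming data_stream is some iterable of (features, label) pairs:

import numpy as np
from sklearn.linear_model import SGDClassifier

model = SGDClassifier()
classes = np.array([0, 1])  # partial_fit needs the full set of labels up front

for features, label in data_stream:  # data_stream is assumed, not defined here
    model.partial_fit([features], [label], classes=classes)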
2. Sparse Matrices (Only store non-zero numbers)
from scipy.sparse import csr_matrix

# Dense:  [0, 0, 5, 0, 0, 3, 0, 0] - stores 8 numbers
# Sparse: {2: 5, 5: 3}             - stores only the 2 non-zeros!
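The same row as an actual csr_matrix, just to show the API:

import numpy as np
from scipy.sparse import csr_matrix

dense = np.array([[0, 0, 5, 0, 0, 3, 0, 0]])
sparse = csr_matrix(dense)

print(sparse.nnz)        # 2 - only the non-zero values are stored
print(sparse.toarray())  # converts back to the full dense row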
3. Memory-Mapped Files (Use hard drive like RAM)
import numpy as np

# Open a huge file without loading it into RAM; chunks are read on demand
data = np.memmap('huge_data.npy', dtype='float32', mode='r')
—
📊 Free Datasets You’ll Use
- UCI Machine Learning Repository: 100+ datasets
- Kaggle: competitions with real data
  - Titanic Survival
  - House Prices
- MNIST Digits: classic beginner dataset
  - 70,000 handwritten digits
  - Perfect for learning!
—
🎓 By the End, You’ll Understand

- ✅ Why AI needs lots of data (more examples = better learning)
- ✅ How to measure if your model is good (accuracy, precision, recall)
- ✅ Why simple models often work better than complex ones
- ✅ How to make models work on your laptop, not just supercomputers
- ✅ The math behind gradient descent (and why it works!)
—
📚 Resources
Reading
- Scikit-learn Documentation (for reference)
- Machine Learning Crash Course
Practice
- Kaggle Learn – Free courses
- UCI ML Repository – Datasets
—
✅ Learning Checklist
- [ ] Implement linear regression from scratch
- [ ] Understand gradient descent visually
- [ ] Build logistic regression for classification
- [ ] Create decision tree classifier
- [ ] Implement K-means clustering
- [ ] Use mini-batch learning for large datasets
- [ ] Evaluate models with accuracy/precision metrics (see the sketch after this list)
- [ ] Visualize model predictions
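For the evaluation item, a minimal sketch using scikit-learn's metric helpers on made-up labels and predictions:

from sklearn.metrics import accuracy_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1]  # made-up labels: 1 = spam, 0 = not spam
y_pred = [1, 0, 1, 0, 0, 1]  # made-up model predictions

print(accuracy_score(y_true, y_pred))   # fraction of all predictions that are right
print(precision_score(y_true, y_pred))  # of the emails flagged spam, how many really were?
print(recall_score(y_true, y_pred))     # of the real spam, how much did we catch?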
—
🎯 Quiz Yourself
1. What’s the difference between regression and classification?
2. How does gradient descent find the minimum?
3. Why do we use mini-batches instead of full dataset?
4. What does the sigmoid function do?
5. How does a decision tree make decisions?
—
🚀 Next Steps
Ready for neural networks? Move on to:
Module 3: Neural Networks – Teaching Computers to Think
You’ll build your own neural network library from scratch!
📖 References & Further Reading

Dive deeper with these carefully selected resources:

- Scikit-learn Documentation, by the Scikit-learn Team
- Introduction to Statistical Learning, by James, Witten, Hastie, and Tibshirani
- Machine Learning Course, by Andrew Ng
🔗 Related Topics

- Linear Regression: The Foundation of ML
- Decision Trees vs Random Forests
- K-Means Clustering Explained