AI Engineering From Scratch: 435 Lessons Built Without Frameworks

Rohit G. wrote 435 AI engineering lessons that teach algorithms by first implementing them from scratch with NumPy and stdlib, then comparing to production frameworks. The free, MIT-licensed curriculum covers tokenizers, attention, gradient descent, and multi-agent systems, aiming to bridge the gap between using AI tools and understanding them.

3 min read · May 25, 2026

The 18% Problem

A survey of CS students found that 84% use AI tools daily, but only 18% feel ready to ship with them at work. The gap isn't access—it's that most courses teach either pure math without code or framework-only usage without understanding. Rohit G. wrote 435 lessons to fill that middle ground.

The Rule: Build It, Then Use It

Every algorithm gets two implementations:

Build It: Implement with NumPy and stdlib. No frameworks. Code is slow but short enough to read in one sitting. You can print anything.
Use It: Same algorithm with production tools (PyTorch, sklearn, tiktoken). Diff the outputs. The framework stops being a black box.
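As a rough illustration of the build-then-diff rule, here is a minimal sketch using softmax. It is not taken from the curriculum: the function names are hypothetical, and NumPy stands in for the "production tool" only so that the example runs without PyTorch installed.

```python
import math
import numpy as np

def softmax_stdlib(xs):
    """Build It: softmax over a list of floats, standard library only."""
    m = max(xs)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_numpy(x):
    """Use It: the same computation, vectorized (stand-in for a framework call)."""
    z = np.exp(x - np.max(x))
    return z / z.sum()

logits = [2.0, 1.0, 0.1, -3.0]
built = softmax_stdlib(logits)
used = softmax_numpy(np.array(logits))

# Diff the outputs: report the largest element-wise gap between the two versions.
max_gap = float(np.max(np.abs(np.array(built) - used)))
print(f"max abs difference: {max_gap:.2e}")
assert np.allclose(built, used)
```

The hand-built version is slower, but every intermediate value can be printed, which is the point of the first pass.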

Concrete Example: Attention in 30 Lines

```python
import numpy as np

def attention(Q, K, V, mask=None):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d_k)
    if mask is not None:
        # Positions where mask is False get a large negative score,
        # so they receive ~0 weight after the softmax.
        scores = np.where(mask, scores, -1e9)
    # Numerically stable softmax over the key axis
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

# Toy run: 4 tokens, 8-dim
rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out = attention(Q, K, V)
print(out.shape)  # (4, 8)
```

The second half runs the same example through torch.nn.MultiheadAttention and checks that the two outputs agree to within floating-point tolerance. PyTorch becomes "your code, compiled for CUDA."
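The mask argument in the attention function is not exercised by the toy run. The sketch below shows one way it might be used, with a causal (lower-triangular) mask. The mask construction and the two checks are illustrative assumptions, not part of the lesson; the attention function is repeated so the block runs on its own.

```python
import numpy as np

def attention(Q, K, V, mask=None):
    # Same scaled dot-product attention as in the listing above.
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d_k)
    if mask is not None:
        scores = np.where(mask, scores, -1e9)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))

# Lower-triangular boolean mask: token i may attend to tokens 0..i only.
causal = np.tril(np.ones((4, 4), dtype=bool))
out = attention(Q, K, V, mask=causal)

# Token 0 can only see itself, so its output row must equal V[0].
assert np.allclose(out[0], V[0])
# The last token sees every position, so its row matches the unmasked result.
assert np.allclose(out[-1], attention(Q, K, V)[-1])
print(out.shape)
```

Checks like these are cheap to write against the hand-built version and carry over unchanged when the same inputs are fed to a framework implementation.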

Gradient Descent Without the Framework

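```python
import numpy as np

def fit(X, y, lr=0.01, steps=1000):
    # Linear regression trained by batch gradient descent on mean squared error
    w = np.zeros(X.shape[1])
    b = 0.0
    n = len(X)
    for _ in range(steps):
        pred = X @ w + b
        err = pred - y                # residuals
        w -= lr * (X.T @ err) / n     # gradient step for the weights
        b -= lr * err.sum() / n       # gradient step for the bias
    return w, b
```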

Six lines of math, three of which are the gradient. When re-run through scikit-learn or PyTorch's optim.SGD, the loss curves overlay. No more black boxes.
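A sketch of what that comparison might look like, under stated assumptions: np.linalg.lstsq is swapped in as the reference solver in place of scikit-learn or PyTorch so the block needs only NumPy, the data is synthetic, and fit_with_history is a hypothetical variant of fit that also records the loss at each step.

```python
import numpy as np

def fit_with_history(X, y, lr=0.1, steps=2000):
    # Same update rule as fit(), plus a record of the loss at every step.
    w = np.zeros(X.shape[1])
    b = 0.0
    n = len(X)
    losses = []
    for _ in range(steps):
        err = X @ w + b - y
        losses.append(float((err ** 2).mean() / 2))  # half mean squared error
        w -= lr * (X.T @ err) / n
        b -= lr * err.sum() / n
    return w, b, losses

# Synthetic data: known weights and bias plus a little noise (illustrative only).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.7 + 0.01 * rng.standard_normal(200)

w, b, losses = fit_with_history(X, y)

# Reference: closed-form least squares, with a column of ones for the bias.
A = np.column_stack([X, np.ones(len(X))])
ref, *_ = np.linalg.lstsq(A, y, rcond=None)

gap = float(np.max(np.abs(np.append(w, b) - ref)))
print(f"max parameter gap vs. least squares: {gap:.2e}")
assert np.allclose(np.append(w, b), ref, atol=1e-6)
```

The losses list is what would be plotted against a framework's training curve; here it is only used to confirm that the loss goes down.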

The Curriculum Structure

The 435 lessons span 20 phases, from math foundations to multi-agent systems. By Phase 10, you write a small LLM; by Phase 14, a working agent loop; by Phase 19, a multi-agent system. Every layer has a hand-built version under the framework version.

Four Reusable Artifacts

Each lesson produces one of:

A prompt template for a specific task
A skill spec that drops into Claude, Cursor, or Codex
An agent definition with a clear job
An MCP server that exposes the lesson's code as a tool

By Phase 19, you have hundreds of these—a toolbox for real tasks.

How to Start

Three ways, ordered by friction:

Read in browser: aiengineeringfromscratch.com — no setup.

Clone and run:

```
git clone https://github.com/rohitg00/ai-engineering-from-scratch.git
cd ai-engineering-from-scratch
python phases/01-math-foundations/01-linear-algebra-intuition/code/vectors.py
```

Install skills into your agent:
```
npx skills add rohitg00/ai-engineering-from-scratch
```
Then run /find-your-level inside the agent. Ten questions, the agent picks a phase and gives an hour estimate.

What This Is Not

Not video lectures. Not copy-paste deploys. Not five-minute explainers. The lessons are dense, the math is real, and everything runs on a laptop. Expect backprop in Python, attention in TypeScript, a toy GPU kernel in Rust, and a Bayesian sampler in Julia. If you only want to call an API, this will feel slow. If you want to know why the API works, this is the route.

Closing Argument

The curriculum is free and MIT-licensed. The rule—build the small version first, then use the framework—held the curriculum together for eighteen months without contradiction. If it helps, star the repo so others find it sooner. If something is missing, open an issue.

Editor's Take

I've spent years debugging production AI pipelines where the issue was a subtle framework behavior no one on the team fully understood. This "build it, then use it" approach would have saved us weeks. I particularly like that the lessons produce reusable artifacts like prompt templates and agent definitions, not just code snippets. My only concern is whether the curriculum can keep up with rapidly changing frameworks, but the foundational knowledge should age well.

— DevDigest Editorial

Key Takeaways

• Implement algorithms from scratch with NumPy/stdlib before using production frameworks to understand the internals.
• Diff the outputs of your hand-built version against the framework version to demystify black boxes.
• Use the curriculum's skill specs to quickly integrate lessons into Claude, Cursor, or Codex for everyday tasks.

Why It Matters

Most developers use AI tools without understanding the internals, leading to debugging nightmares when things break. This curriculum offers a practical middle path: implement the algorithm from scratch, then compare to production libraries. It's a free, structured way to move from user to builder.

#machine-learning #open-source #education #deep-learning #ai-engineering
