The 18% Problem
A survey of CS students found that 84% use AI tools daily, but only 18% feel ready to ship with them at work. The gap isn't access—it's that most courses teach either pure math without code or framework-only usage without understanding. Rohit G. wrote 435 lessons to fill that middle ground.
The Rule: Build It, Then Use It
Every algorithm gets two implementations:
- Build It: Implement with NumPy and stdlib. No frameworks. Code is slow but short enough to read in one sitting. You can print anything.
- Use It: Same algorithm with production tools (PyTorch, scikit-learn, tiktoken). Diff the outputs. The framework stops being a black box.
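As a minimal sketch of the build-then-diff pattern, the example below writes population standard deviation by hand and then checks it against the standard library. The function name `build_it_stdev` and the sample data are hypothetical and are not taken from the curriculum; the point is only the shape of the workflow.

```python
import math
import statistics


def build_it_stdev(xs):
    """Population standard deviation, written out step by step."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return math.sqrt(var)


# Hypothetical sample data, chosen so the arithmetic is easy to follow.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

hand = build_it_stdev(data)     # "Build It"
lib = statistics.pstdev(data)   # "Use It"

# Diff the outputs: the two values should agree to floating-point tolerance.
print(hand, lib)
print(math.isclose(hand, lib, rel_tol=1e-12))
```

The same three steps (hand-built version, library version, comparison) carry over when the library is PyTorch or scikit-learn instead of `statistics`.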
Concrete Example: Attention in 30 Lines
import numpy as np

def attention(Q, K, V, mask=None):
    d_k = Q.shape[-1]
    # Scaled dot-product scores between every query and every key
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d_k)
    if mask is not None:
        # mask is True where attention is allowed
        scores = np.where(mask, scores, -1e9)
    # Numerically stable softmax over the key axis
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

# Toy run: 4 tokens, 8-dim
rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out = attention(Q, K, V)
print(out.shape)  # (4, 8)
The second half runs the same example through torch.nn.MultiheadAttention and checks numerical precision. PyTorch becomes "your code, compiled for CUDA."
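The toy run above never exercises the `mask` argument. The sketch below, which repeats the function so it runs on its own, applies a causal mask under the assumption that `True` means "allowed to attend", which is how `np.where(mask, scores, -1e9)` reads it. The variable name `causal` is illustrative.

```python
import numpy as np


def attention(Q, K, V, mask=None):
    # Same scaled dot-product attention as in the listing above.
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d_k)
    if mask is not None:
        scores = np.where(mask, scores, -1e9)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V


rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))

# Lower-triangular boolean mask: query i may attend to keys 0..i only.
causal = np.tril(np.ones((4, 4), dtype=bool))
out = attention(Q, K, V, mask=causal)

# Token 0 can only attend to itself, so its output should match V[0].
print(out.shape)
print(np.allclose(out[0], V[0]))
```

When comparing against PyTorch, check the mask convention first: for boolean masks, `torch.nn.MultiheadAttention` treats `True` in `attn_mask` as "not allowed to attend", the opposite of the convention used here.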
Gradient Descent Without the Framework
import numpy as np

def fit(X, y, lr=0.01, steps=1000):
    w = np.zeros(X.shape[1])
    b = 0.0
    n = len(X)
    for _ in range(steps):
        pred = X @ w + b
        err = pred - y
        # Gradient step on the weights and the bias
        w -= lr * (X.T @ err) / n
        b -= lr * err.sum() / n
    return w, b
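Six lines of math, three of which are the gradient. The updates to w and b are the gradient of the half mean squared error, (1/2n) Σ (pred - y)^2, scaled by lr. When re-run through scikit-learn or PyTorch's optim.SGD, the loss curves overlay. No more black boxes.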
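One way to check the hand-built version without any framework is to compare it against the closed-form least-squares solution. The sketch below assumes synthetic data and a hypothetical `true_w`. It also picks `lr=0.1` and `steps=2000` so that gradient descent has time to converge, and it uses `np.linalg.lstsq` as the reference in place of scikit-learn.

```python
import numpy as np


def fit(X, y, lr=0.01, steps=1000):
    # Same batch gradient descent as in the listing above.
    w = np.zeros(X.shape[1])
    b = 0.0
    n = len(X)
    for _ in range(steps):
        err = X @ w + b - y
        w -= lr * (X.T @ err) / n
        b -= lr * err.sum() / n
    return w, b


# Synthetic regression problem (hypothetical coefficients and noise level).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 3.0 + 0.01 * rng.standard_normal(200)

w, b = fit(X, y, lr=0.1, steps=2000)

# Reference: closed-form least squares with a column of ones for the bias.
A = np.column_stack([X, np.ones(len(X))])
sol, *_ = np.linalg.lstsq(A, y, rcond=None)

print(np.round(w, 3), round(b, 3))
print(np.allclose(np.append(w, b), sol, atol=1e-6))
```

With the default `lr=0.01` and `steps=1000`, the fit may not yet have reached the closed-form answer. Watching that gap shrink as the step count grows is a cheap way to see what the learning rate does.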
The Curriculum Structure
The 435 lessons span 20 phases, from math foundations to multi-agent systems. By Phase 10, you write a small LLM; by Phase 14, a working agent loop; by Phase 19, a multi-agent system. Every layer has a hand-built version under the framework version.
Four Reusable Artifacts
Each lesson produces one of:
- A prompt template for a specific task
- A skill spec that drops into Claude, Cursor, or Codex
- An agent definition with a clear job
- An MCP server that exposes the lesson's code as a tool
By Phase 19, you have hundreds of these—a toolbox for real tasks.
How to Start
Three ways, ordered by friction:
- Read in browser: aiengineeringfromscratch.com — no setup.
- Clone and run:
  git clone https://github.com/rohitg00/ai-engineering-from-scratch.git
  cd ai-engineering-from-scratch
  python phases/01-math-foundations/01-linear-algebra-intuition/code/vectors.py
- Install skills into your agent:
  Then run npx skills add rohitg00/ai-engineering-from-scratch/find-your-level inside the agent. Ten questions, then the agent picks a phase and gives an hour estimate.
What This Is Not
Not video lectures. Not copy-paste deploys. Not five-minute explainers. The lessons are dense, the math is real, and everything runs on a laptop. Backprop in Python, attention in TypeScript, a toy GPU kernel in Rust, a Bayesian sampler in Julia. If you only want to call an API, this will feel slow. If you want to know why the API works, this is the route.
Closing Argument
The curriculum is free and MIT-licensed. The rule—build the small version first, then use the framework—held the curriculum together for eighteen months without contradiction. If it helps, star the repo so others find it sooner. If something is missing, open an issue.






