The 18% Problem

A survey of CS students found that 84% use AI tools daily, but only 18% feel ready to ship with them at work. The gap isn't access. Most courses teach either pure math without code or framework-only usage without understanding. Rohit G. wrote 435 lessons to fill that middle ground.

The Rule: Build It, Then Use It

Every algorithm gets two implementations:

  1. Build It: Implement with NumPy and the standard library. No frameworks. The code is slow but short enough to read in one sitting, and you can print any intermediate value.
  2. Use It: Run the same algorithm with production tools (PyTorch, scikit-learn, tiktoken). Diff the outputs. The framework stops being a black box.
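
The pattern can be sketched at the smallest possible scale. The example below is an illustration, not a lesson from the curriculum. It hand-builds a population standard deviation in plain Python (the Build It step), computes the same quantity with the standard library's statistics.pstdev (standing in for the production tool in the Use It step), and diffs the two. The function name build_pstdev and the sample data are assumptions chosen for this sketch.

```python
import math
import statistics

def build_pstdev(xs):
    """Build It: population standard deviation, written out step by step."""
    n = len(xs)
    mean = sum(xs) / n
    # Mean of squared deviations from the mean, then the square root
    variance = sum((x - mean) ** 2 for x in xs) / n
    return math.sqrt(variance)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical sample

built = build_pstdev(data)       # hand-built version
used = statistics.pstdev(data)   # library version, the "Use It" stand-in

# Diff the outputs: the two should agree to floating-point tolerance
diff = abs(built - used)
print(built, used, diff)
```

The same three moves (build, use, diff) carry over unchanged when the library is PyTorch instead of the standard library; only the tolerance in the comparison tends to need loosening.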

Concrete Example: Attention in 30 Lines

import numpy as np

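def attention(Q, K, V, mask=None):
    """Scaled dot-product attention. mask is boolean: True means "may attend"."""
    d_k = Q.shape[-1]
    # Similarity of every query with every key, scaled by sqrt(d_k)
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d_k)
    if mask is not None:
        # Blocked positions get a large negative score, so softmax sends them to ~0
        scores = np.where(mask, scores, -1e9)
    # Numerically stable softmax over the key axis
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V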

# Toy run: 4 tokens, 8-dim
rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out = attention(Q, K, V)
print(out.shape)  # (4, 8)

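The second half of the lesson runs the same example through torch.nn.MultiheadAttention and checks that the two outputs agree to floating-point tolerance. That module also applies learned input and output projections, so the comparison only lines up once those are accounted for. PyTorch becomes "your code, compiled for CUDA."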
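
The toy run leaves the mask argument unused. Here is a minimal sketch of how it could be exercised with a causal mask, along with two properties that can be checked by hand before diffing against any framework. The mask construction and the two checks are assumptions of this sketch, not taken from the lesson.

```python
import numpy as np

def attention(Q, K, V, mask=None):
    # Same function as above, repeated so this block runs on its own
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d_k)
    if mask is not None:
        scores = np.where(mask, scores, -1e9)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))

# Causal mask: True where query i may attend to key j, i.e. j <= i
causal = np.tril(np.ones((4, 4), dtype=bool))
out = attention(Q, K, V, mask=causal)

# Token 0 can only attend to itself, so its output should equal V[0]
first_row_ok = np.allclose(out[0], V[0])
# The last token sees every key, so its row should match the unmasked run
last_row_ok = np.allclose(out[-1], attention(Q, K, V)[-1])
print(out.shape, first_row_ok, last_row_ok)
```

Checks like these are cheap to keep around: if a later framework comparison disagrees, they help separate a masking bug in the hand-built version from a difference in how the framework is configured.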

Gradient Descent Without the Framework

import numpy as np

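def fit(X, y, lr=0.01, steps=1000):
    """Linear regression by full-batch gradient descent on half mean squared error."""
    w = np.zeros(X.shape[1])
    b = 0.0
    n = len(X)
    for _ in range(steps):
        pred = X @ w + b
        err = pred - y                 # residuals
        w -= lr * (X.T @ err) / n      # step against the gradient with respect to w
        b -= lr * err.sum() / n        # step against the gradient with respect to b
    return w, b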

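Six lines of math, three of which are the gradient: the updates step down the gradient of the half mean squared error, (1/(2n)) * sum((pred - y)**2), with respect to w and b. When the same problem is re-run through scikit-learn or PyTorch's torch.optim.SGD with a matching loss, learning rate, and initialization, the loss curves overlay. No more black boxes.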
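
A quick way to check the routine without installing a framework is to diff it against a closed-form solution. The sketch below swaps the scikit-learn and PyTorch comparison for NumPy's np.linalg.lstsq, which solves the same least-squares problem directly. The synthetic data, the true coefficients, and the lr and steps values passed to fit are assumptions made for this example.

```python
import numpy as np

def fit(X, y, lr=0.01, steps=1000):
    # Same routine as above, repeated so this block runs on its own
    w = np.zeros(X.shape[1])
    b = 0.0
    n = len(X)
    for _ in range(steps):
        pred = X @ w + b
        err = pred - y
        w -= lr * (X.T @ err) / n
        b -= lr * err.sum() / n
    return w, b

# Synthetic data (hypothetical): y = 3*x0 - 2*x1 + 0.5 plus small noise
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 2))
y = X @ np.array([3.0, -2.0]) + 0.5 + 0.01 * rng.standard_normal(200)

w, b = fit(X, y, lr=0.1, steps=2000)

# Reference: closed-form least squares on [X, 1]; the last entry is the bias
A = np.column_stack([X, np.ones(len(X))])
ref, *_ = np.linalg.lstsq(A, y, rcond=None)

# Largest gap between the gradient-descent and closed-form parameters
max_diff = np.max(np.abs(np.append(w, b) - ref))
print(w, b, max_diff)
```

Both methods minimize the same squared error, so with enough steps the gradient-descent parameters should land on the closed-form ones. A gap that stays large usually points at a learning rate that is too small for the number of steps, or too large to be stable.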

The Curriculum Structure

The 435 lessons span 20 phases, from math foundations to multi-agent systems. By Phase 10, you write a small LLM; by Phase 14, a working agent loop; by Phase 19, a multi-agent system. Every layer has a hand-built version under the framework version.

Four Reusable Artifacts

Each lesson produces one of:

  • A prompt template for a specific task
  • A skill spec that drops into Claude, Cursor, or Codex
  • An agent definition with a clear job
  • An MCP server that exposes the lesson's code as a tool

By Phase 19, you have hundreds of these—a toolbox for real tasks.

How to Start

Three ways, ordered by friction:

  1. Read in browser: aiengineeringfromscratch.com — no setup.
  2. Clone and run:
    git clone https://github.com/rohitg00/ai-engineering-from-scratch.git
    cd ai-engineering-from-scratch
    python phases/01-math-foundations/01-linear-algebra-intuition/code/vectors.py
    
  3. Install skills into your agent:
    npx skills add rohitg00/ai-engineering-from-scratch
    
    Then run /find-your-level inside the agent. It asks ten questions, then picks a starting phase and gives an estimate in hours.

What This Is Not

Not video lectures. Not copy-paste deploys. Not five-minute explainers. The lessons are dense, the math is real, and everything runs on a laptop. The languages vary by lesson: backprop in Python, attention in TypeScript, a toy GPU kernel in Rust, a Bayesian sampler in Julia. If you only want to call an API, this will feel slow. If you want to know why the API works, this is the route.

Closing Argument

The curriculum is free and MIT-licensed. The rule—build the small version first, then use the framework—held the curriculum together for eighteen months without contradiction. If it helps, star the repo so others find it sooner. If something is missing, open an issue.