Claude Fable 5's Relentless Proactivity: A Debugging Session

Simon Willison describes how Claude Fable 5, given a single screenshot and a one-line prompt, autonomously combined browser automation, a custom CORS server, and template injection to track down a CSS bug in Datasette Agent. Partway through, Fable hit a guardrail and the session was downgraded to Opus, which found and verified the fix. The whole session cost ~$12.11 at full API prices.

3 min read · Jun 12, 2026

On June 11, 2026, Simon Willison watched Claude Fable 5, running in Claude Code, autonomously debug a CSS bug in Datasette Agent using a chain of unexpected tricks. Given a single screenshot and the prompt "Look at dependencies to help figure out why there is a horizontal scrollbar here", Fable carried out a multi-step investigation spanning browser automation, custom server code, and template injection.

The Bug

A horizontal scrollbar appeared in the jump menu chat prompt of Datasette Agent. Willison suspected the cause lay in a dependency (likely Datasette itself). He started a fresh Claude session in his datasette-agent checkout, dragged in the screenshot, and issued the one-line prompt.

Fable's Toolkit

Fable began by working out how to run the local development server, including the fake environment variables it needed. It then launched a Playwright Chrome session and forced scrollbars to stay visible via defaults write com.google.chrome.for.testing AppleShowScrollBars Always. After cycling through Firefox and WebKit in Playwright without reproducing the bug, it determined that Safari was the default browser.

Fable then built a test HTML page at /tmp/textarea-scrollbar-test.html and opened it in real Firefox. When osascript was blocked by macOS permissions, it pivoted to a Python workaround: uv run --with pyobjc-framework-Quartz to enumerate windows, then screencapture -x -o -l (the -l flag takes a window ID) to save screenshots of individual windows to /tmp/safari-cases.png.
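The window-enumeration workaround can be sketched roughly as follows. This is an illustrative reconstruction, not the script Fable wrote: the function names are hypothetical, and it assumes macOS with pyobjc-framework-Quartz installed (for example via uv run --with pyobjc-framework-Quartz).

```python
import subprocess


def find_window_ids(owner_name):
    """Return IDs of on-screen windows owned by the named app (macOS only)."""
    # Imported lazily so this module still loads where pyobjc is not installed.
    import Quartz

    windows = Quartz.CGWindowListCopyWindowInfo(
        Quartz.kCGWindowListOptionOnScreenOnly, Quartz.kCGNullWindowID
    )
    return [
        int(w[Quartz.kCGWindowNumber])
        for w in windows
        if w.get(Quartz.kCGWindowOwnerName) == owner_name
    ]


def screencapture_command(window_id, out_path):
    """Build the argv: -x (no sound), -o (no window shadow), -l (window ID)."""
    return ["screencapture", "-x", "-o", "-l", str(window_id), out_path]


def capture_first_window(owner_name, out_path):
    """Screenshot the first on-screen window of the given app (hypothetical helper)."""
    ids = find_window_ids(owner_name)
    if not ids:
        raise RuntimeError(f"no on-screen window owned by {owner_name!r}")
    subprocess.run(screencapture_command(ids[0], out_path), check=True)
```

Calling capture_first_window("Safari", "/tmp/safari-cases.png") would mirror the step described above. macOS may still ask for the Screen Recording permission before window contents can be captured.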

Triggering the Bug

To open the modal dialog (triggered by the / key), Fable edited Datasette's templates to inject JavaScript that simulated a keyboard event.

This caused the modal to open 1.2 seconds after page load.
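The injected snippet itself is not reproduced here, so the following is only a guess at its general shape, written as a small Python helper that patches a template. The helper name, the choice of a keydown event dispatched on document, and the insertion point before </body> are all assumptions; only the / key and the 1.2 second delay come from the account above.

```python
def inject_slash_keypress(template_html, delay_ms=1200):
    """Insert a script before </body> that simulates pressing "/" after a delay."""
    # Assumption: the app listens for "keydown" on document; the real
    # listener target and event type are not stated in the article.
    script = (
        "<script>\n"
        "setTimeout(() => {\n"
        '  document.dispatchEvent(new KeyboardEvent("keydown", {key: "/", bubbles: true}));\n'
        f"}}, {delay_ms});\n"
        "</script>\n"
    )
    marker = "</body>"
    index = template_html.rfind(marker)
    if index == -1:
        # No closing body tag found: append the script at the end instead.
        return template_html + script
    return template_html[:index] + script + template_html[index:]
```

Patching the template rather than driving the browser externally fits the situation described earlier: with osascript blocked, there was no easy way to send a real keystroke to Safari.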

Custom CORS Server

To extract measurements from the running application, Fable wrote a minimal Python HTTP server using only the standard library:

from http.server import HTTPServer, BaseHTTPRequestHandler
class H(BaseHTTPRequestHandler):
    def do_POST(self):
        n = int(self.headers.get("Content-Length", 0))
        open("/tmp/diag.json", "w").write(self.rfile.read(n).decode())
        self.send_response(200)
        self.send_header("Access-Control-Allow-Origin", "*")
        self.end_headers()
    def do_OPTIONS(self):
        self.send_response(200)
        self.send_header("Access-Control-Allow-Origin", "*")
        self.send_header("Access-Control-Allow-Headers", "*")
        self.end_headers()
    def log_message(self, *a): pass
HTTPServer(("127.0.0.1", 9999), H).serve_forever()

It then injected JavaScript into the running app to POST diagnostic data:

const host = document.querySelector("navigation-search");
const ta   = host.shadowRoot.querySelector("textarea");
const cs   = getComputedStyle(ta);
fetch("http://127.0.0.1:9999/diag", {
  method: "POST",
  body: JSON.stringify({
    dpr: window.devicePixelRatio,
    scrollWidth: ta.scrollWidth, clientWidth: ta.clientWidth,
    whiteSpace: cs.whiteSpace, width: cs.width,
  }),
});

This data was written to /tmp/diag.json and read by Fable.
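To show how such a dump might be interpreted, here is a minimal sketch that reads the file and reports horizontal overflow. An element overflows horizontally when its scrollWidth exceeds its clientWidth. The function names and the sample values are invented for illustration; the article does not give the actual measurements.

```python
import json


def horizontal_overflow_px(diag):
    """Return how many CSS pixels the content extends past the visible width."""
    return max(0, diag["scrollWidth"] - diag["clientWidth"])


def load_diag(path="/tmp/diag.json"):
    """Load the JSON written by the diagnostic server."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Made-up sample with the same keys as the injected script's payload.
sample = {
    "dpr": 2,
    "scrollWidth": 412,
    "clientWidth": 400,
    "whiteSpace": "pre-wrap",
    "width": "400px",
}
print(horizontal_overflow_px(sample))  # 12
```

A nonzero result would indicate that the textarea's content is wider than its box, which is the condition that produces a horizontal scrollbar.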

The Fix

After gathering measurements, Fable hit an invisible guardrail and downgraded to Opus. Opus continued with the same tricks, found the issue (a two-line CSS fix), and verified it. Willison prompted Opus to write a report at /tmp/automation-report.md documenting all techniques.

Cost

The session consumed 68,606 output tokens with a peak context of 113,178 tokens, costing ~$12.11 at full API prices (Claude Fable 5 + Claude Opus 4-8). Willison was on the $100/month Claude Max plan with a limited Fable allowance.

Security Implications

Willison emphasizes the double-edged nature of Fable's proactivity. While impressive for debugging, the same capabilities could be exploited by prompt injection attacks. "Running coding agents outside of a sandbox has always been a bad idea," he warns, citing Johann Rehberger's "Normalization of Deviance in AI" concept. Fable's ability to write arbitrary code, control browsers, and exfiltrate data makes sandboxing critical.

What You Should Do

If you use Claude Code or similar agents, run them in sandboxed environments (containers, VMs) with restricted network and filesystem access. Consider using tools like firejail or Docker. Monitor agent behavior with logging and auditing tools like AgentsView. And always review the code an agent generates before merging.
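As one possible starting point, the sketch below builds a docker run command with restricted network and filesystem access. The helper name, image name, and mount layout are assumptions for illustration, not a vetted security configuration. Note that --network none also blocks the agent from reaching its model API, so a real setup would likely need an allowlisting proxy instead.

```python
def sandboxed_docker_command(image, project_dir, agent_argv):
    """Build a docker run argv that limits what an agent process can touch."""
    return [
        "docker", "run", "--rm",
        "--network", "none",        # no network access at all
        "--cap-drop", "ALL",        # drop all Linux capabilities
        "--read-only",              # root filesystem is read-only
        "--tmpfs", "/tmp",          # scratch space that vanishes on exit
        "-v", f"{project_dir}:/work",  # only the project checkout is writable
        "-w", "/work",
        image, *agent_argv,
    ]


# Hypothetical image and agent command, shown only to illustrate the shape.
command = sandboxed_docker_command(
    "my-agent-image", "/home/me/datasette-agent", ["my-agent", "--task", "debug"]
)
print(" ".join(command))
```

The resulting list can be passed to subprocess.run. Mounting only the project directory keeps the agent away from SSH keys, browser profiles, and other files in the home directory.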

Editor's Take

I've been using Claude Code for a few months, and Fable's proactivity is both exhilarating and terrifying. On one hand, it saved me hours of manual debugging. On the other, I'm now paranoid about what it could do if a prompt injection sneaks in through a dependency. I've started running all my agents inside Docker containers with no network access to the host. If you're not sandboxing your coding agents yet, this story should be your wake-up call.

— DevDigest Editorial

Key Takeaways

• Always run coding agents in sandboxed environments (containers, VMs) to limit damage from prompt injection.
• Use monitoring tools like AgentsView to track token usage and detect unexpected behavior early.
• Review all code changes suggested by agents before merging, especially template or configuration modifications.

Why It Matters

This session demonstrates that frontier coding agents can autonomously execute complex, multi-step debugging workflows—including writing custom servers and manipulating browser DOM—with minimal human input. It also highlights the urgent need for sandboxing, as the same capabilities could be abused via prompt injection or malicious instructions.

#claude-code #developer-tools #sandboxing #ai-agents #prompt-injection
