Claude Wrote 3,000 Lines Instead of Importing pywikibot

A developer tasked Claude Code with fixing wiki typos. Instead of using existing libraries, Claude reinvented pywikibot, mwparserfromhell, and RETF rules—writing ~3,000 lines of Python. The author later replaced it all with 8 lines of imports.

3 min read · May 12, 2026

Claude Reinvented the Wheel—3,000 Lines of It

A developer wanted to fix typos on Fandom wikis. They opened Claude Code with Opus 4.7. By the end of the day, Claude had written ~3,000 lines of Python, reimplementing pywikibot, mwparserfromhell, and Wikipedia's RETF ruleset. It never searched the web for existing libraries.

What Claude Built vs. What Existed

The developer documented the components:

Wikitext stripper: 122 lines of regex handling nested templates, `<pre>` blocks with templates, and color tags. The existing solution: `mwparserfromhell.parse(text).strip_code()`.
Typo dictionary: 18 entries (teh→the, recieve→receive, occured→occurred). RETF has ~4,000 rules, maintained since 2007.
Edit runner: 10 copies, ~250 LOC each, with cookie auth, raw CSRF fetch, maxlag backoff, conflict retry. The migrated version: 8 lines using pywikibot.Page.save().
Cosmetic fixes: Bespoke patterns the developer never asked for. pywikibot/scripts/cosmetic_changes.py has shipped since ~2010.
Wiki family config: 13 hand-rolled SiteDefinitions. pywikibot/families/*.py ships upstream.
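
For comparison, the library-backed versions of the stripper and the edit runner might look roughly like the sketch below. This is not the author's migrated code: the function names `strip_wikitext`, `save_if_changed`, and `run` are hypothetical, and calling `pywikibot.Site()` with no arguments assumes a default wiki is already configured in `user-config.py`.

```python
def strip_wikitext(text):
    """Return the visible text of a wikitext string, with markup removed."""
    # Imported lazily so this module still loads where the library is absent.
    import mwparserfromhell
    return mwparserfromhell.parse(text).strip_code()


def save_if_changed(page, fix, summary):
    """Apply `fix` to page.text and save only if the text actually changed.

    `page` can be any object with a `.text` attribute and a
    `.save(summary=...)` method, such as a pywikibot.Page.
    """
    new_text = fix(page.text)
    if new_text == page.text:
        return False  # nothing to do, so no edit is made
    page.text = new_text
    page.save(summary=summary)
    return True


def run(title, fix, summary="Fix typos"):
    """Hypothetical entry point: load one page by title and fix it."""
    import pywikibot
    site = pywikibot.Site()  # assumption: default site from user-config.py
    page = pywikibot.Page(site, title)
    return save_if_changed(page, fix, summary)
```

Note that `strip_code()` is lossy, so the stripped text is useful for spotting typos but not for saving back; the edit itself goes through `page.text`. Authentication, tokens, and retries are left to pywikibot rather than handled by hand.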

The developer spent the day debugging trivial bugs in the hand-rolled stripper: ASCII art bled into matches, and code blocks got tokenized. Each bug was patched with another regex case. Claude never stopped to ask whether a parser existed.

Migration: Two Minutes of Google

Once the developer searched for existing libraries, they found all three in two minutes. By midnight, lib/ shrank from ~3,000 lines to 1,259. The stripper became a shim over mwparserfromhell. The ten edit runners collapsed into one shim over pywikibot. RETF rules were fetched at runtime.
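
As a rough illustration of what consuming RETF rules at runtime involves, here is a minimal sketch. It assumes the rules arrive as lines of the form `<Typo word="..." find="..." replace="..." />` with `$1`-style backreferences, which is the shape used on the AutoWikiBrowser typos page. The fetch itself is omitted, the two sample rules are made up for this example rather than copied from RETF, and the names `parse_rules` and `apply_rules` are hypothetical. RETF patterns are written for a different regex engine, so some may not compile under Python's `re`; the sketch simply skips those.

```python
import re

# One rule line: <Typo word="..." find="..." replace="..." />
TYPO_LINE = re.compile(
    r'<Typo\s+word="(?P<word>[^"]*)"\s+find="(?P<find>[^"]*)"'
    r'\s+replace="(?P<replace>[^"]*)"\s*/>'
)

# Made-up sample rules in the assumed format (not taken from RETF).
SAMPLE_RULES = r'''
<Typo word="Receive" find="\b([Rr])ecieve(d|s)?\b" replace="$1eceive$2" />
<Typo word="Occurred" find="\b([Oo])ccur(ed|ing)\b" replace="$1ccurr$2" />
'''


def parse_rules(page_text):
    """Turn rule lines into (word, compiled_pattern, replacement) tuples."""
    rules = []
    for match in TYPO_LINE.finditer(page_text):
        try:
            pattern = re.compile(match.group("find"))
        except re.error:
            continue  # skip patterns Python's re cannot compile
        # Convert $1-style backreferences to Python's \g<1> form.
        replacement = re.sub(r"\$(\d+)", r"\\g<\1>", match.group("replace"))
        rules.append((match.group("word"), pattern, replacement))
    return rules


def apply_rules(text, rules):
    """Apply every rule in order and return the corrected text."""
    for _word, pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text


rules = parse_rules(SAMPLE_RULES)
fixed = apply_rules("I recieve mail. It occured twice.", rules)
# fixed == "I receive mail. It occurred twice."
```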

Then Claude argued to keep the typo dictionary. The pitch: RETF is comprehensive but the project has “edge cases” warranting local rules. All 18 entries were already in RETF. Several were written worse. The model was negotiating to preserve work that was strictly dominated by the library it had just imported on instruction.
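
The redundancy claim is easy to check mechanically rather than argue about. The sketch below runs each local entry through the upstream rules and reports which ones the rules already correct to the same result. The first three entries are the examples quoted earlier; `definately` and the `UPSTREAM_RULES` patterns are hypothetical stand-ins added for illustration, not real RETF rules, and `redundant_entries` is a made-up name.

```python
import re

# Local dictionary: typo -> intended correction.
LOCAL_TYPOS = {
    "teh": "the",
    "recieve": "receive",
    "occured": "occurred",
    "definately": "definitely",  # hypothetical entry with no upstream rule
}

# Hypothetical stand-ins for upstream rules: (pattern, replacement).
UPSTREAM_RULES = [
    (re.compile(r"\b([Tt])eh\b"), r"\1he"),
    (re.compile(r"\b([Rr])ecieve"), r"\1eceive"),
    (re.compile(r"\b([Oo])ccur(ed|ing)\b"), r"\1ccurr\2"),
]


def apply_rules(text, rules):
    """Apply every (pattern, replacement) pair in order."""
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text


def redundant_entries(local, rules):
    """Return local typos that the upstream rules already fix identically."""
    return sorted(
        typo for typo, correction in local.items()
        if apply_rules(typo, rules) == correction
    )


covered = redundant_entries(LOCAL_TYPOS, UPSTREAM_RULES)
# covered == ["occured", "recieve", "teh"]
```

Any entry that shows up in `covered` can be deleted without changing behavior; only entries the upstream rules miss have a case for staying local.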

Why This Happens

The author speculates on two causes:

Benchmarks punish library usage: Some public coding benchmarks run sealed—no network, no pip install, no web search. The only way to score is to write code yourself. If models are RL’d against these evals, they learn that reaching for a library is not an option.
Sunk-cost defense: Once 3,000 lines exist in context, the model treats them as load-bearing. The dictionary survived migration not because it was useful, but because it was there.

The author has seen the same pattern elsewhere: Claude writing custom SVG instead of using a charting library, then arguing the SVG is “easier to customize.” It isn’t.

What Developers Should Learn

Always search for prior art before writing code. Two minutes of Google saved hundreds of lines and hours of debugging.
Be skeptical of AI-generated code that reinvents standard libraries. If the model hasn't searched for existing solutions, it's likely building from scratch.
Check benchmark methodology. If models are trained against sealed environments, as the author speculates, they may carry that habit into real-world work and produce bloated, redundant code.
When migrating, don't let the AI negotiate to keep its old code. Claude argued to keep 18 typo entries that were already covered by RETF. The developer had to override.

The Takeaway

Claude Code's behavior mirrors a common developer antipattern: not-invented-here syndrome. But the model doesn't know it's reinventing the wheel; if the author's speculation is right, it is simply optimizing for what it was trained on. The fix is simple: force the model to search the web before writing code, and always verify its output against existing libraries.

This post is licensed under CC BY 4.0 by the author.

Editor's Take

I've been burned by this exact pattern with GitHub Copilot. It once wrote a custom JSON parser instead of using `json.loads()`. The sunk-cost argument is real—I've caught myself defending AI-generated code that I should have deleted. My rule now: before accepting any AI-generated block over 50 lines, I Google for a library first. If one exists, I delete the AI code and import it. The model doesn't know what it doesn't know, but I do.

— DevDigest Editorial

Key Takeaways

•Always search for existing libraries before writing code, even if the AI suggests otherwise.
•When using AI coding assistants, explicitly instruct them to search the web for prior art before generating code.
•Be wary of AI-generated code that reinvents standard functionality—it's often buggy and harder to maintain.

Why It Matters

AI coding assistants like Claude Code can generate massive amounts of code that reinvents standard libraries. Developers need to be aware of this tendency and actively verify that the AI isn't writing unnecessary boilerplate. This story is a cautionary tale about trusting AI-generated code without checking for existing solutions.

#ai #developer-tools #claude #Python #pywikibot
