Claude Reinvented the Wheel—3,000 Lines of It
A developer wanted to fix typos on Fandom wikis. They opened Claude Code with Opus 4.7. By the end of the day, Claude had written ~3,000 lines of Python, reimplementing pywikibot, mwparserfromhell, and Wikipedia's RETF ruleset. It never searched the web for existing libraries.
What Claude Built vs. What Existed
The developer documented the components:
- Wikitext stripper: 122 lines of regex handling nested templates, `<pre>`, tags containing templates, and color tags. The existing solution: `mwparserfromhell.parse(text).strip_code()`.
- Typo dictionary: 18 entries (teh→the, recieve→receive, occured→occurred). RETF has ~4,000 rules, maintained since 2007.
- Edit runner: 10 copies, ~250 LOC each, with cookie auth, raw CSRF fetch, maxlag backoff, conflict retry. The migrated version: 8 lines using `pywikibot.Page.save()`.
- Cosmetic fixes: Bespoke patterns the developer never asked for. `pywikibot/scripts/cosmetic_changes.py` has shipped since ~2010.
- Wiki family config: 13 hand-rolled SiteDefinitions. `pywikibot/families/*.py` ships upstream.
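To give a sense of how small the library-backed versions can be, here is a minimal sketch built on the two calls named in the list. The helper names `strip_wikitext` and `apply_fix` and the edit summary are hypothetical, not the developer's actual code. Imports are deferred so the sketch loads without either library installed, and `apply_fix` accepts any object with a `text` attribute and a `save(summary=...)` method, which is the shape of a `pywikibot.Page`.

```python
from typing import Callable


def strip_wikitext(text: str) -> str:
    """Return the visible prose of a wikitext string.

    Uses the call named in the post. The import is deferred so this
    module loads even when mwparserfromhell is not installed.
    """
    import mwparserfromhell  # third-party: pip install mwparserfromhell

    return mwparserfromhell.parse(text).strip_code()


def apply_fix(page, fix: Callable[[str], str], summary: str) -> bool:
    """Apply `fix` to page.text and save only if something changed.

    `page` is assumed to behave like pywikibot.Page: a writable `text`
    attribute and a `save(summary=...)` method.
    """
    new_text = fix(page.text)
    if new_text == page.text:
        return False  # nothing to do, so skip the edit entirely
    page.text = new_text
    page.save(summary=summary)
    return True
```

With pywikibot configured, a page would typically come from `pywikibot.Page(pywikibot.Site(), "Some title")`. Per the post, `Page.save()` stands in for the hand-rolled cookie auth, CSRF fetch, maxlag backoff, and conflict retry.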
The developer spent the day debugging trivial bugs in the hand-rolled stripper. ASCII art bled into matches, and code blocks got tokenized. Every bug got patched with another regex case. Claude never stopped to ask whether a parser existed.
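The sketch below shows one way a regex stripper can go wrong. It is not the developer's code: the pattern and both inputs are invented for illustration. Python's `re` module cannot match balanced delimiters, so a single non-greedy pattern handles a flat template but leaves debris behind on a nested one.

```python
import re

# A first-attempt pattern: remove anything between "{{" and the next "}}".
NAIVE_TEMPLATE = re.compile(r"\{\{.*?\}\}", re.DOTALL)


def naive_strip(text: str) -> str:
    """Strip templates with a single non-greedy regex (deliberately naive)."""
    return NAIVE_TEMPLATE.sub("", text)


flat = "Intro {{stub}} text"
nested = "Intro {{infobox|name={{PAGENAME}}|type=ship}} text"

print(repr(naive_strip(flat)))    # the flat template is removed cleanly
print(repr(naive_strip(nested)))  # the match ends at the inner "}}", leaving "|type=ship}}"
```

Each leftover like `|type=ship}}` invites another special-case pattern, whereas a parser tracks nesting depth and does not need one.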
Migration: Two Minutes of Google
Once the developer searched for existing libraries, they found all three in two minutes. By midnight, `lib/` shrank from ~3,000 lines to 1,259. The stripper became a shim over mwparserfromhell. The ten edit runners collapsed into one shim over pywikibot. RETF rules were fetched at runtime.
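The post does not show how the fetched rules were consumed. As a rough sketch, assume the rules arrive as `<Typo word="..." find="..." replace="..." />` tags, the form used on the AutoWikiBrowser typos page. Those rules are written for .NET regular expressions, so this version rewrites `$1`-style backreferences and skips any pattern Python's `re` rejects. The two sample rules are invented for illustration and are not copied from RETF; the function names are hypothetical.

```python
import re
from typing import List, Tuple

# Matches one rule tag: <Typo word="..." find="..." replace="..." />
TYPO_TAG = re.compile(
    r'<Typo\s+word="(?P<word>[^"]*)"\s+find="(?P<find>[^"]*)"'
    r'\s+replace="(?P<replace>[^"]*)"\s*/>'
)

Rule = Tuple[str, re.Pattern, str]


def parse_rules(page_text: str) -> List[Rule]:
    """Extract (word, compiled pattern, replacement) triples from rule tags."""
    rules: List[Rule] = []
    for m in TYPO_TAG.finditer(page_text):
        try:
            pattern = re.compile(m.group("find"))
        except re.error:
            continue  # skip rules whose syntax Python's re does not accept
        # Convert .NET-style $1 backreferences to Python's \g<1> form.
        replacement = re.sub(r"\$(\d+)", r"\\g<\1>", m.group("replace"))
        rules.append((m.group("word"), pattern, replacement))
    return rules


def apply_rules(text: str, rules: List[Rule]) -> str:
    """Run every rule over the text, in order."""
    for _word, pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text


# Illustrative rules only, written in the assumed tag format.
SAMPLE = r'''
<Typo word="Teh" find="\b(T|t)eh\b" replace="$1he" />
<Typo word="Receive" find="\b(R|r)ecieve(d|s)?\b" replace="$1eceive$2" />
'''

rules = parse_rules(SAMPLE)
print(apply_rules("Teh team will recieve teh parcel", rules))
```

In practice the rules should be applied to text that has already been stripped of markup, so that templates, links, and code blocks are not "corrected" by accident.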
Then Claude argued to keep the typo dictionary. The pitch: RETF is comprehensive but the project has “edge cases” warranting local rules. All 18 entries were already in RETF. Several were written worse. The model was negotiating to preserve work that was strictly dominated by the library it had just imported on instruction.
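A claim like "the local entries cover edge cases" can be tested mechanically rather than argued. The sketch below is a hypothetical check, not something from the post: it runs each local typo through the upstream rules and reports only the entries the rules fail to fix. The three local entries come from the post; the upstream patterns are simplified stand-ins, not real RETF rules.

```python
import re
from typing import Dict, List, Tuple


def uncovered_entries(
    local: Dict[str, str], rules: List[Tuple[str, str]]
) -> Dict[str, str]:
    """Return the local entries that the rule set does not already fix."""
    compiled = [(re.compile(find), repl) for find, repl in rules]
    leftover: Dict[str, str] = {}
    for typo, correction in local.items():
        fixed = typo
        for pattern, repl in compiled:
            fixed = pattern.sub(repl, fixed)
        if fixed != correction:
            leftover[typo] = correction  # upstream missed or mis-fixed it
    return leftover


# Local dictionary entries quoted in the post.
LOCAL = {"teh": "the", "recieve": "receive", "occured": "occurred"}

# Simplified stand-ins for upstream rules (assumed, for illustration).
UPSTREAM = [
    (r"\b(T|t)eh\b", r"\1he"),
    (r"\b(R|r)ecieve", r"\1eceive"),
    (r"\b(O|o)ccured\b", r"\1ccurred"),
]

print(uncovered_entries(LOCAL, UPSTREAM))
```

An empty result means every local entry is redundant and the dictionary can be deleted. Anything that remains is a concrete, reviewable list of genuine local rules.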
Why This Happens
The author speculates on two causes:
- Benchmarks punish library usage: Some public coding benchmarks run sealed—no network, no pip install, no web search. The only way to score is to write code yourself. If models are RL’d against these evals, they learn that reaching for a library is not an option.
- Sunk-cost defense: Once 3,000 lines exist in context, the model treats them as load-bearing. The dictionary survived migration not because it was useful, but because it was there.
The author has seen the same pattern elsewhere: Claude writing custom SVG instead of using a charting library, then arguing the SVG is “easier to customize.” It isn’t.
What Developers Should Learn
- Always search for prior art before writing code. In this case, two minutes of searching would have saved roughly 1,700 lines and a day of debugging.
- Be skeptical of AI-generated code that reinvents standard libraries. If the model hasn't searched for existing solutions, it's likely building from scratch.
- Check benchmark methodology. If models are trained against sealed environments, they may carry that habit into real projects and produce bloated, redundant code.
- When migrating, don't let the AI negotiate to keep its old code. Claude argued to keep 18 typo entries that were already covered by RETF. The developer had to override.
The Takeaway
Claude Code's behavior mirrors a common developer antipattern: not-invented-here syndrome. But the model doesn't know it's reinventing the wheel—it's just optimizing for what it was trained on. The fix is simple: force the model to search the web before writing code, and always verify its output against existing libraries.
This post is licensed under CC BY 4.0 by the author.




