Opus 4.8 and Sonnet 5 Invent Keys in Nested Tool Calls

Armin Ronacher (creator of Flask) found that Anthropic's latest models, Opus 4.8 and Sonnet 5, frequently add made-up fields to nested tool call arguments. The edit tool in his project Pi accepts an array of edits, each with oldText and newText. The models append keys such as requireUnique, matchCase, oldText2, and even event.0.additionalProperties. The edit content itself is byte-correct, but the extra keys cause schema validation to fail.
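To make the failure concrete, here is a minimal sketch of a strict validator for the edits array. Only the key names oldText and newText and the invented key requireUnique come from the text; the function name, the error wording, and the assumption that both keys are required are illustrative and not taken from Pi.

```python
# Hypothetical reconstruction of the per-edit schema: only `oldText` and
# `newText` are named in the text; treating both as required is an assumption.
ALLOWED_EDIT_KEYS = {"oldText", "newText"}


def validate_edits(edits):
    """Return a list of validation errors; an empty list means the call is valid."""
    errors = []
    for index, edit in enumerate(edits):
        keys = set(edit)
        for key in sorted(ALLOWED_EDIT_KEYS - keys):
            errors.append(f"edits.{index}: missing required key {key!r}")
        for key in sorted(keys - ALLOWED_EDIT_KEYS):
            errors.append(f"edits.{index}: unexpected key {key!r}")
    return errors


good_call = [{"oldText": "foo", "newText": "bar"}]
# Same edit content, plus one of the invented keys mentioned above.
bad_call = [{"oldText": "foo", "newText": "bar", "requireUnique": True}]
```

Under these assumptions, good_call validates cleanly while bad_call is rejected even though its oldText and newText are exactly right, which is the situation described above.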

This regression is surprising: older models (e.g., Opus 4.5) handled the schema correctly. Ronacher tested across multiple sessions and, in one user's transcript, found a failure rate of around 20% for Opus 4.8. Stripping thinking blocks from the history halved that rate, and enabling strict mode eliminated the failures entirely.

Tool Calls Are Learned Text, Not Magic

LLM tool calls rely on in-band signalling: the model generates structured text that the API interprets as a function invocation. For Anthropic models, this appears as XML-like tags:



some/file.py

[
  {
    "oldText": "text to replace",
    "newText": "replacement text"
  }
]



Top-level string parameters appear inline; arrays of objects are embedded as raw JSON. Without grammar-constrained decoding, the model follows learned patterns. Ronacher hypothesizes that Anthropic's post-training (likely including Claude Code) biases models toward Claude Code's flat edit schema (file_path, old_string, new_string, replace_all).

Why Newer Models Are Worse

Claude Code's closed-source harness is extremely forgiving. Its minified code reveals that it accepts parameter aliases (e.g., old_str for old_string, path for file_path), filters out unknown keys, and repairs Unicode escapes. Reinforcement learning in such an environment rewards sloppy tool calls: the harness absorbs the errors, so the model never learns strict schema adherence.
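The kind of leniency described here can be sketched as follows. The two aliases and the four canonical key names come from the text; the function, its name, and the precedence rule are assumptions, since the real harness is closed source and may behave differently.

```python
# Alias table built from the two examples in the text; the real table is unknown.
ALIASES = {"old_str": "old_string", "path": "file_path"}
# Flat edit schema keys as listed in the text.
KNOWN_KEYS = {"file_path", "old_string", "new_string", "replace_all"}


def normalize_lenient(args):
    """Map aliases to canonical names and silently drop unknown keys."""
    normalized = {}
    for key, value in args.items():
        canonical = ALIASES.get(key, key)
        if canonical not in KNOWN_KEYS:
            continue  # Unknown key: dropped without any error reaching the model.
        # Assumption: an explicitly canonical key wins over an alias.
        if canonical not in normalized or key == canonical:
            normalized[canonical] = value
    return normalized


sloppy = {"path": "a.py", "old_str": "x", "new_string": "y", "matchCase": True}
clean = normalize_lenient(sloppy)
```

A call like sloppy succeeds after normalization, so a model trained against such a harness would receive no signal that path, old_str, or matchCase were wrong.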

Ronacher's key insight: "The better-trained model might actually fight you harder because its prior is stronger." Opus 4.8 and Sonnet 5 have a stronger prior that an edit tool should have a flat structure with one optional flag. When faced with Pi's nested edits[] array, they invent plausible names for the perceived missing field.

Strict Mode Fixes It

Anthropic's strict mode enforces JSON schema conformance via grammar-constrained sampling, preventing the model from emitting invalid keys. Ronacher confirmed that turning on strict mode eliminates the failures. However, strict mode imposes complexity limits on tool definitions, which is why Claude Code doesn't use it.
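As a rough illustration, a nested edit tool definition with strict mode requested might look like the dictionary below. The name, description, and input_schema layout follows the usual shape of Anthropic tool definitions, but the strict field name and its placement, the tool name edit, and the path parameter are assumptions that should be checked against the current API reference.

```python
# Sketch only: the `strict` flag, the tool name, and the `path` parameter are
# assumptions; `oldText` and `newText` come from the text.
edit_tool = {
    "name": "edit",
    "description": "Apply one or more exact-text replacements to a file.",
    "strict": True,
    "input_schema": {
        "type": "object",
        "properties": {
            "path": {"type": "string"},
            "edits": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "oldText": {"type": "string"},
                        "newText": {"type": "string"},
                    },
                    "required": ["oldText", "newText"],
                    # Invented keys such as matchCase are ruled out here.
                    "additionalProperties": False,
                },
            },
        },
        "required": ["path", "edits"],
        "additionalProperties": False,
    },
}
```

The schema itself only declares that extra keys are invalid; it is the constrained sampling behind strict mode that keeps the model from emitting them in the first place.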

Comparison with OpenAI

OpenAI's Codex models (tested up to version 5.5) do not exhibit this regression. Their Harmony format uses <|constrain|>json markers that allow the inference stack to switch to JSON-constrained sampling for tool call bodies. This makes schema adherence more reliable.

Practical Implications

  • If you build tool harnesses for Anthropic models, expect nested arrays of objects in tool parameters to come back with invented keys some of the time. Use strict mode or filter unknown keys in your harness before validating.
  • Avoid complex nested schemas. Flat parameter lists (like Claude Code's own tools) are less prone to invented keys.
  • Test with agentic histories. The failure is context-dependent, so single-turn prompts may not trigger it.
  • Consider grammar-constrained decoding for custom harnesses to enforce schema at generation time.
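
The filtering option from the list above can be sketched as a schema-driven pruning pass that runs before validation. This is an illustrative helper, not Pi's or Anthropic's implementation: the function name, the path notation, and the example schema are assumptions, and it understands only a small subset of JSON Schema (objects with properties, arrays with items).

```python
def prune_unknown_keys(value, schema, path="$"):
    """Return (cleaned_value, dropped_paths) for a small JSON Schema subset.

    Only `type: object` with `properties` and `type: array` with `items`
    are understood; any other value passes through unchanged.
    """
    dropped = []
    if schema.get("type") == "object" and isinstance(value, dict):
        properties = schema.get("properties", {})
        cleaned = {}
        for key, item in value.items():
            if key not in properties:
                dropped.append(f"{path}.{key}")  # Record the invented key.
                continue
            cleaned_item, sub = prune_unknown_keys(item, properties[key], f"{path}.{key}")
            cleaned[key] = cleaned_item
            dropped.extend(sub)
        return cleaned, dropped
    if schema.get("type") == "array" and isinstance(value, list):
        cleaned = []
        for index, item in enumerate(value):
            cleaned_item, sub = prune_unknown_keys(item, schema.get("items", {}), f"{path}[{index}]")
            cleaned.append(cleaned_item)
            dropped.extend(sub)
        return cleaned, dropped
    return value, dropped


# Hypothetical schema for a nested edit tool; `path` is an assumed name.
SCHEMA = {
    "type": "object",
    "properties": {
        "path": {"type": "string"},
        "edits": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {"oldText": {"type": "string"}, "newText": {"type": "string"}},
            },
        },
    },
}

call = {"path": "some/file.py", "edits": [{"oldText": "a", "newText": "b", "matchCase": False}]}
cleaned, dropped = prune_unknown_keys(call, SCHEMA)
```

Returning the dropped paths rather than discarding them silently is a deliberate choice: logging them across recorded agentic sessions gives a way to measure how often a given model invents keys.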

The uncomfortable lesson: tool schemas are not neutral. Anthropic's post-training pipeline optimizes for one specific, forgiving tool ecology. Alternative schemas become increasingly off-distribution as models improve. Until Anthropic documents or opens their harness, developers building on their API must account for this silent regression.