Noroboto Attack: Lying Fonts Fool AI and How to Detect Them

Noroboto: When Your Font Lies to Your AI

What if the text you see in a document isn't what the computer reads? That's the premise of Noroboto, a new attack that uses an embedded font to display one set of characters while the file stores different Unicode code points underneath. The result: humans see one thing, but AI agents that extract text see something else, potentially altering legal terms, jurisdictions, or contract clauses.

The attack, developed by the LegalQuants red team, targets the gap between what's rendered on screen and what's stored in Unicode. It's a fresh twist on an old problem: the complexity of document specifications (Word, PDF) and the imperfect implementations of those specs in modern tech stacks.

How Noroboto Works

The core mechanism is simple: create a TrueType font whose character map (cmap) ties ordinary-looking glyphs to Private Use Area (PUA) Unicode code points, then store the document's text using those code points. When the document is rendered, the font supplies a readable glyph for each PUA code point, so the user sees the intended text. But when the text is copy-pasted or processed by an AI agent that reads the stored Unicode values directly, the result is a string of meaningless PUA characters.
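One cheap signal a pipeline could check before trusting extracted text is how much of it falls in the Private Use Areas. The following is a minimal sketch, not part of the researchers' tooling; the function names `is_private_use` and `pua_ratio` are hypothetical, and the ranges are the three PUA blocks defined by Unicode.

```rust
/// Returns true if `c` lies in one of Unicode's three Private Use Areas.
fn is_private_use(c: char) -> bool {
    matches!(
        c as u32,
        0xE000..=0xF8FF            // Private Use Area in the Basic Multilingual Plane
            | 0xF0000..=0xFFFFD    // Supplementary Private Use Area-A
            | 0x100000..=0x10FFFD  // Supplementary Private Use Area-B
    )
}

/// Fraction of non-whitespace characters in `text` that are PUA code points.
/// Returns 0.0 for text with no non-whitespace characters.
fn pua_ratio(text: &str) -> f64 {
    let mut total = 0usize;
    let mut pua = 0usize;
    for c in text.chars().filter(|c| !c.is_whitespace()) {
        total += 1;
        if is_private_use(c) {
            pua += 1;
        }
    }
    if total == 0 {
        0.0
    } else {
        pua as f64 / total as f64
    }
}
```

A ratio near 1.0 would suggest full obfuscation, while any nonzero ratio in a contract is worth a closer look. This check says nothing about fonts that map glyphs to other valid characters.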

In the full obfuscation variant, every character is replaced with a PUA code point. The researchers built a proof of concept with Python and ChatGPT 5.4 in a few hours. Initial 1-to-1 mappings were broken by ChatGPT 5.5 using cryptanalysis and by exploiting a leaked 'name' field in the glyph definitions. They upgraded to a polyalphabetic cipher with 4-to-1 mappings and random assignment, plus glyph perturbations to prevent outline matching.
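To see why a 1-to-1 mapping is weak, note that a simple substitution preserves the frequency profile of the underlying language. The sketch below illustrates that property with a toy mapping; `toy_one_to_one` and `frequency_ranking` are hypothetical names, and the fixed offset into the PUA is an assumption made for illustration, not the researchers' scheme.

```rust
use std::collections::HashMap;

/// Counts how often each code point occurs and returns them most frequent first.
/// Ties are broken by code point so the output is deterministic.
fn frequency_ranking(text: &str) -> Vec<(char, usize)> {
    let mut counts: HashMap<char, usize> = HashMap::new();
    for c in text.chars() {
        *counts.entry(c).or_insert(0) += 1;
    }
    let mut ranked: Vec<(char, usize)> = counts.into_iter().collect();
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
    ranked
}

/// Toy one-to-one substitution: each lowercase ASCII letter moves to a fixed
/// PUA code point (U+E000 plus its alphabet offset). Illustration only.
fn toy_one_to_one(text: &str) -> String {
    text.chars()
        .map(|c| match c {
            'a'..='z' => char::from_u32(0xE000 + (c as u32 - 'a' as u32)).unwrap_or(c),
            _ => c,
        })
        .collect()
}
```

Ranking the ciphertext yields the same sequence of counts as ranking the plaintext, so the most common symbol can be guessed from ordinary letter statistics. Assigning several code points to each letter flattens that profile, which is presumably the motivation for the 4-to-1 upgrade.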

Even then, frontier models using inference-time compute ("thinking" mode) cracked it by rendering the document and running OCR. The key insight: when every character is obfuscated, the extracted text is so obviously broken that the model is prompted to try another approach.

More Effective: Partial Obfuscation and Replacement

The real danger lies in partial obfuscation and replacement. In partial obfuscation, only a few key terms are hidden, such as an NDA clause extending confidentiality to "successors and assigns." AI agents often take the easy path: if most of the extracted text looks valid, they skip expensive OCR. Some inexpensive platforms returned incorrect results for DOCX files.

Replacement is even more insidious. Instead of PUA code points, the font maps glyphs to different valid Unicode characters. In their example, the visible word "Maryland" maps to the Unicode representation of "Delaware." Every platform tested was fooled by DOCX files, and most also trusted the embedded text in PDFs. The researchers attribute this to agent "laziness": preferring cheap Unicode string extraction over rendering and OCR, especially for long documents.
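Because replacement text is valid Unicode, only a comparison against the rendered page can expose it. As a rough sketch, assuming the pipeline already has both the extracted string and an OCR transcript of the same passage, a word-by-word comparison might look like this (`mismatched_words` is a hypothetical helper):

```rust
/// Pairs up whitespace-separated words from the extracted Unicode text and from
/// OCR of the rendered page, returning each position where the two disagree as
/// (word index, extracted word, OCR word).
///
/// Assumes both inputs split into the same number of words. A real pipeline
/// would need proper alignment to tolerate OCR noise, and would likely
/// normalize case and punctuation first.
fn mismatched_words<'a>(extracted: &'a str, ocr: &'a str) -> Vec<(usize, &'a str, &'a str)> {
    extracted
        .split_whitespace()
        .zip(ocr.split_whitespace())
        .enumerate()
        .filter(|(_, (e, o))| e != o)
        .map(|(i, (e, o))| (i, e, o))
        .collect()
}
```

For the article's example, comparing an extracted "governed by the laws of Delaware" against an OCR transcript reading "governed by the laws of Maryland" would report a single mismatch at the final word.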

Mitigation in Rust: Trust, but Verify

To counter this, the Tritium team (the author's company) implemented a mitigation in Rust. The approach: render all ASCII glyphs from the embedded font to an image atlas, run OCR on that atlas, and compare the OCR result to the expected ASCII string using Levenshtein distance.

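// Scores how closely the OCR output matches the expected string.
// 1.0 is a perfect match; the score can drop below 0.0 when `actual` is much
// longer than `expected`. `normalize` is a helper defined elsewhere in the
// project; `levenshtein` comes from the `strsim` crate.
fn character_accuracy(expected: &str, actual: &str) -> f64 {
    let expected = normalize(expected);
    let actual = normalize(actual);
    let distance = strsim::levenshtein(&expected, &actual);
    // Guard against division by zero when the expected string is empty.
    let expected_len = expected.chars().count().max(1);
    1.0_f64 - (distance as f64 / expected_len as f64)
}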

The code iterates over a set of test characters (OCR_ASCII_VALIDATION_CHARACTERS), renders each glyph using the swash crate, and builds a font atlas. It then passes the atlas to a platform-specific OCR engine (the OCR built into macOS or Windows, as of 2026). If the resulting accuracy is below 1.0, meaning OCR did not reproduce every expected character, the font is flagged as deceptive.
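The accuracy check can be tried without external crates. The sketch below is a self-contained variant, not the Tritium code: `normalize` is a guess at the helper the article does not show (here it only strips whitespace), the hand-written `levenshtein` stands in for `strsim::levenshtein`, and `font_looks_deceptive` is a hypothetical wrapper around the threshold described above.

```rust
/// Hypothetical stand-in for the article's `normalize`: drops all whitespace so
/// atlas padding and OCR spacing do not count as errors.
fn normalize(s: &str) -> String {
    s.chars().filter(|c| !c.is_whitespace()).collect()
}

/// Two-row Levenshtein distance over Unicode scalar values.
fn levenshtein(a: &str, b: &str) -> usize {
    let b_chars: Vec<char> = b.chars().collect();
    let mut prev: Vec<usize> = (0..=b_chars.len()).collect();
    let mut curr = vec![0usize; b_chars.len() + 1];
    for (i, ca) in a.chars().enumerate() {
        curr[0] = i + 1;
        for (j, cb) in b_chars.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != *cb);
            curr[j + 1] = substitution.min(prev[j + 1] + 1).min(curr[j] + 1);
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b_chars.len()]
}

/// Same formula as the article's function, using the helpers above.
fn character_accuracy(expected: &str, actual: &str) -> f64 {
    let expected = normalize(expected);
    let actual = normalize(actual);
    let distance = levenshtein(&expected, &actual);
    let expected_len = expected.chars().count().max(1);
    1.0 - (distance as f64 / expected_len as f64)
}

/// Flags a font when OCR of its rendered glyphs does not reproduce the
/// characters that were requested.
fn font_looks_deceptive(expected: &str, ocr_output: &str) -> bool {
    character_accuracy(expected, ocr_output) < 1.0
}
```

Requiring a perfect score is strict: an honest but unusual typeface that confuses the OCR engine would also be flagged, so a production check might need a tolerance or a manual review step.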

The proof of concept focuses on ASCII alphanumeric characters. A production implementation would need to handle full Unicode and use a shaping library like HarfBuzz for complex scripts. The code is intentionally simple: it renders glyphs individually and concatenates them with padding, rather than using a proper layout engine.
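The "render individually, concatenate with padding" step can be pictured as copying small bitmaps into one wide image. This is a simplified sketch under stated assumptions: the `Bitmap` type and `build_atlas` function are hypothetical, glyphs are 8-bit grayscale, and rasterization (done with swash in the article) is assumed to have happened already.

```rust
/// An 8-bit grayscale bitmap in row-major order. Hypothetical type; a
/// rasterizer would supply the pixel data in a real pipeline.
struct Bitmap {
    width: usize,
    height: usize,
    pixels: Vec<u8>,
}

/// Lays glyph bitmaps out left to right on a single row, top-aligned, with
/// `padding` blank pixels around and between them. There is no shaping,
/// kerning, or baseline alignment here.
fn build_atlas(glyphs: &[Bitmap], padding: usize) -> Bitmap {
    let width = padding + glyphs.iter().map(|g| g.width + padding).sum::<usize>();
    let height = glyphs.iter().map(|g| g.height).max().unwrap_or(0) + 2 * padding;
    let mut pixels = vec![0u8; width * height];

    let mut x_offset = padding;
    for glyph in glyphs {
        for row in 0..glyph.height {
            let src = row * glyph.width;
            let dst = (row + padding) * width + x_offset;
            // Copy one scanline of the glyph into its slot in the atlas.
            pixels[dst..dst + glyph.width]
                .copy_from_slice(&glyph.pixels[src..src + glyph.width]);
        }
        x_offset += glyph.width + padding;
    }

    Bitmap { width, height, pixels }
}
```

Padding keeps neighboring glyphs from touching, which helps OCR segment them, but top alignment ignores baselines, which is one reason a layout engine would be needed beyond a proof of concept.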

Why This Matters

Document processing pipelines in legal tech, and increasingly in AI-powered contract review, rely on text extraction that can be fooled by malicious fonts. The Noroboto attack shows that even state-of-the-art AI agents are vulnerable when they take shortcuts. The Rust mitigation offers a practical, open-source starting point for defense: check font integrity before trusting extracted text.

For developers building document processing tools, the takeaway is clear: don't trust the Unicode string without verifying it against the rendered output. OCR is expensive but necessary when the font itself might be lying.

Next Steps

Review your document extraction pipeline for embedded font handling.
Implement glyph accuracy checks before relying on extracted text.
Consider using a font sanitization step that validates cmap mappings against rendered glyphs.
Monitor the LegalQuants repository for updates on the attack and mitigation tools.
