The Ticket That Wasn't a Bug
A frontend ticket landed on an Arabic reader's queue: mixed-content Arabic prose on a dashboard rendered with a ragged left edge (the rag falls left in Arabic, since lines set from the right margin; the ticket said "ragged right"). The design team specified justified text. Three browser screenshots showed the problem. The Latin-script version looked "fine."
The developer spent half an hour with the DOM, setting text-align: justify across font-family and direction declarations. The reply: "This isn't a bug in our stylesheet. It's the state of Arabic typography on the web."
That ticket sat atop the same iceberg as three others: a customer's name unjoined on a printed agreement because the PDF library predated a shaping engine; a search index returning empty for 12,000 names because a 2017 import used 1991 Unicode codepoints instead of 1995 ones. The ragged-left ticket was the smallest of four, but it pointed at the same thing.
What the Scribes Solved
Classical Arabic typography justifies a line without stretching word spaces. Stretched spaces are the Latin convention. In Arabic, scribes extend letterforms along the baseline using taṭwīl (kashida): connecting strokes between certain letter pairs lengthen to carry the line to the margin. A well-set 17th-century Naskh page has every line flush at both margins, with word spaces untouched.
This was a system with a paper trail. Ibn Muqla, Abbasid vizier and chief calligrapher, wrote it down around 940 CE. His system, al-khaṭṭ al-mansūb (proportional script), measured every letterform in rhombic dots of the reed nib. The elongation has its own rules: which letter pairs accept it, how the curve swells, how many elongations a line may carry. Justification is a shaping problem, not a spacing problem.
The tradition was refined by named humans over 600 years. Ibn al-Bawwāb in Baghdad (c. 1022) produced the manuscript that defined Naskh for a millennium. Yāqūt al-Mustaʿṣimī, who survived the Mongol sack of Baghdad in 1258 by climbing a minaret and continuing to write, codified the Six Pens. Persian scribes invented Nastaʿlīq in the 14th century, which justifies by sloping the baseline downward at the end of each phrase. The Ottomans developed Dīwānī, which fills space by interleaving letters at heights ordinary baselines never visit.
Four Shapes for Every Letter
Arabic is cursive always. There is no print-versus-handwriting distinction. Letters connect in stone inscriptions, manuscripts, metal, and screens. Each letter changes shape depending on neighbours: isolated, initial, medial, final. Six letters refuse to connect forward, breaking words into joined clusters.
Unicode gives one codepoint per letter. The font carries four positional glyphs. A shaping engine applies OpenType features (isol, init, medi, fina, plus rlig for ligatures, mark and mkmk for vowel signs) at render time. An Arabic font is a small program. The text you store is its input, not its output.
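To make the four positional forms concrete, here is a minimal sketch of how a shaper might pick isol, init, medi, or fina for each letter. It is an illustration, not HarfBuzz's algorithm: the function name positional_forms and the RIGHT_JOINING set are assumptions for this example. The set covers only the six basic non-forward-joining letters, and the input is assumed to be bare letters without vowel marks. The authoritative joining data lives in Unicode's ArabicShaping.txt.

```python
# Letters that connect to the previous letter but never to the next one.
# Simplified to the six basic letters: alef, dal, thal, reh, zain, waw.
RIGHT_JOINING = set("\u0627\u062F\u0630\u0631\u0632\u0648")


def positional_forms(word):
    """Return the OpenType feature tag a shaper would pick for each letter.

    Assumes `word` holds only basic Arabic letters (no marks, no tatweel).
    """
    forms = []
    for i, ch in enumerate(word):
        # The previous letter reaches forward only if it is not right-joining.
        prev_joins = i > 0 and word[i - 1] not in RIGHT_JOINING
        # This letter reaches forward only if a letter follows and it can join.
        joins_next = i + 1 < len(word) and ch not in RIGHT_JOINING
        if prev_joins and joins_next:
            forms.append("medi")
        elif prev_joins:
            forms.append("fina")
        elif joins_next:
            forms.append("init")
        else:
            forms.append("isol")
    return forms


# kaf, teh, alef, beh: the stored codepoints never change, only the glyph choice.
print(positional_forms("\u0643\u062A\u0627\u0628"))  # ['init', 'medi', 'fina', 'isol']
```

The alef takes its final form and then breaks the cluster, so the closing beh stands isolated. That is the "joined clusters" behaviour described above, and it is computed at render time from the same four stored codepoints.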
The Technical Debt Stack
1. Shaping engines. The web relies on HarfBuzz (or platform equivalents). HarfBuzz is good but not perfect. It doesn't implement kashida justification automatically. CSS text-align: justify stretches spaces, period. No browser exposes a kashida-justify property. The only workaround is inserting U+0640 TATWEEL characters manually, which is what the mockup in the article did — by hand, per line.
2. Font stacks. An Arabic-script font has to cover more than Arabic: Persian, Urdu, Pashto, Kurdish, Uyghur, Kashmiri, and Punjabi. Each adds letters or changes shapes. A font that ignores Persian and Urdu communities produces text that is technically rendered but functionally wrong: wrong kaf terminal, wrong heh fusion, wrong digits. The Noto project ships separate families rather than one font (Noto Naskh Arabic, Noto Nastaliq Urdu, Noto Sans Arabic UI). OS font fallback chains usually get it right, but only usually.
3. Unicode normalization. The 2017 import that used 1991 codepoints instead of 1995 ones is a classic example. The earlier codepoints are "fossil" forms, and search indexes and other string comparisons treat them as different strings from the modern ones. The fix: normalize on import and make sure all text uses the modern codepoints. Note that NFC and NFD leave compatibility characters such as the Arabic Presentation Forms unchanged; if the fossil codepoints are of that kind, folding them requires NFKC or NFKD.
4. PDF libraries. Many server-side PDF generators use old HarfBuzz versions or no shaping engine at all. The result: unjoined letters that look like a 1962 sign-painter's layout. The fix: upgrade to a library that supports OpenType shaping (e.g., pdfkit with harfbuzz, or WeasyPrint with Pango).
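The manual tatweel workaround from item 1 can be sketched in a few lines. This is a deliberately naive illustration, assuming bare letters without vowel marks and the same simplified set of six non-forward-joining letters. The names tatweel_slots and stretch are hypothetical, and the "use the last legal slot" rule is a placeholder, not the scribal rules for which letter pairs accept elongation.

```python
TATWEEL = "\u0640"  # ARABIC TATWEEL, the kashida character

# Simplified: the six basic letters that never connect to the following letter.
RIGHT_JOINING = set("\u0627\u062F\u0630\u0631\u0632\u0648")


def tatweel_slots(word):
    """Indices i where a tatweel can sit between word[i-1] and word[i].

    A slot is legal only if the preceding letter joins forward.
    """
    return [i for i in range(1, len(word)) if word[i - 1] not in RIGHT_JOINING]


def stretch(word, extra):
    """Insert `extra` tatweels at the last legal slot (naive placement)."""
    slots = tatweel_slots(word)
    if not slots or extra <= 0:
        return word  # nothing joins, so the word cannot be elongated
    i = slots[-1]
    return word[:i] + TATWEEL * extra + word[i:]


# kaf, teh, alef, beh: the only forward joins are kaf-teh and teh-alef.
stretched = stretch("\u0643\u062A\u0627\u0628", 2)
print([f"U+{ord(c):04X}" for c in stretched])
```

A real justifier would also need to measure the line, choose how many elongations to spread across it, and prefer some letter pairs over others. That is the part no CSS property does for you.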
What Can You Do?
- Use a proper Arabic font like Amiri (open-source, 150 KB, self-hosted). It includes kashida glyphs. But even Amiri won't auto-justify.
- For justified text, consider server-side rendering with a library that supports kashida insertion. The article's author manually placed U+0640 TATWEEL characters. That's the only reliable method today.
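- Normalize Unicode on all imported text. iconv only converts between encodings and has no NFC target, so use a Unicode-aware tool instead, such as ICU's uconv (uconv -x any-nfc). NFC does not fold legacy presentation forms; those need NFKC.
- Test with real text. Don't assume Latin testing covers Arabic. Use tools like HarfBuzz's hb-view to inspect shaping.
- Upgrade PDF libraries to ones using HarfBuzz >= 2.0. Check your PDF generator's shaping engine version.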
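The normalization step can be checked with Python's standard library. The sketch below assumes the legacy codepoints are Arabic Presentation Forms, which the source does not confirm. The function name fold_presentation_forms is hypothetical. NFKC also folds other compatibility characters (fullwidth forms, some ligatures), so it may change more than Arabic text.

```python
import unicodedata


def fold_presentation_forms(text):
    """Map compatibility characters, including Arabic Presentation Forms,
    to their modern base codepoints using NFKC."""
    return unicodedata.normalize("NFKC", text)


# The same word stored two ways.
modern = "\u0643\u062A\u0627\u0628"  # kaf, teh, alef, beh (base letters)
legacy = "\uFEDB\uFE98\uFE8E\uFE8F"  # kaf initial, teh medial, alef final, beh isolated

# As far as a search index is concerned, these are different strings.
print(legacy == modern)                                 # False

# NFC keeps the legacy forms as they are; only NFKC folds them.
print(unicodedata.normalize("NFC", legacy) == legacy)   # True
print(fold_presentation_forms(legacy) == modern)        # True
```

Running the fold on both the stored names and the incoming query is one way to make old and new records comparable, at the cost of losing whatever distinction the compatibility characters carried.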
The State of Play in 2026
The CSS Working Group has discussed text-justify: kashida but it's not in any spec. Browser vendors show no urgency. The problem is hard: kashida placement requires understanding letter pairs, line metrics, and aesthetic rules that took scribes centuries to codify. Until a spec ships, the workaround is manual or server-side.
The article's author spent "the most enjoyable couple of weeks" tracing the history. The takeaway: Arabic typography on the web is not a bug to be fixed. It's a 1000-year-old system that web standards have not yet caught up with.

