The Numbers First

Level 13 in libdeflate saves 86,990 bytes across the Silesia corpus — a 0.134% improvement over level 12 — at the cost of 1,641,401 ms of extra CPU time. That's a 5,636.7% slowdown. The biggest relative gain is on the nci file: 0.296% smaller. The smallest is sao: 0.004%.

Why Bother?

DEFLATE is everywhere. HTTP content encoding, ZIP archives, PNG images, software distribution, backup tools, embedded systems. The decoder is fixed — you can't change the format. But the encoder has freedom: it chooses match lengths, block boundaries, and Huffman tables. Better choices mean smaller output, and smaller output means faster downloads, less storage, and lower bandwidth costs.

libdeflate is already one of the strongest practical DEFLATE encoders. Level 12 is the baseline. Level 13, contributed as a pull request, pushes the search space to extremes.

What Level 13 Actually Does

Level 13 keeps libdeflate's near-optimal parser but makes every decision more expensive:

  • Full 32 KiB window search. It scans the entire DEFLATE sliding window for matches, not just a subset.
  • Up to 15 optimisation passes. The parser iterates to refine choices.
  • Static Huffman optimisation. Blocks up to 50,000 input bytes get static Huffman table tuning.
  • Delayed block size commitment. For text-like data (no NULL byte, ≤97 distinct bytes in a 64 KiB sample), the soft block size limit rises from 300,000 to 1,000,000 bytes. The idea: stable byte distribution lets one Huffman table cover more data.
  • Broader minimum cost search. The encoder can choose the cheapest offset for each match length, estimate initial Huffman costs from literal and match length statistics, estimate offset slot frequencies from cached matches, and compare measured dynamic Huffman cost against static Huffman and literal-only encodings.
  • Delayed block splitting. Up to nine split candidates are stored with parser state. The compressor scores the full block and a bounded shortest path over candidate intervals. A single split can win on cost; a multi-split path must beat the full block by at least 512 bits. The selected parse is cached for final flushing.

Bounded Slowness

Unlike turtledeflate or broader file optimisers like ECT, level 13 caps its search passes, split candidates, and block sizes. It cannot enter an unbounded optimisation loop. The worst-case runtime is predictable.

The Zero-Regression Policy

Development followed a strict rule: every change must decrease at least one file in the Silesia corpus without increasing any other. Many approaches were tried and rejected. The final configuration is a conservative collection of heuristics that only improve compression.

Silesia Results in Detail

FileLevel 12 Size / TimeLevel 13 Size / TimeSize DiffTime Diff
dickens3,688,552 / 1,289 ms3,684,671 / 83,512 ms-3,881 (-0.105%)+82,223 (+6,378.8%)
mozilla18,267,490 / 4,959 ms18,235,120 / 110,754 ms-32,370 (-0.177%)+105,795 (+2,133.4%)
mr3,448,571 / 1,627 ms3,443,723 / 16,260 ms-4,848 (-0.141%)+14,633 (+899.4%)
nci2,766,224 / 7,935 ms2,758,044 / 673,648 ms-8,180 (-0.296%)+665,713 (+8,389.6%)
ooffice2,998,130 / 424 ms2,995,604 / 8,676 ms-2,526 (-0.084%)+8,252 (+1,946.2%)
osdb3,642,347 / 798 ms3,641,341 / 4,942 ms-1,006 (-0.028%)+4,144 (+519.3%)
reymont1,702,796 / 1,005 ms1,699,494 / 81,839 ms-3,302 (-0.194%)+80,834 (+8,043.2%)
samba5,135,662 / 2,889 ms5,122,812 / 141,227 ms-12,850 (-0.250%)+138,338 (+4,788.4%)
sao5,255,575 / 333 ms5,255,358 / 1,687 ms-217 (-0.004%)+1,354 (+406.6%)
webster11,565,754 / 6,452 ms11,555,293 / 475,196 ms-10,461 (-0.090%)+468,744 (+7,265.6%)
x-ray5,754,248 / 305 ms5,748,141 / 3,276 ms-6,107 (-0.106%)+2,971 (+974.1%)
xml633,760 / 1,104 ms632,518 / 69,504 ms-1,242 (-0.196%)+68,400 (+6,195.7%)
Total64,859,109 / 29,120 ms64,772,119 / 1,670,521 ms-86,990 (-0.134%)+1,641,401 (+5,636.7%)

When to Use Level 13

This level is for data that is compressed once and distributed many times. Think: software packages, firmware images, reference datasets, archived logs. The compression cost is amortised over millions of decompressions. For one-off transfers or frequently recompressed data, level 12 remains the practical choice.

Getting Level 13

The code is in libdeflate's repository. Compile with make and use libdeflate-gzip with -13 or the library API with LIBDEFLATE_LEVEL_13. Expect compilation times to be unaffected — the overhead is only at runtime.

What's Next

Level 13 is a dead end for further algorithmic gains within DEFLATE. The next steps are format-level changes: Brotli, Zstandard, or LZMA. But as long as DEFLATE is the universal fallback, squeezing every byte counts. If you maintain a package distribution pipeline, benchmark level 13 on your assets. The savings may justify a build-time increase.