Hellishly Slow Level 13 Deflate Compression

Hacker News Top Tools

Summary

The article describes libdeflate's new level 13, a deliberately slow DEFLATE compression level that achieves marginally better compression (0.134% on Silesia) at the cost of being 56x slower than level 12, designed for scenarios where data is compressed once and decompressed many times.

No content available
Original Article
View Cached Full Text

Cached at: 06/27/26, 03:49 AM

# Hellishly Slow Level 13 DEFLATE Compression Source: [https://kirill.korins.ky/articles/hellishly-slow-level-13-deflate-compression/](https://kirill.korins.ky/articles/hellishly-slow-level-13-deflate-compression/) ## Abstract DEFLATE level 13 is a deliberately impractical libdeflate compression level: the output remains standard DEFLATE, while the encoder spends far more time searching parse, Huffman, and block split choices\. On Silesia it saves 86'990 bytes, 0\.134%, over level 12 and runs 56\.4x slower; this cost is acceptable only when data is compressed once and distributed many times\. ## Why DEFLATE DEFLATE remains worth optimising because its decoders are already everywhere: HTTP content encoding, ZIP archives, PNG internals, software distribution, backup tools, and embedded formats still use the same LZ77 plus Huffman design\. The decoder contract is fixed, but the encoder still chooses matches, block boundaries, and Huffman tables; better choices improve size without changing compatibility\. The baseline is[libdeflate](https://github.com/ebiggers/libdeflate)level 12, one of the strongest practical DEFLATE encoders\. The level 13 implementation was contributed upstream as[pull request](https://github.com/ebiggers/libdeflate/pull/452)\. ## Level 13 Mechanics Level 13 keeps libdeflate’s near optimal parser, but makes its choices more expensive\. It searches the full 32 KiB DEFLATE window, permits 15 optimisation passes, and applies static Huffman optimisation to blocks up to 50'000 input bytes\. No format extension is involved\. For text like data, level 13 delays block size commitment\. It samples up to 64 KiB from the current block start; if the sample contains no`NULL`byte and at most 97 distinct byte values, the soft block size rises from 300'000 to 1'000'000 bytes\. The assumption is simple: a stable byte distribution lets one Huffman table cover more data\. The parser broadens the minimum cost search\. It can choose the cheapest offset for each match length, estimate initial Huffman costs from literal and match length statistics, estimate offset slot frequencies from cached matches, and compare measured dynamic Huffman cost against static Huffman and literal only encodings\. Block splitting is also delayed\. The compressor stores up to nine split candidates with parser state, then scores the full block and a bounded shortest path over candidate intervals\. A single split can win on cost; a multi split path must beat the full block by at least 512 bits\. The selected parse is cached for final flushing\. The slowness is bounded\. Search passes, split candidates, and block sizes are capped, so level 13 cannot enter an unbounded optimisation loop like[turtledeflate](https://github.com/rwillenbacher/turtledeflate)or broader file optimizers such as[ECT](https://github.com/fhanau/Efficient-Compression-Tool)\. ## Regression Policy Development used a zero compression regression policy against[the Silesia corpus](https://sun.aei.polsl.pl/~sdeor/index.php?page=silesia): many approaches were tried, but only changes that strictly decreased at least one compressed file, without increasing any other compressed file, survived into the final level 13 configuration\. The Silesia corpus is small enough for repeated development, but mixed enough to punish single file tuning: it contains text, binaries, databases, images, and structured data\. ## Silesia Results libdeflate level 12 versus level 13 on Silesiafilelevel 12 size / timelevel 13 size / timesize difftime diffdickens3'688'552 / 1'289 ms3'684'671 / 83'512 ms\-3'881 \(\-0\.105%\)\+82'223 \(\+6378\.8%\)mozilla18'267'490 / 4'959 ms18'235'120 / 110'754 ms\-32'370 \(\-0\.177%\)\+105'795 \(\+2133\.4%\)mr3'448'571 / 1'627 ms3'443'723 / 16'260 ms\-4'848 \(\-0\.141%\)\+14'633 \(\+899\.4%\)nci2'766'224 / 7'935 ms2'758'044 / 673'648 ms\-8'180 \(\-0\.296%\)\+665'713 \(\+8389\.6%\)ooffice2'998'130 / 424 ms2'995'604 / 8'676 ms\-2'526 \(\-0\.084%\)\+8'252 \(\+1946\.2%\)osdb3'642'347 / 798 ms3'641'341 / 4'942 ms\-1'006 \(\-0\.028%\)\+4'144 \(\+519\.3%\)reymont1'702'796 / 1'005 ms1'699'494 / 81'839 ms\-3'302 \(\-0\.194%\)\+80'834 \(\+8043\.2%\)samba5'135'662 / 2'889 ms5'122'812 / 141'227 ms\-12'850 \(\-0\.250%\)\+138'338 \(\+4788\.4%\)sao5'255'575 / 333 ms5'255'358 / 1'687 ms\-217 \(\-0\.004%\)\+1'354 \(\+406\.6%\)webster11'565'754 / 6'452 ms11'555'293 / 475'196 ms\-10'461 \(\-0\.090%\)\+468'744 \(\+7265\.1%\)x\-ray5'754'248 / 305 ms5'748'141 / 3'276 ms\-6'107 \(\-0\.106%\)\+2'971 \(\+974\.1%\)xml633'760 / 1'104 ms632'518 / 69'504 ms\-1'242 \(\-0\.196%\)\+68'400 \(\+6195\.7%\)total64'859'109 / 29'120 ms64'772'119 / 1'670'521 ms\-86'990 \(\-0\.134%\)\+1'641'401 \(\+5636\.7%\)Level 13 saves 86'990 bytes across the corpus, a 0\.134% reduction, and adds 1'641'401 ms of runtime\. The strongest relative result is`nci`at 0\.296%;`sao`changes by only 0\.004%\.

Similar Articles

OpenZL

Lobsters Hottest

OpenZL is a compression library that generates specialized compressors for specific data formats, achieving high compression ratios at high speeds suitable for datacenter workloads like AI processing.

Data Compression Explained (2012)

Hacker News Top

A comprehensive book explaining data compression techniques including information theory, coding methods, modeling, and transforms, targeting programmers with math skills.

DIVE: Embedding Compression via Self-Limiting Gradient Updates

arXiv cs.CL

Proposes DIVE, a compression adapter for embedding dimensionality reduction that uses self-limiting gradient updates and head-wise NT-Xent contrastive loss to prevent overfitting on small datasets, outperforming existing methods on BEIR benchmarks.