huff12 - a 12-stream Huffman decoder for Apple Silicon
Summary
huff12 is a Huffman decoder optimized for Apple Silicon processors. It decodes 12 streams in parallel to improve decoding performance.
Similar Articles
Unsloth on Apple Silicon - Pre-announcement announcement
Unsloth, a popular LLM fine-tuning library, announces upcoming support for Apple Silicon devices, expanding its optimization capabilities beyond NVIDIA GPUs.
I ported EXL3 to run well on Apple Silicon - PonyExl3
The author ported the EXL3 LLM codec to Apple Silicon via Metal, reporting high prefill and generation speeds on an M5 Max (e.g., ~600 tok/s prefill and 17-80 tok/s generation, depending on the model).
@port_dev: https://x.com/port_dev/status/2054259445732110408
The article provides a detailed tutorial on setting up a local coding agent using Qwen3.6-27B via Unsloth Studio and the Pi coding harness. It highlights the benefits of using GGUF quantized models for efficient inference on consumer hardware like Apple Silicon Macs.
PivCo-Huffman
This paper presents PivCo-Huffman, a new approach to Huffman coding using pivot coding from wavelet trees, enabling high-performance SIMD-friendly encoding and decoding. It consistently outperforms state-of-the-art Huffman codecs and shows how ANS coding can be selectively applied to skewed nodes to approach ANS compression ratios while preserving high decompression speeds.
PivCo-Huffman “merge” operations
This blog post analyzes the PivCo-Huffman paper, which introduces 'merge' operations for parallel Huffman decoding, enabling efficient vectorized and GPU-friendly decoding without interleaving overhead.