Tag
A deep dive into a bug where the Polish letter Ś disappeared on Medium, tracing the issue through typewriter history, communism, and Unicode normalization.
This paper presents the first systematic evaluation of cross-family speculative decoding for Polish LLMs on Apple Silicon, extending MLX-LM with UAG to enable cross-tokenizer decoding. It finds that context-aware token translation improves acceptance rates, but unified memory bandwidth limitations prevent theoretical speedup amortization, with best results showing 1.7x throughput gains for structured text.