This article explores using the gzip compression algorithm as a language model. It demonstrates that a compression algorithm can generate text by scoring candidate continuations by their compressed length, with beam search choosing among them to produce the output.
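To make the idea in the gzip summary above concrete, here is a minimal sketch and not the article's actual code. It assumes a character-level vocabulary, a small reference corpus that is prepended to every candidate, and a score equal to the gzip-compressed length of corpus plus candidate (lower is better). The names `compressed_len` and `beam_search` and the parameters `steps` and `beam_width` are hypothetical, chosen only for illustration.

```python
import gzip


def compressed_len(data: bytes) -> int:
    """Length in bytes of the gzip-compressed input."""
    return len(gzip.compress(data))


def beam_search(corpus: str, prompt: str, vocab: list[str],
                steps: int = 5, beam_width: int = 3) -> str:
    """Extend `prompt` by `steps` tokens, keeping the `beam_width`
    candidates whose text compresses best alongside the corpus.

    Assumption: a continuation that resembles the corpus adds fewer
    bytes to the compressed output, so a shorter length means "more likely".
    """
    beams = [prompt]
    for _ in range(steps):
        candidates = []
        for text in beams:
            for tok in vocab:
                cand = text + tok
                score = compressed_len((corpus + cand).encode("utf-8"))
                candidates.append((score, cand))
        # Sort by compressed length; ties fall back to string order.
        candidates.sort()
        beams = [cand for _, cand in candidates[:beam_width]]
    return beams[0]


if __name__ == "__main__":
    corpus = "the cat sat on the mat. " * 20
    vocab = list("abcdefghijklmnopqrstuvwxyz .")
    print(beam_search(corpus, "the c", vocab, steps=8, beam_width=4))
```

Every candidate is recompressed from scratch, so the cost grows with corpus size times vocabulary size times beam width at each step. That is acceptable for a toy demonstration, but it suggests why this style of generation is slow compared with a trained model.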
ThinkBooster is a unified framework for test-time compute scaling of LLM reasoning, providing a modular Python library, a performance-efficiency benchmark, an OpenAI-compatible proxy service, and a visual debugger. Empirical results on math and coding tasks demonstrate practical gains with quality-cost trade-offs.
This paper investigates how 1D coarse-to-fine token structures in autoregressive models improve test-time search efficiency compared to classical 2D grid tokenization. The authors show that such ordered tokens enable better test-time scaling and even training-free text-to-image generation when guided by image-text verifiers.