Tag
Introduces UniTok, a universal tokenizer that transforms continuous time series into discrete tokens, and UniTok-FM, a foundation model pretrained via next-token prediction that enables zero-shot and prompt-boosted forecasting as well as few-shot generation and classification through training-free in-context inference.
3Blue1Brown's new video explains that LLMs are fundamentally compression machines, linking next-token prediction to efficient encoding of human knowledge, which leads to better abstraction and reasoning.
This paper formalizes the sufficiency gap in next-token prediction, demonstrating that even ideal sequence models can become overconfident when textual prefixes are not sufficient statistics for latent circumstances. It proposes an external observer mechanism to reduce but not eliminate this gap.
A critical examination of how AI maximalists celebrate the obsolescence of human labor through next-token prediction, and the socioeconomic risks this attitude poses, particularly for vulnerable populations.
This paper distinguishes three probabilistic objects often conflated in language modeling—the full conditional language process, the marginal text-only law, and the model-induced distribution—and analyzes the conditions under which next-token prediction is useful, with RAG and tools interpreted as conditional sufficiency devices.
Explains what large language models actually do (next-token prediction) and why they sound confident even when wrong. Offers a mental model and verification checklist for using LLMs safely.
A critique of the oversimplified claim that LLMs are 'just next token predictors,' arguing that prediction at scale induces useful representations and capabilities, and that such dismissals confuse objective with learned system.
This paper introduces Conditional Attribute Transformers, a method for jointly estimating next-token probability and attribute values conditionally, enabling credit assignment, counterfactual analysis, and steerable generation in a single forward pass.
ATLAS presents a visual reasoning framework that combines agentic operations and latent representations using functional tokens, enabling efficient training via next-token prediction and reinforcement learning while avoiding intermediate image generation.
TPA proposes a novel method for detecting hallucinations in RAG systems by attributing next-token probabilities to seven distinct sources (Query, RAG Context, Past Token, Self Token, FFN, Final LayerNorm, Initial Embedding) and aggregating by Part-of-Speech tags. The approach achieves state-of-the-art performance across five LLMs including Llama2, Llama3, Mistral, and Qwen.