Formalizing Latent Thoughts: Four Axioms of Thought Representation in LLMs

Hugging Face Daily Papers 05/07/26, 12:00 AM Papers

Summary

Introduces an axiomatic evaluation framework for latent thought representations in LLMs, revealing that current representations fail to satisfy four fundamental functional axioms (Causality, Minimality, Separability, Stability) across 23 reasoning tasks, indicating a structural gap in representation quality.

We introduce an axiomatic evaluation framework for latent thought representations in LLMs, comprising metrics that are independent of downstream benchmark scores and reveal representational failures that benchmark accuracy masks. Existing evaluations conflate representation quality with model capacity. Therefore, failures cannot be attributed to the representation rather than to the model that processes it. We formalize four functional axioms (Causality, Minimality, Separability, and Stability) and define a quantitative measure for each, computed directly on the representation independently of downstream accuracy. We audit open-weight LLMs across 23 reasoning tasks (e.g., Spatial Reasoning, Factual QA). We find that no candidate satisfies all four axioms simultaneously, that the representations distinguish task type reliably but cannot distinguish between two questions within the same task, and that the representations encode little information beyond what is already present in the input embedding. The failure is consistent across dense, reasoning-distilled, and RL-trained model families, indicating that the gap is structural rather than a property of model size or training procedure.

Original Article


Cached at: 06/29/26, 02:00 AM

Paper page - Formalizing Latent Thoughts: Four Axioms of Thought Representation in LLMs

Source: https://huggingface.co/papers/2606.27378

Abstract

An axiomatic evaluation framework reveals systematic failures in latent thought representations of LLMs across multiple reasoning tasks, demonstrating that current representations fail to satisfy fundamental functional axioms consistently across different model architectures.

We introduce an axiomatic evaluation framework for latent thought representations in LLMs, comprising metrics that are independent of downstream benchmark scores and reveal representational failures that benchmark accuracy masks. Existing evaluations conflate representation quality with model capacity. Therefore, failures cannot be attributed to the representation rather than to the model that processes it. We formalize four functional axioms (Causality, Minimality, Separability, and Stability) and define a quantitative measure for each, computed directly on the representation independently of downstream accuracy. We audit open-weight LLMs across 23 reasoning tasks (e.g., Spatial Reasoning, Factual QA). We find that no candidate satisfies all four axioms simultaneously, that the representations distinguish task type reliably but cannot distinguish between two questions within the same task, and that the representations encode little information beyond what is already present in the input embedding. The failure is consistent across dense, reasoning-distilled, and RL-trained model families, indicating that the gap is structural rather than a property of model size or training procedure.
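The abstract does not spell out how each axiom is measured, so the following is only an illustrative sketch of the kind of check the Separability finding suggests. It is not the paper's metric. It compares the mean cosine distance between representations from different tasks with the mean distance between different questions in the same task. The function names `separability_scores` and `toy_representations`, the cosine-distance choice, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def separability_scores(reps, task_ids):
    """Return (between_task, within_task) mean cosine distances.

    Hypothetical helper, not the paper's definition of Separability.
    reps: array of shape (n_questions, dim), one vector per question.
    task_ids: array of shape (n_questions,), task label per question.
    """
    reps = np.asarray(reps, dtype=float)
    task_ids = np.asarray(task_ids)

    # Normalize rows so that the dot product is cosine similarity.
    unit = reps / np.linalg.norm(reps, axis=1, keepdims=True)
    dist = 1.0 - unit @ unit.T

    same_task = task_ids[:, None] == task_ids[None, :]
    off_diagonal = ~np.eye(len(reps), dtype=bool)

    # Pairs of different questions that share a task.
    within = dist[same_task & off_diagonal].mean()
    # Pairs of questions drawn from different tasks.
    between = dist[~same_task].mean()
    return between, within


def toy_representations(n_tasks=3, per_task=4, dim=16, noise=0.01, seed=0):
    """Synthetic data (an assumption): one centroid per task, tiny per-question noise."""
    rng = np.random.default_rng(seed)
    centroids = rng.normal(size=(n_tasks, dim))
    reps = np.repeat(centroids, per_task, axis=0)
    reps = reps + noise * rng.normal(size=reps.shape)
    task_ids = np.repeat(np.arange(n_tasks), per_task)
    return reps, task_ids


if __name__ == "__main__":
    reps, task_ids = toy_representations()
    between, within = separability_scores(reps, task_ids)
    print(f"between-task distance: {between:.4f}")
    print(f"within-task distance:  {within:.6f}")
```

On this toy data the between-task distance is far larger than the within-task distance, which mirrors the reported pattern: task type is easy to tell apart while questions inside one task are nearly indistinguishable. Real use would replace `toy_representations` with vectors extracted from a model.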


Get this paper in your agent:

hf papers read 2606.27378

Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash



Similar Articles

Why LLMs Hallucinate on Structured Knowledge: A Mechanistic Analysis of Reasoning over Linearized Representations

The strange thing about LLM reasoning research: we're now trying to remove the chain-of-thought traces

Learning to Refine Hidden States for Reliable LLM Reasoning

LGMT: Logic-Grounded Metamorphic Testing for Evaluating the Reasoning Reliability of LLMs

The Periodic Table of LLM Reasoning: A Structured Survey of Reasoning Paradigms, Methods, and Failure Modes
