Shaping Schema via Language Representation as the Next Frontier for LLM Intelligence Expanding
Summary
This paper argues that designing advanced language representations to shape cognitive schemas is a key frontier for expanding LLM intelligence without scaling parameters. It provides formalizations and empirical evidence showing that different linguistic structures significantly impact model performance and internal feature activations.
Cached at: 05/12/26, 10:53 AM
Source: https://huggingface.co/papers/2605.09271
Abstract
Language representation design significantly impacts large language model performance and internal feature activations, offering a promising research direction for enhancing model intelligence without scaling or parameter modifications.
Although natural language is the default medium for Large Language Models (LLMs), its limited expressive capacity creates a profound bottleneck for complex problem-solving. While recent advancements in AI have relied heavily on scaling, merely internalizing knowledge does not guarantee its effective application. Defining language representation as the linguistic and symbolic constructs used to map and model the real world, this paper argues that shaping schemas through advanced language representation is the next frontier for expanding LLM intelligence. We posit that an LLM's knowledge activation and organization -- its schema -- depends heavily on the structural and symbolic sophistication of the language used to represent a given task. This paper contributes both a formalization of this claim and the empirical evidence to support it. With a new formalization, we present multiple lines of evidence to support our position: First, we review recent empirical practices and emerging methodologies that demonstrate the substantial performance gains achievable through deliberate language representation design, even without modifying model parameters or scale. Second, we conduct controlled experiments showing that LLM performance and its internal feature activations vary under different language representations of the same underlying task. Together, these findings highlight language representation design as a promising direction for future research.
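The controlled-experiment setup described above can be illustrated with a minimal sketch (not the authors' code): the same underlying task is rendered under two different language representations, which can then be fed to a model and scored separately. The task, the two rendering functions, and the `3 + 4` example are illustrative assumptions, not from the paper.

```python
# Minimal sketch of the "same task, different representations" setup.
# Both functions encode the identical underlying task; only the linguistic
# and symbolic form differs, which is the variable the paper manipulates.

def natural_language(a: int, b: int) -> str:
    """Render the addition task as a natural-language word problem."""
    return f"If you have {a} apples and get {b} more, how many do you have?"

def symbolic(a: int, b: int) -> str:
    """Render the same addition task in compact symbolic notation."""
    return f"{a} + {b} = ?"

# In an actual experiment, each rendering would be sent to the same model
# and accuracy (or internal feature activations) compared across forms.
task_nl = natural_language(3, 4)
task_sym = symbolic(3, 4)
```

Because both prompts determine the same answer, any performance gap between them can be attributed to the representation rather than the task content.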
Get this paper in your agent:
hf papers read 2605.09271
Don't have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash
Similar Articles
LLM Neuroanatomy III - LLMs seem to think in geometry, not language
Researcher analyzes LLM internal representations across 8 languages and multiple models, finding that concept thinking occurs in geometric space in middle transformer layers independent of input language, supporting a universal deep structure hypothesis similar to Chomsky's theory rather than Sapir-Whorf linguistic relativism.
Can RL Teach Long-Horizon Reasoning to LLMs? Expressiveness Is Key
This paper introduces ScaleLogic, a framework demonstrating that RL training compute scales as a power law with reasoning depth in LLMs. It highlights that logical expressiveness is key to improving downstream transfer and training efficiency.
Learning to reason with LLMs
OpenAI publishes an article exploring reasoning techniques with LLMs through cipher-decoding examples, demonstrating step-by-step problem-solving approaches and pattern recognition in language models.
Towards Intrinsic Interpretability of Large Language Models: A Survey of Design Principles and Architectures
A comprehensive survey reviewing recent advances in intrinsic interpretability for Large Language Models, categorizing approaches into five design paradigms: functional transparency, concept alignment, representational decomposability, explicit modularization, and latent sparsity induction. The paper addresses the challenge of building transparency directly into model architectures rather than relying on post-hoc explanation methods.
Disentangling Mathematical Reasoning in LLMs: A Methodological Investigation of Internal Mechanisms
This paper investigates how large language models perform arithmetic operations by analyzing internal mechanisms through early decoding, revealing that proficient models exhibit a clear division of labor between attention and MLP modules in reasoning tasks.