Tag
This paper introduces the audit gap between behavioral safety and representation-level robustness in LLMs, and proposes an intervention-based evaluation framework and the Latent Vulnerability Score (LVS) to measure hidden vulnerabilities.
This paper introduces Thoughts-as-Planning, a framework that models chain-of-thought optimization as sequential decision-making using latent world models and reinforcement learning, outperforming existing methods in efficiency and generalization.
Latte introduces a framework that represents personalization as forecasting a peer-anchored relative preference state using latent trajectories, injecting a soft token into a frozen LLM to achieve personalized generation. It outperforms existing personalization methods on Amazon Reviews 2023 and MemoryCD datasets.
ATLAS presents a visual reasoning framework that combines agentic operations and latent representations using functional tokens, enabling efficient training via next-token prediction and reinforcement learning while avoiding intermediate image generation.
This paper establishes nonparametric identifiability guarantees for extracting task-relevant representations from generalist models, proving that task structure is identifiable across time steps and that latent representations are identifiable within each step under sparsity regularization.