ThoughtTrace: Understanding User Thoughts in Real-World LLM Interactions
Summary
ThoughtTrace introduces a large-scale dataset pairing real-world multi-turn human-AI conversations with users' self-reported thoughts, enabling improved user behavior prediction and personalized assistant training through thought-guided rewrites.
Cached at: 05/21/26, 06:12 AM
Source: https://huggingface.co/papers/2605.20087
Abstract
Conversational AI has now reached billions of users, yet existing datasets capture only what people say, not what they think. We introduce ThoughtTrace, the first large-scale dataset that pairs real-world multi-turn human-AI conversations with users’ self-reported thoughts: their reasons for sending prompts and reactions to assistant responses. ThoughtTrace comprises 1,058 users, 2,155 conversations, 17,058 turns, and 10,174 thought annotations collected across 20 language models. Our analysis shows that ThoughtTrace captures long-horizon, topically diverse interactions, and that thoughts are semantically distinct from messages, difficult for frontier LLMs to infer from context, diverse in content, and tied to conversation stages. We further demonstrate the utility of thoughts for downstream modeling. First, thoughts improve user-behavior prediction as inference-time context. Second, thought-guided rewrites provide fine-grained alignment signals for training personalized assistants. Together, ThoughtTrace establishes user thoughts as a new data modality for studying the cognitive dynamics behind human-AI interaction and provides a foundation for building assistants that better understand and adapt to users’ latent goals, preferences, and needs.
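To give a rough sense of scale, the totals reported in the abstract can be turned into simple averages. The sketch below is illustrative arithmetic only: the four counts come from the abstract, while the dictionary keys and the `average_ratios` helper are hypothetical names introduced here. The abstract does not say how turns or annotations are distributed across users, so these are dataset-wide means, not per-user statistics.

```python
# Totals as reported in the ThoughtTrace abstract.
COUNTS = {
    "users": 1058,
    "conversations": 2155,
    "turns": 17058,
    "thought_annotations": 10174,
}


def average_ratios(counts):
    """Return dataset-wide averages derived from the reported totals.

    Hypothetical helper for illustration; it is not part of any
    ThoughtTrace release.
    """
    return {
        "conversations_per_user": counts["conversations"] / counts["users"],
        "turns_per_conversation": counts["turns"] / counts["conversations"],
        "thoughts_per_conversation": counts["thought_annotations"] / counts["conversations"],
        "thoughts_per_turn": counts["thought_annotations"] / counts["turns"],
    }


if __name__ == "__main__":
    for name, value in average_ratios(COUNTS).items():
        print(f"{name}: {value:.2f}")
```

On these totals, each user contributes about two conversations on average, a conversation runs to roughly eight turns, and there are about 0.6 thought annotations per turn.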
Datasets citing this paper: 1
SCAI-JHU/ThoughtTrace
Similar Articles
@thinkymachines: While Lilian is telling a story, the interaction model can track when she is thinking, yielding, self-correcting, or in…
The article highlights a research update describing an interaction model capable of tracking cognitive states like thinking, yielding, and self-correction during storytelling without a built-in dialogue management system.
How we made continuous trace intelligence possible at scale (8 minute read)
Braintrust's Topics feature uses LLM summarization to make production agent traces tractable for clustering and classification at scale, inspired by Anthropic's Clio approach.
"I didn't Make the Micro Decisions": Measuring, Inducing, and Exposing Goal-Level AI Contributions in Collaboration
Introduces CoTrace, a framework for goal-level attribution in human-AI collaboration, which analyzes how large language models shape goals by contributing concrete requirements and indirect influences in dialogue turns.
Does Theory of Mind Improvement Really Benefit Human-AI Interactions? Empirical Findings from Interactive Evaluations
This paper proposes a new interactive evaluation paradigm for Theory of Mind in LLMs, finding that improvements on static benchmarks do not translate to better performance in dynamic human-AI interactions, highlighting the need for interaction-based assessments.
A Model of Multi-turn Human Persuadability Using Probabilistic Belief Tracing
This paper introduces PersuasionTrace, a framework for studying multi-turn persuasion in human-LLM interaction, using a Bayesian-network simulated target that models belief updates. The framework reveals that LLMs are persuasive across topics and modalities, and that the Bayesian target better matches human belief dynamics than vanilla LLM simulators.