This research paper investigates functional metacognition in Large Language Models, demonstrating that internal states like evaluation awareness and self-assessed capability are linearly decodable from residual stream activations. The authors propose a mechanistic framework to steer these states, showing causal control over reasoning behaviors, verbosity, and safety responses.
This study presents a 33-model atlas of domain-level metacognitive monitoring in frontier LLMs using the MMLU benchmark, revealing significant variation in confidence calibration across knowledge domains that aggregate metrics obscure.
This paper presents WriteFlow, a voice-based AI writing assistant designed to support reflective academic writing through goal-oriented interaction, addressing the limitations of efficiency-focused writing tools by scaffolding metacognitive regulation and goal articulation. Findings from a Wizard-of-Oz study with 12 expert users show that the system effectively supports iterative goal refinement and goal-text alignment during drafting.
A new cross-domain benchmark, the Metacognitive Monitoring Battery, uses 524 items to evaluate LLM self-monitoring across six cognitive domains with human psychometric methodology. Applied to 20 frontier LLMs, it reveals three distinct metacognitive profiles and shows that models' rankings by task accuracy and by metacognitive sensitivity are largely inverted.