Capability Conditioned Scaffolding for Professional Human LLM Collaboration
Summary
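Introduces Capability Conditioned Scaffolding, a typed framework for human-LLM collaboration that conditions intervention behavior on a user's strong, mixed, and weak expertise domains to counter Professional Domain Drift (reliance on AI-generated reasoning in domains the user cannot reliably evaluate), with a pilot evaluation on MMLU subsets across four LLM substrates.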
# Capability Conditioned Scaffolding for Professional Human LLM Collaboration

Source: [https://arxiv.org/abs/2605.15404](https://arxiv.org/abs/2605.15404) [View PDF](https://arxiv.org/pdf/2605.15404)

> Abstract: Large language model personalization typically adapts outputs to user preferences and style but does not account for differences in user evaluation capacity across domains of expertise. This limitation can encourage Professional Domain Drift, where users rely on AI generated reasoning in domains they cannot reliably evaluate. We introduce Capability Conditioned Scaffolding, a typed framework that partitions expertise into strong, mixed, and weak domains and conditions intervention behavior on structured capability profiles. A pilot evaluation across multiple MMLU subsets and four LLM substrates shows consistent profile conditioned intervention behavior, including categorical inversion under profile swapping and selective activation in mixed domain risk zones. These findings suggest that capability aware scaffolding can support more reliable professional human AI collaboration beyond stylistic personalization.

## Submission history

From: Sen Yang [[view email](https://arxiv.org/show-email/414fc87a/2605.15404)]

**[v1]** Thu, 14 May 2026 20:42:03 UTC (559 KB)
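
The abstract describes a typed partition of expertise (strong, mixed, weak) and intervention behavior conditioned on a capability profile. The following is a minimal sketch of how such conditioning might look, not the paper's implementation. Every name here (`Capability`, `Intervention`, `CapabilityProfile`, `choose_intervention`), the three intervention levels, the fallback to weak for unlisted domains, and the example domain names are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Capability(Enum):
    """The three expertise tiers named in the abstract."""
    STRONG = "strong"
    MIXED = "mixed"
    WEAK = "weak"


class Intervention(Enum):
    """Hypothetical intervention levels; the abstract does not enumerate them."""
    MINIMAL = "minimal"    # user can evaluate the output unaided
    TARGETED = "targeted"  # scaffold only the risky parts of a mixed domain
    FULL = "full"          # user cannot reliably evaluate the output


# Assumed policy: less user capability means more scaffolding.
POLICY = {
    Capability.STRONG: Intervention.MINIMAL,
    Capability.MIXED: Intervention.TARGETED,
    Capability.WEAK: Intervention.FULL,
}


@dataclass
class CapabilityProfile:
    """Structured capability profile: a map from domain name to tier."""
    domains: dict[str, Capability]

    def level(self, domain: str) -> Capability:
        # Assumption: a domain missing from the profile is treated as weak.
        return self.domains.get(domain, Capability.WEAK)


def choose_intervention(profile: CapabilityProfile, domain: str) -> Intervention:
    """Condition the intervention on the user's tier in the query's domain."""
    return POLICY[profile.level(domain)]


# Two illustrative profiles with swapped strengths (hypothetical domain names).
clinician = CapabilityProfile({"medicine": Capability.STRONG, "law": Capability.WEAK})
lawyer = CapabilityProfile({"medicine": Capability.WEAK, "law": Capability.STRONG})

for name, profile in [("clinician", clinician), ("lawyer", lawyer)]:
    for domain in ("medicine", "law"):
        print(name, domain, choose_intervention(profile, domain).value)
```

Under this sketch, swapping the two profiles flips the chosen intervention for the same question, which is one plausible reading of the "categorical inversion under profile swapping" reported in the abstract. A lookup table keeps the policy explicit and easy to audit; a real system would presumably derive both the profile and the domain label from richer signals.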
Similar Articles
Coordinates of Capability: A Unified MTMM-Geometric Framework for LLM Evaluation
This Systematization of Knowledge paper proposes a unified Multi-Trait Multi-Method (MTMM) geometric framework for evaluating Large Language Models, unifying disparate metrics into a shared latent coordinate space to address construct validity issues in current benchmarks.
LLMs Know When They Know, but Do Not Act on It: A Metacognitive Harness for Test-time Scaling
This paper proposes a metacognitive harness that separates monitoring from reasoning in LLMs, using pre-solve feeling-of-knowing and post-solve judgment-of-learning signals to control when to trust, retry, or aggregate answers, improving accuracy on text, code, and multimodal benchmarks without parameter updates.
Domain-level metacognitive monitoring in frontier LLMs: A 33-model atlas
This study presents a 33-model atlas analyzing domain-level metacognitive monitoring in frontier LLMs using MMLU benchmarks, revealing significant variations in confidence calibration across different knowledge domains that are obscured by aggregate metrics.
Learning Transferable Latent User Preferences for Human-Aligned Decision Making
This paper introduces CLIPR, a framework that learns transferable latent user preferences from minimal conversational input to improve human-aligned decision making in LLMs.
SkillMaster: Toward Autonomous Skill Mastery in LLM Agents
This paper introduces SkillMaster, a training framework that enables LLM agents to autonomously create, refine, and select skills through trajectory-informed review and counterfactual utility evaluation.