Tag
The paper introduces the offloading score, a metric of AI reliance that quantifies the fraction of cognitive effort offloaded to an AI tool using counterfactual workflows. The metric is validated through intrinsic evaluations and a user study with developers, which show that it detects increased reliance under time pressure better than existing measures.
This paper introduces HyperLens, a high-resolution probe that quantifies cognitive effort in LLMs by tracing fine-grained confidence trajectories across layers. It reveals that complex tasks require higher cognitive effort and shows that Supervised Fine-Tuning can reduce this effort, which may degrade performance.