KWBench: Measuring Unprompted Problem Recognition in Knowledge Work

Hugging Face Daily Papers · 2026-04-17

KWBench is a benchmark of 223 professional tasks that evaluates whether LLMs can recognize the underlying game-theoretic structure of a situation without being prompted to look for it. Even the best model succeeds on only 27.9% of tasks. The benchmark targets unprompted problem recognition, a step that precedes task execution, across domains such as acquisitions, clinical pharmacy, and fraud analysis.
