diagnostic-framework

#diagnostic-framework

ToolSense: A Diagnostic Framework for Auditing Parametric Tool Knowledge in LLMs

arXiv cs.AI ↗ · 6d ago Cached

ToolSense is an open-source diagnostic framework that generates three benchmarks (realistic retrieval, MCQ probing, QA probing) to audit LLMs' parametric tool knowledge, revealing a knowledge-retrieval dissociation where strong retrieval performance can coexist with poor factual understanding.

0 favorites 0 likes

#diagnostic-framework

What Do Biomedical NER and Entity Linking Benchmarks Measure? A Corpus-Centric Diagnostic Framework

arXiv cs.CL ↗ · 2026-05-21 Cached

This paper presents a corpus-centric diagnostic framework for analyzing biomedical NER and EL benchmarks, revealing substantial differences across nine corpora and arguing that standard statistics are insufficient for characterizing evaluation demands.

0 favorites 0 likes

diagnostic-framework

ToolSense: A Diagnostic Framework for Auditing Parametric Tool Knowledge in LLMs

What Do Biomedical NER and Entity Linking Benchmarks Measure? A Corpus-Centric Diagnostic Framework

Submit Feedback