Tag
Introduces LingxiDiagBench, a large-scale multi-agent benchmark for evaluating LLMs on Chinese psychiatric consultation and diagnosis. Models reach high accuracy on binary classification but perform poorly on multi-way differential diagnosis. The findings also highlight a decoupling between conversational quality and diagnostic accuracy.
WiseMind is a knowledge-guided multi-agent framework for LLM-based psychiatric diagnosis. It combines a "Reasonable Mind" agent for evidence-based logic with an "Emotional Mind" agent for empathetic communication, and achieves 85.6% diagnostic accuracy on simulated and real patient interactions. The framework leverages DSM-5 structured knowledge graphs to reduce hallucinations and outperforms single-agent baselines by 15 to 54 percentage points while remaining clinically sound and psychologically supportive.