Tag
Introduces PhysAssistBench, a benchmark for evaluating LLMs on interactive doctor-patient-EHR assistance. Experiments show that current models are unreliable in this setting, which highlights the need for coordinated capabilities.