personal-health-agents

RubricsTree: Scalable and Evolving Open-Ended Evaluation of Personal Health Agents across Health Memory and Medical Skills

arXiv cs.CL · 4d ago

RubricsTree proposes a scalable, expert-aligned evaluation framework for personal health agents using over 100 atomic Boolean rubrics, achieving up to 66% relative gains on HealthBench across Gemini, GPT, and Qwen model families.
