Tag
RubricsTree proposes a scalable, expert-aligned evaluation framework for personal health agents built on more than 100 atomic Boolean rubrics, and reports relative gains of up to 66% on HealthBench across the Gemini, GPT, and Qwen model families.