Tag
This paper proposes a system that combines a prerequisite knowledge graph with a PPO-based policy to structure Socratic tutoring with LLMs, showing improved student mastery and efficiency over heuristic and frontier model baselines.
A Stanford Law School study found that law professors rated LLM-generated answers higher than peer answers in a blinded evaluation of short-answer tutoring in contracts courses, with LLM answers winning 75.33% of comparisons and being flagged as harmful less often.
This paper introduces EduAgentBench, a source-grounded benchmark for evaluating tutor agents across professional pedagogical judgment, multi-turn tutoring, and autonomous teaching workflow execution. Evaluations of frontier models show that they still fall short of professional teaching standards in situated tutoring and workflow tasks.