Tag
Introduces EduAgentBench, a source-grounded benchmark for evaluating tutor agents across professional pedagogical judgment, multi-turn tutoring, and autonomous teaching workflow execution. Evaluations of frontier models show that they still fall short of professional teaching standards in situated tutoring and workflow tasks.
Describes a training technique involving spike-aware pedagogy rewards, which penalize implausible jumps, and surprisal-gated imitation, in which the student learns easy tokens quickly and hard ones slowly.
The author discusses the lack of detailed, complete proofs in advanced mathematics textbooks, which creates unnecessary barriers for students and professionals, and advocates for the creation of more accessible accompaniment notes.
Presents research showing that iterative training of student-teacher neural networks produces interpretable teaching strategies, with the teacher learning to select or generate pedagogical examples that humans can understand and learn from effectively.