essay-scoring

Early-Token Confidence Predicts Reasoning Quality in Multi-Agent LLM Debate

arXiv cs.CL · yesterday

This paper investigates whether early-token confidence signals from LLM decoding can predict reasoning quality in multi-agent debate systems, finding that confidence in the first few generated tokens is the strongest predictor of rubric-based essay scores.

Towards Robust Argumentative Essay Understanding via TIDE: An Interactive Framework with Trial and Debate

arXiv cs.AI · 2026-05-19

This paper introduces TIDE, a framework that integrates trial and debate mechanisms to improve criteria-based prompt optimization for argumentative essay understanding tasks such as automated essay scoring, argument component detection, and argument relation identification. Experiments show performance improvements, suggesting that combining prompt-based methods can make argument analysis more robust.
