This paper from eBay presents a modular two-agent simulation framework for evaluating conversational shopping assistant architectures, enabling controlled comparisons of responder designs. Two key findings: rolling-window memory is 35% faster than intent-extraction memory, and systematic failure analysis reduced failure rates by 62%.
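
The summary does not say how the rolling-window memory is implemented. As a rough illustration only, a rolling-window memory can be sketched as a fixed-size buffer that keeps the most recent turns and drops older ones. The class name `RollingWindowMemory`, the `max_turns` parameter, and the turn format are assumptions made for this sketch, not details from the paper.

```python
from collections import deque


class RollingWindowMemory:
    """Hypothetical sketch: keep only the last `max_turns` conversation turns."""

    def __init__(self, max_turns=4):
        # A deque with maxlen discards the oldest item when a new one is appended.
        self._turns = deque(maxlen=max_turns)

    def add(self, role, text):
        # Record one turn as a (role, text) pair.
        self._turns.append((role, text))

    def context(self):
        # Render the retained turns as plain text for the responder's prompt.
        return "\n".join(f"{role}: {text}" for role, text in self._turns)


memory = RollingWindowMemory(max_turns=2)
memory.add("user", "I need running shoes")
memory.add("assistant", "What size?")
memory.add("user", "Size 10")
print(memory.context())
```

With `max_turns=2`, the first user turn falls out of the window, so only the last two turns are passed on. An intent-extraction memory would instead summarize the conversation into structured intents at each turn, which typically costs extra processing.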