OR-Space is a benchmark for evaluating large language model agents on industrial operations research workflows. It focuses on multi-stage task lifecycles and persistent workspaces, going beyond simple text generation.
A technical guide introducing Agent Hooks, lifecycle hooks that add deterministic control points to agent workflows so that developers can enforce rules and run validations at key moments.