Tag
OR-Space is a benchmark for evaluating large language model (LLM) agents on industrial operations research workflows. It focuses on multi-stage task lifecycles and persistent workspaces rather than simple text generation.