A new benchmark paper, "SWE Context Bench", tests whether coding agents can reuse knowledge across tasks. It highlights a gap in existing benchmarks, which evaluate only isolated problem-solving. The author discusses solutions such as external memory and mentions the tools langmem, mem0, supermemory, and Greplica.