CANTANTE: Optimizing Agentic Systems via Contrastive Credit Attribution [R]
Summary
CANTANTE introduces a contrastive credit attribution method for optimizing multi-agent LLM systems: it decomposes the global reward into per-agent signals, which enables automated prompt tuning. It outperforms baselines on programming, math, and retrieval benchmarks, improving scores by up to 18.9 points without increasing inference cost.
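The summary does not spell out how the global reward is split across agents. As a rough illustration only, and not the paper's actual procedure, one generic way to obtain a contrastive per-agent signal is to compare the mean system-level reward of runs where an agent used a candidate prompt against runs where it used its baseline prompt. The function name `contrastive_credit`, the variant labels `"base"` and `"cand"`, the agent names, and the toy rewards below are all assumptions made for this sketch.

```python
from collections import defaultdict
from statistics import mean


def contrastive_credit(rollouts, agents):
    """Hypothetical per-agent credit from system-level rewards.

    rollouts: list of (choices, reward) pairs, where choices maps each
    agent name to the prompt variant it used ("base" or "cand") and
    reward is the single global score for that run.
    Returns, per agent, mean reward with "cand" minus mean reward with "base".
    """
    credit = {}
    for agent in agents:
        by_variant = defaultdict(list)
        for choices, reward in rollouts:
            by_variant[choices[agent]].append(reward)
        if by_variant["cand"] and by_variant["base"]:
            credit[agent] = mean(by_variant["cand"]) - mean(by_variant["base"])
        else:
            # No contrast is available if one variant was never run.
            credit[agent] = 0.0
    return credit


# Toy data (invented): the global reward tracks the planner's prompt only.
rollouts = [
    ({"planner": "cand", "coder": "base"}, 1.0),
    ({"planner": "base", "coder": "base"}, 0.0),
    ({"planner": "cand", "coder": "cand"}, 1.0),
    ({"planner": "base", "coder": "cand"}, 0.0),
]
credit = contrastive_credit(rollouts, ["planner", "coder"])
print(credit)
```

In this toy setup the planner's candidate prompt receives positive credit while the coder's receives none, so a prompt tuner driven by these signals would keep the planner change and ignore the coder change. Because the comparison happens offline over logged runs, it adds nothing to inference-time cost in this sketch.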
Similar Articles
Solving the Credit Assignment Problem in Multi-Agent Systems (CANTANTE Framework)
CANTANTE is an open-source framework that solves the credit assignment problem in multi-agent systems by converting system-level rewards into per-agent update signals, outperforming DSPy-based baselines on coding and math reasoning benchmarks.
Reducing Credit Assignment Variance via Counterfactual Reasoning Paths
Introduces Implicit Behavior Policy Optimization (IBPO), a credit assignment framework based on counterfactual comparison. By converting sparse terminal rewards into step-sensitive learning signals, it improves the training stability and performance of large language models on multi-step reasoning tasks.
Contrastive Attribution in the Wild: An Interpretability Analysis of LLM Failures on Realistic Benchmarks
Researchers apply contrastive LRP-based attribution to analyze why LLMs fail on realistic benchmarks, finding the method gives useful signals in some cases but is not universally reliable.
@NousResearch: Today we release Contrastive Neuron Attribution (CNA), a method for steering LLM behavior by identifying and ablating s…
NousResearch releases Contrastive Neuron Attribution (CNA), a method to steer LLM behavior by ablating sparse MLP circuits without training autoencoders or degrading benchmarks, validated on refusal circuits across models up to 70B parameters.
Targeted Neuron Modulation via Contrastive Pair Search
Contrastive neuron attribution (CNA) identifies a sparse set of MLP neurons that distinguish harmful from benign prompts, enabling effective behavioral steering in instruction-tuned LLMs without degrading output quality. The method reduces refusal rates by over 50% on jailbreak benchmarks while preserving fluency.