Tag
This paper identifies two failure modes in multi-objective prompt optimization for LLM judges using textual gradients: gradient dilution during optimization and instruction interference during inference, showing that joint gradient processing loses criterion-specific information.