prompt-optimization

#prompt-optimization

SAGE: Stochastic Prompt Optimization via Agent-Guided Exploration

arXiv cs.CL · 2026-06-18

Introduces SPO, a stochastic search framework for automatic prompt optimization, with three strategies including SAGE, an agent-guided multi-agent pipeline. Evaluated on benchmarks and deployed on a mental-health chatbot, showing improvements in retention through continuous optimization.

@NFTCPS: Microsoft came up with something called SkillOpt, and its approach is pretty wild: treating an agent's skill documentation like a neural network for training, with epochs, batches, learning rates, and validation sets, but without touching a single model weight. What makes it great? Let me break it down into three points: Training only modifies one skill document, and any new changes must be validated on the...

X AI KOLs Timeline · 2026-06-17

Microsoft introduces SkillOpt, a method that trains an agent's skill documentation like a neural network, using epochs, batches, learning rates, and validation sets for optimization, without modifying model weights. It achieves top results across multiple benchmarks and can be transferred across models and tools.

@denziideng: Just discovered: So that's how prompts should be written! Turn your prompts from "casual scribbles" into professional, reusable assets. Every time you use AI to write, generate images, or do analysis and just toss in a prompt, the results are hit or miss. So annoying... Now with this tool, you can one-click optimize prompts, auto-test, compare, iterate, and permanently save them as reusable...

X AI KOLs Timeline · 2026-06-17

Introducing Prompt Optimizer, an open-source tool that helps users optimize, test, and reuse prompts. It supports multi-platform deployment and transforms prompts from one-time use into assets that can be called repeatedly.

FAPO: Fully Autonomous Prompt Optimization of Multi-Step LLM Pipelines

Hugging Face Daily Papers · 2026-06-17

FAPO is a framework for fully autonomous prompt optimization of multi-step LLM pipelines, combining prompt editing and structural changes. It outperforms the GEPA baseline in 15 of 18 comparisons, with gains up to +33.8 pp on security tasks.

Graph-based Target Back-Propagation for Context Adaptation in Multi-LLM Agentic Systems

arXiv cs.LG · 2026-06-15

The paper proposes GTBP, a graph-based back-propagation framework for context adaptation in multi-LLM agentic systems, which improves prompt optimization with theoretical convergence guarantees and outperforms existing methods on benchmarks.

APEX: Automated Prompt Engineering eXpert with Dynamic Data Selection

arXiv cs.CL · 2026-06-11

APEX introduces a dynamic data selection strategy for automatic prompt optimization, stratifying datasets into easy, hard, and mixed tiers to improve data efficiency, achieving significant performance gains over initial prompts on multiple benchmarks.
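
To make the tiering idea concrete, here is a minimal sketch of stratifying a dataset by how often the current prompt answers each example correctly. The summary does not say how APEX defines its tiers, so the pass-rate criterion, the thresholds `easy_min` and `hard_max`, and the function name `stratify` are all assumptions made for illustration.

```python
def stratify(pass_rates: dict[str, float],
             easy_min: float = 0.8,
             hard_max: float = 0.2) -> dict[str, list[str]]:
    """Split example ids into easy / mixed / hard tiers.

    pass_rates maps an example id to the fraction of sampled runs in which
    the current prompt answered it correctly. The thresholds are
    illustrative assumptions, not values from the APEX paper.
    """
    tiers: dict[str, list[str]] = {"easy": [], "mixed": [], "hard": []}
    for example_id, rate in pass_rates.items():
        if rate >= easy_min:
            tiers["easy"].append(example_id)
        elif rate <= hard_max:
            tiers["hard"].append(example_id)
        else:
            tiers["mixed"].append(example_id)
    return tiers


# Hypothetical pass rates for four examples.
pass_rates = {"q1": 1.0, "q2": 0.5, "q3": 0.0, "q4": 0.8}
tiers = stratify(pass_rates)
```

With tiers like these, an optimizer could spend most of its evaluation budget on the mixed examples, where a prompt edit is most likely to change the outcome; whether APEX allocates its budget this way is not stated in the summary.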

Levi: Run AlphaEvolve on your local QWEN 30B

Reddit r/LocalLLaMA · 2026-06-08

LEVI is an open-source AlphaEvolve-like system that runs locally on Qwen3-30B, offering code and prompt optimization with up to 35x cost reduction and better performance than existing frameworks.

RECAP: Regression Evaluation for Continual Adaptation of Prompts

arXiv cs.LG · 2026-06-08

Introduces RECAP, a benchmark for evaluating continual learning of prompts under evolving constraints in a proactive adaptation setting. Results show that existing prompt optimization methods fail in this setting, highlighting the need for new methods.

CRAFT: Cost-aware Refinement And Front-aware Tuning of Prompts

arXiv cs.CL · 2026-06-04

CRAFT is a Pareto-front prompt optimizer that jointly optimizes accuracy and token cost. It avoids the 'scalarization collapse' of weighted-sum approaches by using NSGA-II and budget-aware validation to maintain a diverse population of prompts along the accuracy-cost frontier.
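
As a rough illustration of what an accuracy-cost Pareto front means, the sketch below filters a set of candidate prompts down to the non-dominated ones. It is not CRAFT's implementation and not a full NSGA-II (there is no ranking into successive fronts, crowding distance, or mutation); the `Candidate` type and the sample numbers are hypothetical.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    prompt: str
    accuracy: float  # higher is better
    tokens: int      # lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.accuracy >= b.accuracy and a.tokens <= b.tokens
    strictly_better = a.accuracy > b.accuracy or a.tokens < b.tokens
    return no_worse and strictly_better


def pareto_front(cands: list[Candidate]) -> list[Candidate]:
    """Keep every candidate that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


# Hypothetical candidates: (name, validation accuracy, prompt length in tokens).
candidates = [
    Candidate("short", 0.70, 40),
    Candidate("medium", 0.80, 120),
    Candidate("long", 0.82, 400),
    Candidate("bloated", 0.78, 500),
]
front = pareto_front(candidates)
front_names = [c.prompt for c in front]
```

Here "bloated" drops out because "medium" is both more accurate and cheaper, while the other three each represent a different trade-off. A weighted sum of the two objectives would instead pick a single winner for a given weight, which is the contrast the summary draws.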

SePO: Self-Evolving Prompt Agent for System Prompt Optimization

arXiv cs.CL · 2026-06-04

SePO (Self-Evolving Prompt Optimization) proposes a self-referential prompt agent that uses evolutionary search to optimize both the task agents' system prompts and its own system prompt. It outperforms Manual-CoT, TextGrad, and MetaSPO across five benchmarks, including AIME'25, ARC-AGI-1, and GPQA.

From Demonstrations to Rewards: Test-Time Prompt Optimization for VLM Reward Models

arXiv cs.LG · 2026-06-02

Proposes Demo2Reward, a test-time prompt optimization technique for VLM reward models using a few expert demonstrations, significantly reducing false positives and improving policy learning in robotics without additional model training.

Learnable Assessment Skills for LLM-based Automated Scoring: Rubric Construction via Iterative Optimization

arXiv cs.CL · 2026-05-29

This paper proposes learning assessment skills for LLMs to automate rubric construction for scoring tasks, achieving performance comparable to expert-written rubrics without requiring human-written examples.

Structured Prompt Optimization Meets Reinforcement Learning for Global and Local Interpretability over Complex Text

arXiv cs.CL · 2026-05-29

Introduces eXTC, a text classifier with three progressive stages: structured prompt optimization to learn a natural-language rulebook, reasoning distillation into a compact LM, and reinforcement learning to expand reasoning, achieving strong performance and interpretability.

Why Prompt Optimization Works, and Why It Sometimes Doesn't: A Causal-Inspired Edit-Level Analysis

arXiv cs.CL · 2026-05-27

This paper conducts a causal-inspired, edit-level analysis of automated prompt optimization across frameworks, LLMs, and tasks. It finds that specific edit types (e.g., complexity-increasing, meta-instructional) have systematically positive or negative effects depending on task characteristics, which helps explain generalization failures.

SPEAR: Code-Augmented Agentic Prompt Optimization

arXiv cs.CL · 2026-05-27

SPEAR is a code-augmented agentic prompt optimizer that uses a Python sandbox for structural error analysis, achieving state-of-the-art performance on multiple LLM evaluation suites including industrial judge tasks, BBH, and GSM8K.

@omarsar0: New research from Microsoft Research. I see a lot of AI engineers handwriting agent skill docs and hope they generalize.…

X AI KOLs Following · 2026-05-25

Microsoft Research introduces SkillOpt, a method that treats agent skill documents as trainable external state, using an optimizer model to make bounded edits validated by a held-out set. The approach achieves best or tied results across 52 evaluation cells and improves accuracy by over 23 points on GPT-5.5, with zero extra inference cost and transferable skills.
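
The "bounded edits validated by a held-out set" idea can be sketched as a simple accept-if-better loop over a skill document. This is only an illustration under stated assumptions, not SkillOpt's algorithm: `propose` stands in for the optimizer model, `score` for the held-out evaluation, and the length-based edit bound and keyword-based scoring are toy placeholders.

```python
from typing import Callable


def gated_update(doc: str,
                 propose: Callable[[str], str],
                 score: Callable[[str], float],
                 max_added_chars: int,
                 steps: int) -> str:
    """Keep a proposed edit only if it is small and improves the held-out score."""
    best, best_score = doc, score(doc)
    for _ in range(steps):
        candidate = propose(best)
        # Crude stand-in for a bounded edit: reject large changes in length.
        if abs(len(candidate) - len(best)) > max_added_chars:
            continue
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best


# Toy stand-ins (hypothetical): the "optimizer" appends one rule per step, and
# the "held-out set" checks whether the document mentions two keywords.
rules = iter(["Always cite sources.", "Use tabs.", "Check units."])
held_out_keywords = ["cite", "units"]


def propose(doc: str) -> str:
    return doc + "\n" + next(rules)


def score(doc: str) -> float:
    return sum(k in doc for k in held_out_keywords) / len(held_out_keywords)


final_doc = gated_update("Skill: answer questions.", propose, score,
                         max_added_chars=40, steps=3)
```

In this toy run the "Use tabs." edit is discarded because it does not raise the held-out score, while the other two rules are kept. No model weights are involved at any point; only the document changes.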

When Gradients Collide: Failure Modes of Multi-Objective Prompt Optimization for LLM Judges

Hugging Face Daily Papers · 2026-05-25

This paper identifies two failure modes in multi-objective prompt optimization for LLM judges using textual gradients: gradient dilution during optimization and instruction interference during inference, showing that joint gradient processing loses criterion-specific information.

Reflective Prompt Tuning through Language Model Function-Calling

arXiv cs.CL · 2026-05-22

Introduces Reflective Prompt Tuning (RPT), a framework that uses LLM function-calling to iteratively diagnose and revise prompts based on systematic error patterns, improving reasoning task performance and calibration.

Solving the Credit Assignment Problem in Multi-Agent Systems (CANTANTE Framework)

Reddit r/AI_Agents · 2026-05-20

CANTANTE is an open-source framework that solves the credit assignment problem in multi-agent systems by converting system-level rewards into per-agent update signals, outperforming DSPy-based baselines on coding and math reasoning benchmarks.

CANTANTE: Optimizing Agentic Systems via Contrastive Credit Attribution [R]

Reddit r/MachineLearning · 2026-05-20

CANTANTE introduces a contrastive credit attribution method to optimize multi-agent LLM systems by decomposing global rewards into per-agent signals, enabling automated prompt tuning. It outperforms baselines on programming, math, and retrieval benchmarks, achieving up to +18.9 points improvement without increased inference cost.
