Tag
EvoOptiGraph is a framework for automating optimization modeling from natural language. It uses graph-based evolutionary generation to create diverse training data and co-evolves the model with weakness-driven reinforcement learning, achieving state-of-the-art results on multiple benchmarks.
This paper introduces AlgoEvolve, an LLM-driven evolutionary framework that generates and iteratively improves algorithmic trading strategies, with a meta-evolutionary outer loop that evolves the prompts guiding inner-loop synthesis.
The author asks about career implications of pursuing a PhD in evolutionary algorithms for the ML community, discussing whether it limits opportunities compared to a more ML-centric PhD.
APEX introduces a dynamic data selection strategy for automatic prompt optimization, stratifying datasets into easy, hard, and mixed tiers to improve data efficiency, achieving significant performance gains over initial prompts on multiple benchmarks.
Deliberate Evolution (DE) is an agentic framework that improves LLM-based symbolic regression by decoupling candidate generation from search control, using adaptive operators, structural diagnosis tools, and reflective memory to achieve better results with only 40% of the standard sample budget.
This paper explores three novel approaches for procedurally generating enemy morphologies (body plans and collision information) conditioned on player collision interactions, finding that all three outperform an evolutionary baseline adapted from robotics.
This paper studies compute allocation in LLM-guided evolutionary search, identifies empirical regularities, and proposes BaSE, a multi-armed bandit algorithm that improves mean fitness and reliability across multiple models and tasks.
This paper replicates the Picbreeder human-driven open-ended image evolution process using large vision-language models, analyzing differences and exploring factors like exploratory noise, behavioral diversity, and memory.
This paper analyzes an evolutionary mixture-of-LoRA architecture, decomposing it into router, evaluation, and lifecycle components. It finds that the router rewrite drives performance gains, while the evolutionary lifecycle acts as a net drag on the model's performance.
Metal-Sci introduces a 10-task benchmark for optimizing scientific computing kernels on Apple Silicon, paired with an evolutionary search framework driven by large language models. The study evaluates models like Claude Opus 4.7, Gemini 3.1 Pro, and GPT 5.5, demonstrating significant speedups while using out-of-distribution testing to catch silent performance regressions.
This paper introduces LIMEN, an LLM-guided evolutionary framework that automatically discovers reinforcement learning interfaces by jointly optimizing observation mappings and reward functions from raw simulator states. The approach reduces manual engineering effort and demonstrates that co-designing observations and rewards outperforms optimizing either component alone.
EvoTest introduces J-TTL, a benchmark for measuring agent test-time learning capabilities, and proposes an evolutionary framework where an Actor Agent plays games while an Evolver Agent iteratively improves the system's prompts, memory, and hyperparameters without fine-tuning. The method demonstrates superior performance compared to reflection and memory-based baselines on complex text-based games.
DeepMind announces AlphaEvolve, a Gemini-powered AI agent that combines large language models with automated evaluators to discover and optimize algorithms for mathematical and practical computing problems, improving efficiency in data centers, chip design, and AI training.
This paper demonstrates that large language models trained on code can significantly enhance genetic programming mutation operators, enabling the generation of hundreds of thousands of functional Python programs for robot design in the Sodarace domain without prior training data. The approach, called Evolution through Large Models (ELM), combines LLMs with MAP-Elites to bootstrap new conditional models for context-specific artifact generation.
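For readers unfamiliar with MAP-Elites, which the ELM summary above mentions, the following is a minimal sketch of the archive loop on a toy problem. It is not the ELM implementation: the function names (`map_elites`, `toy_evaluate`, `toy_mutate`, `toy_random`), the one-dimensional descriptor, and the Gaussian mutation are all assumptions made for illustration. In an ELM-style setup, the `mutate` step would instead be a call to a code-trained language model.

```python
import random


def map_elites(evaluate, mutate, random_solution,
               n_bins=10, iterations=2000, n_init=50, seed=0):
    """Toy MAP-Elites: keep the best solution found in each descriptor cell.

    evaluate(solution) must return (fitness, descriptor), descriptor in [0, 1].
    """
    rng = random.Random(seed)
    archive = {}  # cell index -> (fitness, solution)

    def try_insert(solution):
        fitness, descriptor = evaluate(solution)
        # Map the descriptor to a cell; clamp so descriptor == 1.0 stays in range.
        cell = min(int(descriptor * n_bins), n_bins - 1)
        # Replace the cell's elite only if the newcomer is strictly better.
        if cell not in archive or fitness > archive[cell][0]:
            archive[cell] = (fitness, solution)

    # Seed the archive with random solutions.
    for _ in range(n_init):
        try_insert(random_solution(rng))

    # Main loop: pick a random elite, mutate it, and try to place the child.
    for _ in range(iterations):
        _, parent = rng.choice(list(archive.values()))
        try_insert(mutate(parent, rng))

    return archive


# --- Hypothetical toy problem (assumption, for illustration only) ---

def toy_random(rng):
    return [rng.random(), rng.random()]


def toy_evaluate(solution):
    x, y = solution
    fitness = -(y - 0.5) ** 2   # best when y is 0.5
    descriptor = x              # behavior descriptor: the first coordinate
    return fitness, descriptor


def toy_mutate(solution, rng):
    # Gaussian perturbation, clipped back into [0, 1]; stands in for an LLM edit.
    return [min(1.0, max(0.0, v + rng.gauss(0.0, 0.1))) for v in solution]


if __name__ == "__main__":
    result = map_elites(toy_evaluate, toy_mutate, toy_random)
    for cell in sorted(result):
        fitness, solution = result[cell]
        print(cell, round(fitness, 4), [round(v, 3) for v in solution])
```

The archive keeps one elite per descriptor cell rather than a single best solution, which is what lets the search return a diverse set of high-quality candidates.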