agentic-framework

Tag

Cards List
#agentic-framework

@dlouapre: Meet physics-intern, our agentic framework for theoretical physics. It takes Gemini 3.1 Pro from 17.7% to 31.4% on Crit…

X AI KOLs Following · yesterday

Physics-intern is an agentic framework for theoretical physics that improves Gemini 3.1 Pro's performance on the CritPt benchmark from 17.7% to 31.4%, achieving a new state-of-the-art.

0 favorites 0 likes
#agentic-framework

AutoLLMResearch: Training Research Agents for Automating LLM Experiment Configuration -- Learning from Cheap, Optimizing Expensive

Hugging Face Daily Papers · yesterday Cached

This paper introduces AutoLLMResearch, an agentic framework that automates the configuration of expensive LLM experiments by learning from low-fidelity environments and extrapolating to high-cost settings. It aims to reduce computational waste and reliance on expert intuition in scalable LLM research.

0 favorites 0 likes
#agentic-framework

FoodCHA: Multi-Modal LLM Agent for Fine-Grained Food Analysis

arXiv cs.AI · 5d ago Cached

This paper introduces FoodCHA, a multi-modal LLM agent framework designed for fine-grained food analysis, addressing challenges in hierarchical consistency and attribute discrimination for dietary monitoring.

0 favorites 0 likes
#agentic-framework

Chat2Workflow: A Benchmark for Generating Executable Visual Workflows with Natural Language

Hugging Face Daily Papers · 2026-04-21 Cached

Chat2Workflow introduces a benchmark and agentic framework for generating executable visual workflows from natural language, showing current LLMs struggle with industrial-grade automation despite intent capture.

0 favorites 0 likes
#agentic-framework

Discover and Prove: An Open-source Agentic Framework for Hard Mode Automated Theorem Proving in Lean 4

arXiv cs.CL · 2026-04-20 Cached

This paper introduces Discover and Prove (DAP), an open-source agentic framework for automated theorem proving in Lean 4 that tackles 'Hard Mode' problems where the answer must be discovered independently before formal proof construction. The work releases new Hard Mode benchmark variants and achieves state-of-the-art results while revealing a significant gap between LLM answer accuracy (>80%) and formal prover success (<10%).

0 favorites 0 likes
#agentic-framework

MM-WebAgent: A Hierarchical Multimodal Web Agent for Webpage Generation

Hugging Face Daily Papers · 2026-04-16 Cached

MM-WebAgent is a hierarchical agentic framework that generates coherent and visually consistent webpages by coordinating AIGC-based element generation through joint optimization of layout and multimodal content. The paper introduces a benchmark and multi-level evaluation protocol, demonstrating improvements over code-generation and agent-based baselines.

0 favorites 0 likes
← Back to home

Submit Feedback