agent-harness

#agent-harness

@samhogan: https://x.com/samhogan/status/2055064462844219603

X AI KOLs Timeline · yesterday

HALO uses RLMs to optimize AI agent harnesses: it analyzes execution traces and suggests improvements, reporting gains of over 10% on several benchmarks, including Terminal-Bench and AppWorld.

@rajistics: Token costs are climbing. How do you avoid being locked into a single vendor's harness? Built a demo showing how @OpenH…

X AI KOLs Following · yesterday

A demo of OpenHands acting as a control plane across multiple agent harnesses, including Claude Code, Gemini CLI, and OpenHands itself, so that models or vendors can be swapped without rewriting the orchestration.

@dzhng: INTRODUCING: Duet Agent A new type of harness we're building at @duetchat Perfect for jobs that don't fit in one chat: …

X AI KOLs Following · 2d ago

Duetchat introduces Duet Agent, a new harness for running long-duration AI agent tasks, built around a state-machine relay, memory compaction, and a stateless runner for sandboxes.

built a CLI based agent harness for video editing

Reddit r/AI_Agents · 3d ago

Vex is an open-source CLI agent harness that lets users edit videos via natural language commands, automating tasks like silence removal, b-roll addition, and visual generation.

@geekbb: Auto-optimization tool for Agent harness. It takes over the heavy lifting of harness optimization: you provide a benchmark command and a target repository, and it automatically generates proposals, runs evaluations, records results, keeps the best, discards the rest, and automatically improves the agent's prompts, configurations, and source code. https…

X AI KOLs Timeline · 4d ago

autoharness is an agent harness optimization tool: given a benchmark command, it generates proposals and runs evaluations to improve an agent's prompts, configurations, and source code. It supports Codex and Claude.
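
The propose, evaluate, keep-the-best loop described in this card can be sketched roughly as below. This is a minimal illustration of the general idea only, not autoharness's actual code or API: `optimize`, `toy_propose`, `toy_evaluate`, and the `temperature` setting are all hypothetical names, and the toy scoring function stands in for running a real benchmark command.

```python
import random
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for whatever is being tuned (prompts, config, source).
Config = Dict[str, float]


def optimize(
    baseline: Config,
    propose: Callable[[Config], Config],
    evaluate: Callable[[Config], float],
    rounds: int = 20,
) -> Tuple[Config, float, List[Tuple[Config, float]]]:
    """Greedy search: keep a candidate only if it beats the best score so far."""
    best, best_score = baseline, evaluate(baseline)
    history = [(baseline, best_score)]  # record every result
    for _ in range(rounds):
        candidate = propose(best)        # generate a proposal from the current best
        score = evaluate(candidate)      # run the (stand-in) benchmark
        history.append((candidate, score))
        if score > best_score:           # keep the best, discard the rest
            best, best_score = candidate, score
    return best, best_score, history


# Toy stand-ins (assumptions): a real run would invoke the benchmark command.
rng = random.Random(0)


def toy_evaluate(cfg: Config) -> float:
    # Higher is better; the score peaks when temperature is 0.3.
    return -(cfg["temperature"] - 0.3) ** 2


def toy_propose(cfg: Config) -> Config:
    # Randomly perturb the current best setting.
    return {"temperature": cfg["temperature"] + rng.uniform(-0.2, 0.2)}


baseline = {"temperature": 1.0}
best, best_score, history = optimize(baseline, toy_propose, toy_evaluate)
print(len(history), best_score >= toy_evaluate(baseline))
```

Because a candidate replaces the current best only when it scores strictly higher, the returned score can never fall below the baseline, and the recorded history makes every discarded proposal auditable afterwards.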

@SaitoWu: https://x.com/SaitoWu/status/2053423773332947153

X AI KOLs Timeline · 5d ago

This article introduces Factory's Missions system, a multi-agent collaboration framework designed for long-running software engineering tasks. It uses structured verification and handoff mechanisms to counter the drift that traditional agents commonly show on such tasks.

ReFlect: An Effective Harness System for Complex Long-Horizon LLM Reasoning

arXiv cs.AI · 2026-05-08

This paper introduces ReFlect, a training-free harness system that wraps LLMs with deterministic error detection and recovery logic to improve performance on complex, long-horizon reasoning tasks.