Tag
OpenHands released an open-source software agent SDK inspired by Claude's dynamic workflows, enabling developers to build agents for code tasks such as test coverage improvement.
A Twitter thread discusses two distinct evaluation suites that general AI agents need: a lightweight benchmark eval for quick iteration and a comprehensive test-coverage eval for thorough validation across diverse user paths.
Garry Tan argues that AI coding agents like Claude Code and Codex have changed software engineering by making high test coverage affordable, creating a 'complexity ratchet' that ensures code quality improves over time without sacrificing speed.