private-benchmark


I made a small open-source benchmark runner for testing OpenClaw agents on my own real workflows

Reddit r/openclaw · yesterday

A developer shares a personal open-source benchmark runner for testing OpenClaw agents on real, messy workflows. The tool lets users define private evaluation cases, run agents in their actual workspace, and generate reports, with the goal of giving a more relevant signal than public benchmarks.
