private-benchmark

I made a small open-source benchmark runner for testing OpenClaw agents on my own real workflows

Reddit r/openclaw ↗ · yesterday

A developer shares an open-source benchmark runner they built for testing OpenClaw agents on their own real, messy workflows. The tool lets users define private evaluation cases, run agents in their actual workspace, and generate reports. The goal is a signal that is more relevant to the user's own work than public benchmarks provide.
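To make the define / run / report loop concrete, here is a minimal sketch in Python. It is not the runner's actual code or API: `EvalCase`, `run_cases`, `render_report`, and the stand-in `echo_agent` are all hypothetical names invented for illustration, and the agent is assumed to be any callable that takes a prompt string and returns a string.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    """One private evaluation case (hypothetical structure)."""
    name: str
    prompt: str
    check: Callable[[str], bool]  # returns True if the agent output is acceptable


@dataclass
class CaseResult:
    name: str
    passed: bool
    output: str


def run_cases(agent: Callable[[str], str], cases: List[EvalCase]) -> List[CaseResult]:
    """Run every case against the agent; an exception counts as a failure."""
    results = []
    for case in cases:
        try:
            output = agent(case.prompt)
            passed = case.check(output)
        except Exception as exc:
            output, passed = f"error: {exc}", False
        results.append(CaseResult(case.name, passed, output))
    return results


def render_report(results: List[CaseResult]) -> str:
    """Build a plain-text report: a summary line, then one line per case."""
    passed = sum(r.passed for r in results)
    lines = [f"{passed}/{len(results)} cases passed"]
    for r in results:
        lines.append(f"[{'PASS' if r.passed else 'FAIL'}] {r.name}")
    return "\n".join(lines)


def echo_agent(prompt: str) -> str:
    """Stand-in agent for the demo; a real run would call the agent under test."""
    return prompt.upper()


# Hypothetical demo cases; real cases would come from the user's own workflows.
DEMO_CASES = [
    EvalCase("shout", "rename the files", lambda out: out == "RENAME THE FILES"),
    EvalCase("invoice", "total the invoices", lambda out: "42" in out),
]

if __name__ == "__main__":
    print(render_report(run_cases(echo_agent, DEMO_CASES)))
```

Keeping each case as a prompt plus a check function means private cases can stay in a local file and never need to be published, which is the main point of running a private benchmark.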
