Tag
ProgramBench is a new benchmark that tests AI agents' ability to reconstruct a complete codebase from a compiled binary and its documentation. The leaderboard will open for submissions soon.