creative-software

DeskCraft: Benchmarking Desktop Agents on Professional Workflows and Human-in-the-Loop Collaboration

arXiv cs.AI · 2d ago

DeskCraft is a new benchmark for evaluating desktop GUI agents on long-horizon professional creative workflows, incorporating human-in-the-loop collaboration protocols. It tests agents on tasks requiring over 50 steps across design, video, audio, and 3D software.
