Tag
This paper presents a case study of using an LLM-based coding agent (Claude Code) to formalize Grothendieck's vanishing theorem in the Lean theorem prover. It finds that although agents can produce verified code, they struggle with definitions and API design, which underscores the need for expert review beyond mere compilation.