This paper introduces Constrained Diffusion for Code (CDC), a training-free neurosymbolic inference framework that integrates constraint satisfaction directly into the reverse denoising process of discrete diffusion models for code generation. Across benchmarks, CDC consistently improves satisfaction of functional-correctness, security, and syntax constraints, outperforming existing diffusion and autoregressive baselines.
LaMR is a structured pruning framework for coding agents. It decomposes code relevance into two dimensions, semantic evidence and dependency support, models each with a dedicated CRF, and combines them through a mixture-of-experts gate, reducing token usage by up to 31% while maintaining or improving task performance.
The article argues that Tree-sitter is unsuitable for deep program analysis because it discards critical tokens such as operators and keywords. It advocates the Cubix framework as a more robust alternative for building semantic analysis and refactoring tools.
Practitioner Rory Sawyer reflects on a decade of applying program analysis to bridge the gap between code and human intent, emphasizing static analysis as a communication tool for correctness beyond execution.