Tag
This paper proposes a distribution-aware training approach for next-event prediction in concurrent Go programs that treats scheduler nondeterminism as a signal. A 7B model fine-tuned on fewer than a thousand traces achieves 36.2% accuracy on production bugs, outperforming zero-shot Gemini 3.5 Flash.
This tweet discusses training models with 'implementation noise' to improve robustness to floating-point numerical issues caused by nondeterminism and non-associativity.
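For readers unfamiliar with the numerics issue mentioned in the tweet summary, the following is a minimal sketch of why floating-point addition is non-associative and how summation order changes a result. The helper names `sum_in_order` and `noisy_sum` are hypothetical illustrations, and shuffling the summation order is only one assumed way to mimic 'implementation noise'; it is not taken from the tweet.

```python
import random


def sum_in_order(values, order):
    """Add values one at a time, visiting indices in the given order."""
    total = 0.0
    for i in order:
        total += values[i]
    return total


def noisy_sum(values, rng):
    """Hypothetical 'implementation noise': sum in a random order."""
    order = list(range(len(values)))
    rng.shuffle(order)
    return sum_in_order(values, order)


# Regrouping the same three numbers changes the rounded result.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

# A large magnitude absorbs a small addend, so order matters even more here.
values = [1e16, 1.0, -1e16, 1.0]
forward = sum_in_order(values, [0, 1, 2, 3])    # the first 1.0 is lost to rounding
regrouped = sum_in_order(values, [0, 2, 1, 3])  # large terms cancel first

# Simulate a reduction whose order is not fixed, as in parallel kernels.
noisy = noisy_sum(values, random.Random(0))

print(left == right, forward, regrouped, noisy)
```

A parallel reduction that does not fix its accumulation order behaves like `noisy_sum`: the same inputs can yield different outputs from run to run.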