Tag
This paper presents a five-arm ablation methodology for diagnosing which component of retrieval-warmed energy-based reasoning (RW-EBR) drives performance gains, applied to structured reasoning tasks such as graph reachability and Sudoku. The method separates the effects of class-prior bias, stochastic warm-starting, and graph-aligned value reuse.
This paper presents NebulaExp, a transparent ablation-driven post-training pipeline for 8B-scale LLMs, covering SFT, GRPO RL, and multi-teacher distillation. It identifies key trade-offs between mathematical reasoning and code generation, and demonstrates that data correctness filtering is the first-order optimization factor.
This paper theoretically and empirically examines adaptive patching for time-series Transformers, deriving conditions under which content-adaptive tokenization should outperform tuned uniform patching. Controlled experiments on standard benchmarks show that a well-tuned uniform baseline is competitive with dynamic patching methods, challenging the assumed benefit of adaptive approaches.
This research report details controlled experiments on building an external memory architecture that enables persistent AI identity independent of model weights. It finds that accumulated fragment history consistently dominates system prompts in shaping output across three topologies.
This paper analyzes 935 ablation experiments from 161 publications to show that AI architectural evolution follows the same statistical laws as biological evolution, including heavy-tailed fitness effect distributions and punctuated equilibria dynamics. The findings suggest that evolutionary statistical structure is substrate-independent, determined by fitness landscape topology rather than the mechanism of selection.