Tag
The article proposes Implicit Variational Rejection Sampling (IVRS), which integrates implicit distributions with rejection sampling to improve posterior approximation in variational inference, and introduces the Implicit Resampling Evidence Lower Bound (IR-ELBO) as a tighter variational lower bound.
This paper proposes Multi-Stage In-Flight Rejection (MSIFR), a training-free framework that reduces token waste in LLM-based synthetic data generation by detecting and terminating low-quality generation trajectories at intermediate checkpoints. Across five models and seven benchmarks, MSIFR reduces token consumption by 11–77% as a standalone method and up to 78.2% when combined with early-exit methods, while preserving or improving accuracy.
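The idea of terminating a generation trajectory at an intermediate checkpoint can be illustrated with a minimal sketch. This is not MSIFR's actual implementation: the function `generate_with_checkpoints`, the scorer `toy_score`, the checkpoint positions, and the threshold are all hypothetical, and chunk counts stand in for token counts.

```python
from typing import Callable, Iterable, List, Optional, Sequence, Tuple


def generate_with_checkpoints(
    chunks: Iterable[str],
    score: Callable[[str], float],
    checkpoints: Sequence[int],
    threshold: float,
) -> Tuple[Optional[str], int]:
    """Consume chunks one at a time and score the partial text at each checkpoint.

    Returns (full text, chunks consumed), or (None, chunks consumed) if the
    trajectory was rejected before completion.
    """
    checkpoint_set = set(checkpoints)
    parts: List[str] = []
    for i, chunk in enumerate(chunks, start=1):
        parts.append(chunk)
        # Assumption: quality is judged on the partial text generated so far.
        if i in checkpoint_set and score("".join(parts)) < threshold:
            return None, i  # rejected in flight; later chunks are never requested
    return "".join(parts), len(parts)


def toy_score(text: str) -> float:
    # Hypothetical quality proxy: share of characters that are not '?'.
    return 1.0 - text.count("?") / max(len(text), 1)


good = ["ab", "cd", "ef", "gh"]
bad = ["ab", "??", "??", "gh"]

good_result = generate_with_checkpoints(iter(good), toy_score, checkpoints=[2, 3], threshold=0.6)
bad_result = generate_with_checkpoints(iter(bad), toy_score, checkpoints=[2, 3], threshold=0.6)
```

In this toy run the low-quality trajectory is dropped at the first checkpoint, so only two of its four chunks are ever consumed, while the good trajectory runs to completion. In a real setting the chunks would come from a streaming decoder and the scorer would be whatever intermediate quality signal the method defines.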
This paper introduces Entrocraft, a rejection-sampling method for RL that controls entropy schedules to prevent performance saturation in LLMs. It demonstrates improved generalization and training longevity, allowing smaller models to outperform larger baselines.