theoretical-guarantees

Reinforcement Learning with Pairwise Preferences in Long-Term Decision Problems

arXiv cs.LG · 5d ago

This paper introduces the Markov decision contest, a new problem model for reinforcement learning with pairwise preferences. It proves optimality guarantees for stationary policies, shows that the model can be solved exactly in polynomial time, and presents a learning-efficient approximate algorithm.

Uncertainty Quantification for Large Language Diffusion Models

arXiv cs.CL · 2026-05-15

This paper presents the first systematic study of uncertainty quantification (UQ) for Large Language Diffusion Models (LLDMs). It proposes lightweight zero-shot uncertainty signals derived from the iterative denoising process and shows that LLDMs can achieve both fast inference and reliable hallucination detection, with up to 100x lower computational overhead than sampling-based baselines.

Population Risk Bounds for Kolmogorov-Arnold Networks Trained by DP-SGD with Correlated Noise

arXiv cs.LG · 2026-05-14

This paper establishes the first population risk bounds for Kolmogorov-Arnold Networks trained with mini-batch SGD and DP-SGD using correlated noise, advancing theoretical understanding of KANs in privacy-sensitive domains.
