Tag
MIT CSAIL researchers introduce RLCR, a reinforcement learning method that uses Brier scores to train AI models to output calibrated confidence estimates, reducing overconfidence without sacrificing accuracy.
A PhD student at ICLR seeks practical tactics for overcoming social anxiety and joining existing conversation groups, rather than generic confidence advice.
Two ICLR 2026 papers show how small RL-trained agents outperform frontier models on machine-learning engineering (MLE) tasks and how MLE-Smith automatically scales MLE workloads.