Trending stories ranked by heat, importance and recency.
Introduces the Member vs Generated Inference (MGI) task to distinguish training members from generated outputs in generative models, and proposes Data Circuit Breaker (DCB), a three-stage method combining autoencoder and latent generator signals, which outperforms existing methods across autoregressive and diffusion models.
This paper presents exact dimensionality reductions for non-smooth NML estimation, using the Schur complement and Sylvester's determinant identity to cut per-step computational complexity from O(N^3) to O(k^3 + N^2 k), achieving over 14,000x speedup while maintaining numerical precision.
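For reference, the two identities named in the summary above are given here in their standard textbook form. The symbols A, B, C, D, U and V are generic placeholders rather than the paper's notation, and how the paper applies them to NML estimation is not shown here.

```latex
% Schur complement determinant formula (A square and invertible):
\det\begin{pmatrix} A & B \\ C & D \end{pmatrix}
  = \det(A)\,\det\!\left(D - C A^{-1} B\right)

% Sylvester's determinant identity, with U \in \mathbb{R}^{N \times k}
% and V \in \mathbb{R}^{k \times N}:
\det\!\left(I_N + U V\right) = \det\!\left(I_k + V U\right)
```

When k is much smaller than N, the right-hand side of the second identity needs only a k x k determinant, which costs O(k^3) instead of O(N^3). The N^2 k term in the summary presumably comes from products between an N x N matrix and an N x k matrix, though the summary does not say so.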
AdversaBench introduces an automated LLM red-teaming pipeline that uses five mutation operators and a three-judge panel with a meta-judge tiebreaker to confirm failures, revealing that attack difficulty varies by category and that adversarial prompts transfer from smaller to larger models.
This paper presents RaDaR, a 32B open-source reasoning LLM trained on public and synthetic rare disease cases, which outperforms larger models like DeepSeek-R1 in diagnosis benchmarks and improves physician accuracy by 21.44 percentage points in a randomized trial.
ReM-MoA introduces a memory-augmented Mixture-of-Agents framework that sustains scaling through ranked reasoning memory and curated diversified memory routing, outperforming prior MoA variants across five reasoning benchmarks.
The paper introduces CALIBER, a method for calibrating confidence in reasoning language models by eliciting confidence estimates both before and after reasoning, with supervision targets matched to the information state. It achieves significant reductions in Expected Calibration Error (up to 52.5%) and strong Brier scores and AUROC across multiple benchmarks.
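As a reminder of what the metrics in the summary above measure, here is a minimal sketch of Expected Calibration Error and the Brier score for binary correctness labels. It assumes equal-width confidence bins and is not CALIBER's evaluation code; the function names and the choice of `n_bins=10` are illustrative.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width bins: weighted mean of |accuracy - confidence| per bin."""
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 goes into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, hit in members if hit) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


def brier_score(confidences, correct):
    """Mean squared gap between stated confidence and the 0/1 outcome."""
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((conf - float(hit)) ** 2
               for conf, hit in zip(confidences, correct)) / len(confidences)


# Four answers, each stated at 90% confidence, three of them right:
# accuracy is 0.75, so the model is overconfident by about 0.15.
confs = [0.9, 0.9, 0.9, 0.9]
hits = [1, 1, 1, 0]
ece = expected_calibration_error(confs, hits)
brier = brier_score(confs, hits)
```

Lower is better for both metrics. A relative ECE reduction such as the one quoted in the summary compares this quantity before and after calibration training.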
This paper proposes monitoring LLM misalignment by decomposing it into fine-grained cognitive processes (misalignment indicators) and detecting them via linear probes on internal activations, achieving high AUROC on out-of-distribution transcripts.
MMed-Bench-IR is a heterogeneous benchmark for multilingual medical information retrieval across six languages, evaluating cross-lingual alignment, concept discrimination, and evidence retrieval. It reveals severe performance drops for non-English queries, highlighting gaps in existing English-only evaluations.
This paper reveals that diffusion models and flow matching are two sides of the same Wasserstein geometry: diffusion follows a free-energy gradient flow (initial-value problem), while flow matching follows a Wasserstein geodesic (boundary-value problem), and they are unified through the JKO scheme.
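The objects named in the summary above have well-known standard forms, sketched here for orientation. The notation (free energy F, potential V, step size \tau, velocity field v_t) is the conventional one and is an assumption, not necessarily the paper's.

```latex
% JKO scheme: one implicit step of the Wasserstein gradient flow of F.
\rho_{k+1} = \arg\min_{\rho}
  \left\{ F(\rho) + \frac{1}{2\tau}\, W_2^2(\rho, \rho_k) \right\},
\qquad
F(\rho) = \int V \rho \,dx + \int \rho \log \rho \,dx

% As \tau \to 0 the iterates follow the Fokker-Planck equation
% (an initial-value problem started from \rho_0):
\partial_t \rho = \nabla \cdot (\rho \nabla V) + \Delta \rho

% Benamou-Brenier form of the Wasserstein distance: a geodesic between
% fixed endpoints \rho_0 and \rho_1 (a boundary-value problem):
W_2^2(\rho_0, \rho_1) = \min_{(\rho_t, v_t)}
  \int_0^1 \!\! \int |v_t(x)|^2 \rho_t(x) \,dx \,dt
\quad \text{s.t.} \quad
\partial_t \rho_t + \nabla \cdot (\rho_t v_t) = 0
```

Both problems are posed in the same W_2 metric, which is the shared geometry the summary refers to.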
This paper from OpenAI investigates whether reinforcement learning on beneficial behavior can produce broad and persistent alignment generalization beyond the training distribution. Using a dataset of realistic situations, the authors show that RL training on beneficial traits improves out-of-distribution alignment and makes it more persistent under adversarial attacks.
This paper proposes Self-Recognition Finetuning as an intervention to prevent and reverse emergent misalignment in LLMs, showing that it stabilizes the model's aligned character so that the model does not adopt a misaligned persona.
A Japanese animator is using Seedance, a tool that renders anime from simple 3D models, to showcase AI-assisted animation techniques.
This article reviews the architectural layers of AI agent memory as of mid-2026: rule files, persistent profiles, historical recall, and evidence chains. It explains how each layer is stored, when it is loaded, and how it is governed, and argues that memory is what lets agents compound their work across sessions.
Discussion about scoping permissions for AI agents in production to avoid dangerous database actions, suggesting read-only mirrors, approval steps, or hard walls between suggestion and execution.
Qwen-AgentWorld releases an open 35B total / 3B active MoE world model with 256K context, along with a 7-domain benchmark, achieving state-of-the-art performance on AgentWorldBench.
Unit 42 discovered five malicious AI agent skills that evaded detection by ClawScan and VirusTotal, including referral-hijacking, crypto wallet draining, and a dropper hidden via size padding, demonstrating that signature scanning is ineffective against instruction-based threats.
Mel AI is evolving AI characters from text-based interactions to real-time video chat, with lip sync, facial expressions, and camera context awareness, following the success of Character AI.
PSI is building a vertically integrated "factory" for physical superintelligence, aiming to use artificial superintelligence to accelerate physics breakthroughs, and has open-sourced an AI copilot for physicists called Get Physics Done (GPD).
The author argues that successful AI agent products require a robust permission system with read-only, draft, approval, limited execution, and audit layers, prioritizing safety over apparent magic.
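The layering described above can be sketched as a small gate that every agent action passes through. This is a hypothetical illustration, not the author's design: the `Layer` and `Gate` names, the example actions, and the `spend_cap` limit are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from enum import Enum


class Layer(Enum):
    READ_ONLY = "read_only"                  # no side effects
    DRAFT = "draft"                          # produces a proposal only
    APPROVAL = "approval"                    # needs a human sign-off
    LIMITED_EXECUTION = "limited_execution"  # runs alone, within a cap


@dataclass
class Gate:
    policy: dict                 # action name -> Layer (hypothetical mapping)
    spend_cap: float = 50.0      # illustrative limit for limited execution
    audit_log: list = field(default_factory=list)

    def check(self, action, amount=0.0, approved_by=None):
        layer = self.policy.get(action)
        if layer is None:
            allowed, reason = False, "unknown action"
        elif layer in (Layer.READ_ONLY, Layer.DRAFT):
            allowed, reason = True, layer.value
        elif layer is Layer.APPROVAL:
            allowed = approved_by is not None
            reason = "approved" if allowed else "needs approval"
        else:  # Layer.LIMITED_EXECUTION
            allowed = amount <= self.spend_cap
            reason = "within cap" if allowed else "over cap"
        # Audit layer: every decision is recorded, allowed or not.
        self.audit_log.append((action, allowed, reason, approved_by))
        return allowed


gate = Gate(policy={
    "read_orders": Layer.READ_ONLY,
    "draft_reply": Layer.DRAFT,
    "send_reply": Layer.APPROVAL,
    "issue_refund": Layer.LIMITED_EXECUTION,
})

results = [
    gate.check("read_orders"),
    gate.check("send_reply"),
    gate.check("send_reply", approved_by="alice"),
    gate.check("issue_refund", amount=20.0),
    gate.check("issue_refund", amount=500.0),
    gate.check("drop_table"),
]
```

Unknown actions are denied by default, so adding a new capability requires an explicit policy entry rather than an accidental grant.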
This paper develops a Fourier analysis framework to study data augmentation under group invariances, showing that partial augmentation can achieve the same minimax rates as full augmentation up to a vanishing approximation error, while also proving that exact invariance requires full group averaging.