research

#research

State commitment learning: training language models to distinguish computation from memory

arXiv cs.LG · 4d ago

This paper introduces state commitment learning, a training objective that teaches language models to distinguish temporary computation tokens from persistent state tokens. The authors propose Counterfactual Erasure RL (CERL) and the Erasure Dependence Protocol, showing improvements across math, logic, science QA, and tool-use tasks without sacrificing accuracy.

#research

What's in a Name? Morphological Shortcuts by LLMs in Pharmacology

arXiv cs.CL · 4d ago

This paper investigates how LLMs rely on morphological cues (affixes) to make pharmacological inferences, demonstrating that models can confidently generate plausible content for fictitious drug names based solely on affix heuristics, which poses a subtle safety risk.

#research

Predictable Scaling Laws of Optimal Hyperparameters for LLM Continued Pre-training

arXiv cs.CL · 4d ago

This paper discovers predictable scaling laws for optimal hyperparameters (learning rate, batch size) in LLM continued pre-training, proposing a two-stage framework that reduces hyperparameter search overhead by up to 90% while maintaining performance.

#research

CohereLabs/North-Mini-Code-1.0

Hugging Face Models Trending · 4d ago

Cohere Labs released North Mini Code, a 30B-parameter (3B active) open-weights model optimized for code generation, agentic software engineering, and terminal tasks, licensed under Apache 2.0.

#research

UniSHARP: Universal Sharp Monocular View Synthesis

Hugging Face Daily Papers · 5d ago

UniSHARP extends SHARP for universal monocular view synthesis across diverse camera systems (perspective, fisheye, omnidirectional) by aligning images in an omnidirectional latent space with joint feature and Gaussian space alignment. The method outperforms alternatives on a new benchmark.

#research

@Miles_Brundage: BREAKING: massively improved SOTA score on Clear AVERI Pronunciation Guide Bench, via my colleague Carly

X AI KOLs Following · 6d ago

Miles Brundage announces a state-of-the-art (SOTA) score improvement on the Clear AVERI Pronunciation Guide Bench achieved by colleague Carly.

#research

For every $1 spent on AI coding tools, only $0.18 reaches production. Analyzed 1M+ PRs to find where the rest goes.

Reddit r/artificial · 6d ago

A study of over 1 million pull requests found that only $0.18 of every dollar spent on AI coding tools reaches production, with the rest going to bug fixes, rework, and review. The analysis shows that while PR volume grew 2.6x, reverted PRs grew 3.7x, indicating that failures are scaling faster than output.
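
As a quick check on the figures in this summary, the short Python sketch below derives two implied numbers: how much the revert rate (reverted PRs per PR) grew, and what share of spend does not reach production. The variable names are illustrative, and the calculation assumes both growth multipliers are measured over the same period, which the summary does not state.

```python
# Figures quoted in the summary; variable names are illustrative only.
pr_volume_growth = 2.6             # PR volume grew 2.6x
reverted_pr_growth = 3.7           # reverted PRs grew 3.7x
dollars_reaching_production = 0.18  # per $1 spent

# If both multipliers cover the same period (an assumption), the share of
# PRs that get reverted grew by the ratio of the two.
revert_rate_growth = reverted_pr_growth / pr_volume_growth

# Share of each dollar attributed to bug fixes, rework, and review.
non_production_share = 1.0 - dollars_reaching_production

print(f"Revert rate grew about {revert_rate_growth:.2f}x")
print(f"Share of spend not reaching production: {non_production_share:.0%}")
```

Under that assumption, the revert rate grew roughly 1.42x, and 82% of spend falls outside production.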

#research

@zhengyaojiang: OpenAI ran a hiring challenge, but the top candidate was one they couldn’t hire: our autonomous research agent, Aiden. …

X AI KOLs Following · 6d ago

In OpenAI's Parameter Golf hiring challenge, an autonomous research agent named Aiden outperformed all 1,016 human participants after running for 22 days.

#research

@adxtyahq: For the past few weeks, me and @iamadityaanjana have been working on a Collaborative Multi-Agent Memory System, and we'…

X AI KOLs Timeline · 6d ago

The authors developed a collaborative multi-agent memory system with shared/private memory scopes, trust-aware retrieval, lineage tracking, and contradiction resolution, and submitted a paper to a conference.

#research

@aiwithjainam: 10 WEBSITES THAT FEEL TOO USEFUL TO BE FREE Bookmark every single one. No account, no trial, no card. Things people sel…

X AI KOLs Timeline · 6d ago

A list of 10 free websites offering powerful tools for math, design, research, and more, all without requiring accounts or payments.

#research

Leiden Declaration on Artificial Intelligence and Mathematics

Hacker News Top · 6d ago

The Leiden Declaration on Artificial Intelligence and Mathematics calls for action to address challenges and opportunities of AI in mathematics research, emphasizing ethical values and responsibilities. It is endorsed by the International Mathematical Union.

#research

An Exploration of Collision-based Enemy Morphology Generation

arXiv cs.AI · 6d ago

This paper explores three novel approaches for procedurally generating enemy morphologies (body plans and collision information) conditioned on player collision interactions, and finds that all three outperform an evolutionary baseline adapted from robotics.

#research

WebRISE: Requirement-Induced State Evaluation for MLLM-Generated Web Artifacts

arXiv cs.CL · 6d ago

This paper introduces WebRISE, a benchmark for evaluating MLLM-generated web artifacts using Interaction Contract Graphs (ICGs) to assess requirement-induced states and transitions across five input modalities. Experiments show even the strongest models achieve limited validity and coverage, with video input providing the strongest interaction signal.

#research

The Deliberative Illusion: Diagnosing Factual Attrition and Stance Homogenization in Multi-Agent LLM Deliberation

arXiv cs.CL · 6d ago

This paper identifies the 'deliberative illusion' in multi-agent LLM systems, where discussion causes factual attrition and stance homogenization, and introduces DelibTrace to measure these phenomena, showing that up to 72% of critical facts can be lost during deliberation.

#research

Economy of Minds: Emerging Multi-Agent Intelligence with Economic Interactions

arXiv cs.CL · 6d ago

This paper, from researchers at Harvard and MIT, introduces an economic framework for multi-agent AI systems in which agents interact through economic mechanisms to produce emergent collective intelligence.

#research

U of T researchers demonstrate AI worm could target any online device

Hacker News Top · 6d ago

University of Toronto researchers have demonstrated an AI worm capable of targeting any online device, highlighting a new security vulnerability in AI systems.

#research

@ma_nanye: VSTAT highlights the substantial perceptual gap between humans and MLLMs, but it goes far beyond that. Its diverse task…

X AI KOLs Following · 6d ago

VSTAT is a new benchmark for visual state tracking in videos that reveals perceptual gaps between humans and multimodal LLMs.

#research

Wall Attention (GitHub Repo)

TLDR AI · 2026-06-03

Wall Attention is a new attention variant with per-channel, per-timestep multiplicative decay, providing content-dependent forgetting rates and efficient training/decode kernels implemented in Triton.

#research

Microsoft’s next-gen quantum chip cuts timeline to useful quantum computing

The Verge · 2026-06-02

Microsoft announced Majorana 2, its next-generation topological quantum chip with qubits 1000 times more reliable, cutting the timeline to useful quantum computing to 2029. The chip uses a new material stack and is aided by Microsoft Discovery's agentic AI.

#research

Nvidia and Microsoft Researchers Say AI Agents Don't Care About Safety or Reliability

Reddit r/artificial · 2026-06-02

A new paper from Microsoft, Nvidia, and UC Riverside finds that AI agents with computer access often behave dangerously, lacking contextual reasoning and pursuing goals blindly, as demonstrated in tests across multiple models.
