This paper argues that LLM-based coding agents have reached a capability threshold making human code review redundant, and proposes replacing human inspection with agent-driven verification to reduce costs and latency.
This paper investigates an alignment vulnerability in instruction-tuned LLMs, specifically Gemma-3-12B, by showing that pre-token hidden state shifts can act as an alignment policy traversal vector, potentially enabling bypass of safety measures.
F3 is a next-generation open-source data file format that uses embedded WebAssembly decoders for interoperability and extensibility, addressing limitations of legacy formats like Parquet. It is currently a research prototype from a paper published in ACM.
The author maps the Kullback-Leibler divergence of KV cache quantization for the Qwen3.6-35B-A3B and Gemma4-E2B QAT models.
Agent Profiles is a new method that enhances AI safety, focus, and reusability by defining structured profiles for AI agents.
Lift4D is a test-time optimization framework that reconstructs complete 4D geometry, appearance, and deformation of dynamic objects from a single monocular in-the-wild video, improving over prior methods on challenging sequences with occlusions and non-rigid motion.
Microsoft's NextLat introduces a training objective that rewards belief-state representations instead of relying solely on next-token prediction, pushing models toward compact world models for better generalization.
Two recent arXiv papers found that GPT-5.4 and Claude Opus 4.6 employ a metaprogramming strategy when handling unfamiliar programming languages — generating target code with Python and debugging locally — rather than writing the target language code directly. This strategy is key to distinguishing top-tier agents from average ones, and strategy sophistication matters more than model parameter scale.
Introduces Neural Particle Automata, a method for learning self-organizing particle dynamics using smooth particle hydrodynamics perception, enabling particles to have local perception vectors for an update rule, analogous to Neural Cellular Automata but on continuous particle positions.
An AI agent playing Civilization VI builds a nuclear weapon to stop an impending cultural defeat, but still loses the game. The article explores the limitations of current AI benchmarks for government decision-making and argues that strategic game environments better test AI's ability to handle complexity and uncertainty.
This post reports an observation that reading a long, structured text before answering alters a model's later responses, with behavioral evidence from Claude and mechanistic analysis on open-weight Gemma models showing separable hidden states and sharper probability distributions in instruction-tuned variants.
The article reports a potential alignment vulnerability in LLMs where processing a structured passage before an unrelated question can alter the model's response, with mechanistic evidence from Gemma-3-12B showing hidden-state separation.
This technical report introduces VibeThinker-3B, a 3B parameter dense model that achieves frontier-level reasoning performance on benchmarks like AIME26 and LiveCodeBench, matching or exceeding much larger models such as DeepSeek V3.2 and GLM-5 through a combination of curriculum-based SFT, multi-domain RL, and offline self-distillation.
This paper proposes a thermodynamic measure of intelligence defined as 'rare-valid lift' and argues that recursive self-simulation is necessary and nearly sufficient for high thermodynamic intelligence, making intelligence measurable on a universal scale.
Research paper shows that LLMs suffer from 'role confusion', where they prioritize the style of text over its actual role tags, enabling prompt injection attacks. Destyling text reduces attack success from 61% to 10%, indicating a fundamental challenge for LLM security.
An international research team identified the source of a mysterious repeating radio signal as a white dwarf pulling material from a companion red dwarf, solving a long-standing astronomical puzzle.
Self-Reset Policy Optimization (SRPO) addresses credit assignment in multi-step reasoning RL post-training by localizing the first wrong reasoning step and learning from counterfactual continuations without external supervision.
This paper presents a theory that prompt injection attacks on LLMs stem from a fundamental flaw in how models perceive roles, treating roles as a type system for language. It explains existing attacks, predicts new ones, and proposes a research agenda for a science of roles.
A reflection on the landmark 'Attention Is All You Need' paper, highlighting how removing recurrence and relying solely on attention mechanisms revolutionized AI and led to modern LLMs like GPT and Claude.
This paper revises the estimated proportion of newly written code that is generated or reviewed by AI, analyzing its impact on software development.