Papers

Show HN: Neural Particle Automata

Hacker News Top · 7h ago Cached

Introduces Neural Particle Automata, a method for learning self-organizing particle dynamics. Each particle builds a local perception vector through smoothed-particle hydrodynamics (SPH) and feeds it to an update rule, analogous to Neural Cellular Automata but on continuous particle positions rather than a fixed grid.
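The SPH-style perception step mentioned in this summary can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the linked project: the function names `kernel` and `perceive`, the poly6-style unnormalized kernel, the weighted-average normalization, and the radius `h`. The update rule that would consume the perception vector is omitted.

```python
import math

def kernel(r, h):
    """Smooth, compactly supported weight (poly6-style, unnormalized; an assumed choice)."""
    if r >= h:
        return 0.0
    q = 1.0 - (r / h) ** 2
    return q ** 3

def perceive(positions, states, h):
    """For each particle, return the kernel-weighted average of nearby particle states."""
    dim = len(states[0])
    out = []
    for xi in positions:
        weights = [kernel(math.dist(xi, xj), h) for xj in positions]
        total = sum(weights)  # never zero: each particle sees itself with weight 1
        out.append([
            sum(w * s[k] for w, s in zip(weights, states)) / total
            for k in range(dim)
        ])
    return out

# Hypothetical data: two nearby particles and one isolated particle.
positions = [(0.0, 0.0), (0.5, 0.0), (3.0, 0.0)]
states = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
perception = perceive(positions, states, h=1.0)
```

Because the weights depend only on distances between particles, the same function works for any number of particles at arbitrary continuous positions, which is the property that distinguishes this setting from a grid-based cellular automaton.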

AI Built a Nuke and Still Lost

Hacker News Top · 7h ago Cached

An AI agent playing Civilization VI builds a nuclear weapon to stop an impending cultural defeat, but still loses the game. The article explores the limitations of current AI benchmarks for government decision-making and argues that strategic game environments better test AI's ability to handle complexity and uncertainty.

What a model reads beforehand changes how it answers later - and you can see it in the hidden states

Reddit r/artificial · 9h ago

This post reports an observation that reading a long, structured text before answering alters a model's later responses, with behavioral evidence from Claude and mechanistic analysis on open-weight Gemma models showing separable hidden states and sharper probability distributions in instruction-tuned variants.

What you read before a question changes how a language model answers it — even when the question has nothing to do with what you read. Potential Alignment Vulnerability in LLMs: Behavioral and Hidden-State Evidence from Gemma-3-12B

Reddit r/ArtificialInteligence · 9h ago

The article reports a potential alignment vulnerability in LLMs where processing a structured passage before an unrelated question can alter the model's response, with mechanistic evidence from Gemma-3-12B showing hidden-state separation.

VibeThinker: 3B param model that beats Opus 4.5 on reasoning with novel SFT+GRPO

Hacker News Top · 13h ago Cached

This technical report introduces VibeThinker-3B, a 3B parameter dense model that achieves frontier-level reasoning performance on benchmarks like AIME26 and LiveCodeBench, matching or exceeding much larger models such as DeepSeek V3.2 and GLM-5 through a combination of curriculum-based SFT, multi-domain RL, and offline self-distillation.

Thermodynamic Measure Of Intelligence

Reddit r/singularity · 14h ago Cached

This paper proposes a thermodynamic measure of intelligence defined as 'rare-valid lift' and argues that recursive self-simulation is necessary and nearly sufficient for high thermodynamic intelligence, making intelligence measurable on a universal scale.

Prompt Injection as Role Confusion

Simon Willison's Blog · 15h ago Cached

A research paper shows that LLMs suffer from 'role confusion': they prioritize the style of a piece of text over its actual role tags, which enables prompt injection attacks. Destyling text reduces attack success from 61% to 10%, indicating a fundamental challenge for LLM security.

A Source of Mysterious Repeating Radio Signals From Space Has Been Identified

Wired · 17h ago Cached

An international research team identified the source of a mysterious repeating radio signal as a white dwarf pulling material from a companion red dwarf, solving a long-standing astronomical puzzle.

Prompt Injection as Role Confusion

Hacker News Top · 23h ago Cached

This paper presents a theory that prompt injection attacks on LLMs stem from a fundamental flaw in how models perceive roles, treating roles as a type system for language. It explains existing attacks, predicts new ones, and proposes a research agenda for a science of roles.

Attention Is All You Need

Reddit r/ArtificialInteligence · yesterday

A reflection on the landmark 'Attention Is All You Need' paper, highlighting how removing recurrence and relying solely on attention mechanisms revolutionized AI and led to modern LLMs like GPT and Claude.

Revised: Estimated share of newly written code exposed to AI generation and review

Reddit r/singularity · yesterday

This paper revises the estimated proportion of newly written code that is generated or reviewed by AI, analyzing its impact on software development.

@AlphaSignalAI: https://x.com/AlphaSignalAI/status/2069064122218717387

X AI KOLs Timeline · yesterday Cached

This article explores how AI agents can automatically write and optimize their skill files using techniques like SkillOpt from Microsoft Research, which treats skill documents as trainable state and delivers significant performance improvements. It addresses the challenge of manual skill tuning and presents frameworks like GEPA and EvoSkill as evolutionary approaches.

@amitiitbhu: https://x.com/amitiitbhu/status/2069023290182758497

X AI KOLs Timeline · yesterday Cached

A detailed blog post explaining the Sakana Fugu technical report, which introduces orchestrator AI models that route tasks to specialized models, achieving collective intelligence.

@arxivblog: Computational complexity theorists show gravity must be quantised https://arxivblog.substack.com/p/computational-comple…

X AI KOLs Timeline · yesterday Cached

Computational complexity theorists argue that semiclassical gravity's non-linear dynamics would enable impossibly powerful computation, proving gravity must be quantized. The paper uses the Schrödinger-Newton equation to show that classical gravity coupled to quantum matter leads to computational contradictions.

NVIDIA's AI agents taught robots to install GPUs into motherboards without any human help

Reddit r/singularity · yesterday Cached

NVIDIA's ENPIRE framework, developed with CMU and UC Berkeley, uses AI coding agents to autonomously train robots for high-precision physical tasks like GPU installation, achieving a 99% success rate through a closed feedback loop and real hardware trials.

@doctorveera: Pakistan Genomic Resource (PGR), founded by Danish Saleheen, is the world's largest genetic database of human knockouts…

X AI KOLs Timeline · yesterday Cached

A new Nature paper from the Pakistan Genomic Resource (PGR) analyzes 173,303 Pakistanis from consanguineous communities and identifies human knockouts for nearly one-third of protein-coding genes, overturning biological assumptions such as the essentiality of PRDM9 for fertility.

@rohanpaul_ai: Can LLM agents actually discover hidden rules by interacting? The answer is uncomfortable. The more complicated the hid…

X AI KOLs Following · yesterday Cached

This paper investigates whether LLM agents can infer hidden world models through interaction, finding that they struggle to build stable internal models as complexity increases.

@VukRosic99: Test Time Reinforcement Learning 1. Take an unlabeled question 2. Sample many answers from the LLM 3. Majority vote → t…

X AI KOLs Timeline · yesterday Cached

Introduces Test-Time Reinforcement Learning (TTRL), a method that uses majority voting on unlabeled data to create pseudo-labels for RL training, enabling self-improvement of LLMs without ground-truth answers. Achieves significant gains (e.g., +159-211% on AIME 2024 for Qwen-2.5-Math-7B).
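The pseudo-labeling step described in this summary can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name `majority_vote_rewards` is hypothetical, the binary 1.0/0.0 reward is an assumed choice, answers are compared as exact strings, and the RL update that would consume the rewards is left out.

```python
from collections import Counter

def majority_vote_rewards(answers):
    """Return (pseudo_label, rewards) for a list of sampled answers.

    The most frequent answer becomes the pseudo-label; each sample gets
    reward 1.0 if it matches the pseudo-label and 0.0 otherwise (assumed scheme).
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(answers)
    pseudo_label, _ = counts.most_common(1)[0]
    rewards = [1.0 if a == pseudo_label else 0.0 for a in answers]
    return pseudo_label, rewards

# Hypothetical samples drawn from a model for one unlabeled question.
samples = ["42", "41", "42", "42", "7"]
label, rewards = majority_vote_rewards(samples)
```

No ground-truth answer appears anywhere in the function, which is the point of the approach: the reward signal comes entirely from agreement among the model's own samples.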

PivCo-Huffman “merge” operations

Lobsters Hottest · yesterday Cached

This blog post analyzes the PivCo-Huffman paper, which introduces 'merge' operations for parallel Huffman decoding, enabling efficient vectorized and GPU-friendly decoding without interleaving overhead.

Arbor: Explicit Geometric Conditioning for Controllable 3D Asset Generation

Hugging Face Daily Papers · yesterday Cached

Arbor introduces explicit geometric control for 3D asset generation by using constraint meshes (hull, avoidance, touch regions) to condition latent generation, improving spatial constraint adherence without sacrificing object quality.
