This paper presents a theoretical framework interpreting Transformer components (attention, residual connections, normalization) as arising from a spherical state estimation problem using Radial-Tangential SDEs.
This paper introduces the Context-Contaminated Restart Model (CCRM) to formally analyze how failed attempts in LLM agent pipelines contaminate context and increase error rates on retries. It provides theoretical proofs and validates the model against SWE-bench data, showing significant discrepancies with models that assume retries are independent.
A PyTorch library that compiles neural networks from Turing machine descriptions, enabling exact simulation without training.
This empirical study validates theoretical predictions of feature repulsion and spectral lock-in during the grokking phenomenon in two-layer neural networks, demonstrating how the choice of activation function shapes the transition from memorization to generalization.
This paper analyzes zero-shot conditional sampling with pretrained diffusion models for linear inverse problems, providing information-theoretic guarantees and proposing a projected-Langevin initialization method.