neural-networks

#neural-networks

Learning from almost nothing: How neural networks survive heavy input corruption

arXiv cs.LG ↗ · 2026-06-11 Cached

This paper investigates how neural networks maintain high accuracy even when over 90% of input features are corrupted, deriving a centroid-based decision rule in the high-noise limit using a mean-field approach.

Mechanical Field Networks: Structured Neural Dynamics for Multivariate Systems

arXiv cs.LG ↗ · 2026-06-11 Cached

This paper introduces MF-Net, a recurrent dynamical model that represents multivariate systems through a shared field state and learns a mechanical transition for joint evolution. It achieves competitive forecasting while enabling interpretable structural readout of learned relations.

Deficient executive control in transformer attention

Hacker News Top ↗ · 2026-06-10

The article discusses a deficiency in executive control within transformer attention mechanisms, highlighting limitations in how transformers manage sequential dependencies.

@kmeanskaran: Best way to balance both ML and AI today is: > Python (specially Pydantic) > Neural Networks fundamentals > RNN, LSTM, …

X AI KOLs Timeline ↗ · 2026-06-10 Cached

A tweet by Karan (@kmeanskaran) outlining a learning roadmap for balancing ML and AI, covering Python, neural networks, NLP, LLMs, deployment, and agentic AI, with a reply from Amit seeking beginner guidance.

Emergence via Phase Transitions: Mechanism Landscapes and Universal Convergence Across Complex Systems

arXiv cs.LG ↗ · 2026-06-09 Cached

This paper introduces the Hierarchical Emergence Framework (HEF), which explains how diverse systems such as neural networks and biological evolution converge to similar internal representations through phase transitions in mechanism landscapes under physical and informational constraints. The framework is validated empirically with 111 grokking experiments that confirm universal convergence and identify a critical energy threshold.

Flatland: The Adventures of Gradient Descent with Large Step Sizes

arXiv cs.LG ↗ · 2026-06-08 Cached

This paper addresses the open question of the maximum step size at which gradient descent still converges on non-L-smooth objectives, introducing adaptive methods that operate at the edge of stability and can minimize sharpness globally.

P-Cast Precision in FP8 Attention: Sink-Induced Collapse and the Optimality of S=2^8

arXiv cs.AI ↗ · 2026-06-08 Cached

This paper analyzes the precision loss that the attention sink phenomenon causes in FP8 attention when the softmax output is cast to FP8 (E4M3). It shows that forward KV iteration causes non-sink attention values to underflow, and proposes reverse iteration and a static scaling factor S=256 to eliminate the underflow, achieving a 3-10x improvement in MSE.
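
A rough illustration of why a static scale of 2^8 helps, as a sketch under stated assumptions rather than the paper's kernel: it simulates rounding to E4M3 (assuming the OCP "FN" variant, with a largest finite value of 448 and a smallest subnormal of 2^-9) in plain Python and ignores forward versus reverse KV iteration entirely. The helper names `quantize_e4m3` and `cast_probability` and the sample probability are hypothetical.

```python
import math

E4M3_MAX = 448.0        # largest finite E4M3 value (assumed OCP "FN" variant)
MIN_NORMAL_EXP = -6     # exponent of the smallest normal value, 2**-6
MANTISSA_BITS = 3       # E4M3 stores 3 mantissa bits


def quantize_e4m3(x: float) -> float:
    """Round a non-negative float to the nearest E4M3 value (simulation only)."""
    if x <= 0.0:
        return 0.0
    x = min(x, E4M3_MAX)                     # saturate instead of overflowing
    _, e = math.frexp(x)                     # x = m * 2**e with 0.5 <= m < 1
    exp = max(e - 1, MIN_NORMAL_EXP)         # subnormals share the exponent -6
    step = 2.0 ** (exp - MANTISSA_BITS)      # spacing of representable values
    # Python's round() rounds halves to even, matching round-to-nearest-even.
    return min(round(x / step) * step, E4M3_MAX)


def cast_probability(p: float, scale: float = 1.0) -> float:
    """Multiply p by `scale`, cast to simulated E4M3, then undo the scale."""
    return quantize_e4m3(p * scale) / scale


small = 0.0005  # hypothetical non-sink attention probability
print(cast_probability(small))           # 0.0 (underflows without scaling)
print(cast_probability(small, 256.0))    # 0.00048828125 (survives with S=256)
print(quantize_e4m3(1.0 * 256.0))        # 256.0 (a probability of 1 still fits)
```

In this simulation, unscaled values below 2^-10 round to zero, while multiplying by 256 moves that threshold down to 2^-18 and keeps the largest possible softmax output (1.0, scaled to 256) under the 448 limit.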

@jakevin7: Everyone is talking about AI now, but few know that the founder of this field was once dismissed as a madman by the world. Geoffrey Hinton won the Nobel Prize in Physics in 2024. A reporter asked him: How many years did you wait? He said: About forty. In 1969, a book killed neural networks...

X AI KOLs Following ↗ · 2026-06-08 Cached

This article recounts how Geoffrey Hinton persisted in his research for three decades during the AI winter, when neural networks were abandoned by academia. He eventually gained fame with AlexNet in the 2012 ImageNet competition and won the Nobel Prize in Physics in 2024.

@zhaisf: These were some magical results from distillation by @geoffreyhinton that really shocked me when I first saw them, and …

X AI KOLs Following ↗ · 2026-06-07 Cached

The article discusses the surprising robustness of model distillation to the choice of training distribution, even when that distribution overlaps little with the target distribution, and the implications for on-policy and off-policy distillation.

@incrementaliser: Just finished watching a gem by @ChrisGPotts , "Finding linguistic structure in large language models", and I'm now pro…

X AI KOLs Following ↗ · 2026-06-06

A tweet highlights Chris Potts' talk on how large language models learn linguistic structures, reinforcing the view that LLMs capture syntax and semantics.

Transformers Are Inherently Succinct

Hacker News Top ↗ · 2026-06-05 Cached

This paper argues that transformer architectures are inherently succinct, meaning they can represent certain functions more efficiently than other models. It presents theoretical analysis and proofs.

Playing with Vision Embeddings

Hacker News Top ↗ · 2026-06-05 Cached

This post explores DINOv3 vision embeddings by generating images that correspond to specific embedding directions, using gradient optimization and augmentation strategies to invert the model.

Derivative Informed Learning of Exchange-Correlation Functionals

arXiv cs.LG ↗ · 2026-06-04 Cached

This ICML 2026 paper introduces Derivative Informed XC-Loss (DI-Loss), a training approach for machine-learned exchange-correlation functionals that incorporates first and second derivative supervision on the Grassmannian of density matrices. Across four architectures, DI-Loss reduces total-energy MAE by 66% compared to energy and density supervision alone, and improves excited-state predictions in TDDFT calculations.

From Ticks to Flows: Dynamics of Neural Reinforcement Learning in Continuous Environments

arXiv cs.LG ↗ · 2026-06-04 Cached

This paper presents a theoretical framework for deep reinforcement learning in continuous environments, modeling it as a continuous-time stochastic process using stochastic control theory. The authors characterize an actor-critic algorithm's dynamics in the infinite width limit of two-layer networks, deriving an equation for infinitesimal changes in state distribution under a vanishingly small learning rate.

AI from concrete to abstract: demystifying artificial intelligence to the general public

arXiv cs.AI ↗ · 2026-06-04 Cached

This paper presents AIcon2abs, a methodology combining visual programming and WiSARD weightless neural networks to help general audiences, including children, understand AI concepts through hands-on learning activities. The approach integrates training and classification as first-class programming constructs to make the distinction between learning machines and conventional programs more intuitive.

"They're made out of weights"

Hacker News Top ↗ · 2026-06-03 Cached

A creative dialogue explores the idea that large language models are fundamentally just matrices of weights, challenging notions of understanding and sentience.

Curatube: a distraction free interface for YT playlists to focus on learning

Lobsters Hottest ↗ · 2026-06-03 Cached

Curatube is a distraction-free interface for YouTube playlists, designed to help focus on learning. It currently features the Neural Networks: Zero to Hero course by Andrej Karpathy.

Neural Networks Provably Learn Spectral Representations for Group Composition

arXiv cs.LG ↗ · 2026-06-03 Cached

This paper theoretically demonstrates that two-layer neural networks trained on group composition tasks learn spectral representations, with neurons converging to irreducible representations and achieving rotational rank-one alignment, providing a representation-theoretic account of feature learning.

Spectral Asymptotics of Neural Network Loss Landscapes: An Exact Decomposition of the Curvature Exponent

arXiv cs.LG ↗ · 2026-06-03 Cached

This paper presents an exact decomposition of the curvature exponent α in neural network loss landscapes, explaining why it varies across layer types. It introduces the spectral alignment decomposition and derives a spectral transfer identity linking curvature, gradient rank decay, and Hessian exponents, validated across architectures and datasets.

Neural Networks Provably Learn Spectral Representations for Group Composition

Hugging Face Daily Papers ↗ · 2026-06-02

This paper provides a theoretical analysis of how neural networks learn structured representations during group composition tasks, proving that training dynamics drive neurons to converge to irreducible group representations with exponential convergence rates. The work establishes a representation-theoretic account of feature learning and characterizes a low-rank compression phenomenon for matrix-valued group representations.
