neural-networks

#neural-networks

Training Dynamics of Neural Software Defect Predictors under Coupled Data-Quality Issues

arXiv cs.LG ↗ · 19h ago Cached

This paper investigates how training dynamics of neural networks for software defect prediction are affected by coupled data-quality issues such as class imbalance and overlap, proposing an interaction-aware empirical protocol.


How Complexity Contributes to Learning Opacity in Machine Learning

arXiv cs.LG ↗ · 19h ago Cached

This paper analyzes why machine learning, particularly neural networks, remains opaque in its learning process by framing it as a complex dynamical system, identifying three key properties that contribute to learning opacity, and arguing that some sources may be irreducible.


@k_solidified_: https://arxiv.org/abs/2106.10165 All of humanity should read this

X AI KOLs Timeline ↗ · yesterday Cached

This book develops an effective theory for deep neural networks, showing that their predictions are nearly-Gaussian and governed by the depth-to-width ratio, and introduces representation group flow to analyze signal propagation and learning dynamics.


Puzzling Success of Overparameterization: Lottery Tickets or Escape Dimensions?

Hacker News Top ↗ · yesterday

A paper investigating the reasons behind the success of overparameterization in neural networks, comparing the lottery ticket hypothesis with escape dimensions.


@heynavtoor: 10 free resources that teach you more about AI in 30 days than a $15,000 bootcamp. Bookmark this list. 1. 3Blue1Brown G…

X AI KOLs Timeline ↗ · yesterday Cached

A curated list of 10 free AI learning resources including courses, newsletters, podcasts, and interactive books from experts like 3Blue1Brown, Andrej Karpathy, and Andrew Ng.


Are Safety Guarantees in Neural Networks Safe? How to Compute Trustworthy Robustness Certifications

arXiv cs.LG ↗ · yesterday Cached

This paper introduces the apothem measure for computing trustworthy robustness certifications in neural networks, proves intractability of volume-optimal certifications, and presents the ParallelepipedoNN system achieving twofold improvement in minimum edge length on MNIST and Fashion MNIST.


@N8Programs: Excited to announce an arXiv note on an interesting mathematical symmetry I noticed... that connects the classic MLP to…

X AI KOLs Timeline ↗ · 2d ago Cached

Announces an arXiv note on a mathematical symmetry connecting classic MLP to Gated MLP, going beyond empirical performance.


How do LLMs store so much knowledge? A look at feature superposition

Reddit r/ArtificialInteligence ↗ · 5d ago Cached

Explores how large language models compress vast knowledge into finite space using feature superposition, explaining the distinction between dimensions and features with biological analogies.


Neuron Populations Exhibit Divergent Selectivity with Scale [R]

Reddit r/MachineLearning ↗ · 2026-06-18

This paper introduces 'Rosetta Neurons'—universal neurons across diverse neural networks—and shows they scale as a sublinear power law, becoming more selective and monosemantic with scale, enabling data filtering that nearly matches oracle performance.


@aiwithmayank: THE BEST EXPLANATION OF HOW LLMS ACTUALLY WORK IS A FREE STANFORD LECTURE AND IT STARTS WITH A MOUSE EATING CHEESE it's…

X AI KOLs Timeline ↗ · 2026-06-18 Cached

A tweet promotes Stanford's free CS324 course on large language models, which uses a simple example of a mouse eating cheese to explain how LLMs work, and includes interactive demos.


Effects of sparsity and superposition on loss in simple autoencoders

arXiv cs.LG ↗ · 2026-06-18 Cached

This paper provides a mathematical analysis of superposition in neural networks, deriving upper and lower bounds on L2 reconstruction loss for simple autoencoders with power activation functions, corroborating empirical findings by Elhage et al.


A Link between Shock-wave Theory and Symmetry-reduced Stochastic Gradient Descent for Artificial Neural Networks

arXiv cs.LG ↗ · 2026-06-18 Cached

This paper establishes a mathematically rigorous connection between shock-wave theory and symmetry-quotiented learning dynamics of stochastic gradient descent, showing that after symmetry reduction and coarse-graining, the dynamics satisfy viscous Hamilton-Jacobi and Burgers-type equations with shock formation times controlled by loss curvature.


In game theory, generalists sometimes win out over specialists

MIT News — Artificial Intelligence ↗ · 2026-06-17 Cached

MIT researchers co-authored a paper showing that general-purpose policy gradient algorithms can outperform specialized game-theoretic algorithms in imperfect-information games, challenging long-held assumptions in the field.


Continuous-time Optimal Stopping through Deep Reinforcement Learning

arXiv cs.LG ↗ · 2026-06-17 Cached

This paper introduces CARLOS, a deep reinforcement learning algorithm that learns continuous-time optimal stopping rules for American-style options using an aggregate deep neural network, effectively closing the Bermudan-American value gap with high computational efficiency.


@teropa: In which @sedielem beautifully illustrates why diffusion models work so well with images Our visual world is spatially c…

X AI KOLs Following ↗ · 2026-06-16 Cached

An explanation of why diffusion models work well for images: low-frequency spectral components dominate, so denoising recovers coarse structure first, then fine detail — analogous to spectral autoregression.


AI Engram: In Search of Memory Traces in Artificial Intelligence

arXiv cs.AI ↗ · 2026-06-16 Cached

Introduces a geometric framework to identify 'AI engrams' – memory traces in deep neural networks – formalizing neuroscientific criteria into a closed-form estimator, enabling surgical memory manipulation in models from MLPs to LLMs.


Can Neural Networks Achieve Optimal Computational-statistical Tradeoff? An Analysis on Single-Index Model

arXiv cs.LG ↗ · 2026-06-16 Cached

This paper demonstrates that two-layer neural networks trained with gradient-based methods can achieve the optimal computational-statistical tradeoff for learning Gaussian single-index models, matching the SQ lower bound up to polylogarithmic factors for all generative exponents and extending to sparse settings with a novel weight perturbation technique.


GRAPE: Guided Parameter-Space Evolution for Compact Adversarial Robustness

arXiv cs.LG ↗ · 2026-06-16 Cached

GRAPE is a training framework that progressively exposes parameter space during adversarial training, achieving higher robust accuracy with fewer parameters compared to fixed-structure methods on CIFAR-10.


@che_shr_cat: 1/ Standard transformers have a fundamental topological flaw: they cannot track dynamic states over time without runnin…

X AI KOLs Timeline ↗ · 2026-06-15 Cached

This thread argues that standard transformers have a topological flaw: once a state representation reaches the top layer, they cannot update beliefs over time, causing collapse as depth increases.


@BetaTomorrow: https://x.com/BetaTomorrow/status/2066435380623385000

X AI KOLs Timeline ↗ · 2026-06-15 Cached

This thread discusses the concept of 'Jagged Intelligence' in AI, framing it as a consequence of AI learning being an ill-posed inverse problem, and argues that external stabilizers like scaffolding and verification are essential.
