Tag
This paper presents CA-NKCF, a novel distributed latent state estimator that combines partial domain knowledge with deep neural networks. It achieves robust performance without knowledge of the noise statistics and outperforms traditional filters in linear, chaotic, and wireless tracking environments.
This paper introduces a unified decision-theoretic pretraining framework for neural network-based time series estimators, trained on stratified simulations to approximate near-optimal decision rules. Experiments show that the resulting estimators outperform traditional methods like maximum likelihood estimation on both synthetic and real-world benchmarks.
A tweet promoting a guide on building a neural network from scratch in C++, aimed at providing a clear and practical explanation of how neural networks work.
New research indicates that the apparent 'global convergence' in scaled AI models is actually a statistical illusion caused by selection bias in model width and depth, and disappears once calibrated.
Garry Tan shares the complete architecture of GBrain, an AI model.
A set of four cards covering the core concepts of neural networks: neuron, forward pass, activations, and backpropagation, aimed at helping learners understand how models from perceptrons to transformers work.
A Microsoft AI researcher built a simple neural network out of goats in Age of Empires II to argue that calling chatbots conscious is as absurd as calling such a system conscious.
Proposes RGNet, a neural network architecture based on renormalization group theory for hierarchical coarse-graining of feature space to address class imbalance and noise in fault diagnosis. Experimental results on the AI4I dataset show RGNet provides interpretable and competitive performance.
Microcrad reimplements Karpathy's micrograd autograd engine in C, providing an educational scalar-valued automatic differentiation library with reference counting and a small neural network, aimed at understanding backpropagation at the scalar level.
The author describes implementing a biologically plausible neural network training algorithm proposed by Geoffrey Hinton.
This blog post introduces Magnitude-Direction (MD) Decoupling, a method that separates neural network weight matrices into direction and magnitude components, each optimized with its own learning rate. Experiments show improved performance with both the Adam and Muon optimizers, automatic learning rate transfer across model widths, and scaling benefits in large Mixture-of-Experts models.
This paper introduces LANTERN, a neural network framework for estimating health-state transition probabilities from irregular longitudinal data, with applications to long-term care insurance. It outperforms traditional methods in discrimination and calibration for severe disability and mortality prediction.
A Chinese trader accidentally exposed the cryptocurrency wallet he uses for arbitrage trading while showcasing an AI neural network visualization on TikTok. The wallet had earned $367,000 in profit over the past 30 days, and the exposure triggered widespread tracking and discussion.
This paper proposes a novel framework that uses LLMs to extract analytical physics priors from scientific literature and distills them into a lightweight neural network for high-accuracy, real-time manufacturing process-property prediction, even with limited data.
A perceptron is the simplest neural network building block. This tutorial implements one from scratch in Python, explaining weights, bias, and learning through a clear example.
MeshWeaver presents an autoregressive mesh generation framework that directly predicts vertices using a multi-level sparse-voxel encoder, achieving state-of-the-art compression and geometric fidelity for high-poly meshes.
Introduces Amortized Factor Inference Networks (AFINs), a family of encode-merge-decode inference networks that generalize across varying priors, likelihoods, and dimensionality, achieving posterior accuracy comparable to NUTS with much less compute.
Thermocompute is a PyTorch emulator for thermodynamic probabilistic computing. It models neural network layers whose inference takes constant modeled physical time by exploiting a parallel thermodynamic substrate, and it provides stochastic layers that are immediately usable on GPUs.
AutoMCU is a multi-agent system leveraging LLMs to automate neural network design for microcontroller units, significantly reducing customization time while ensuring feasibility under hardware constraints.
VirtualPC is an open-source 8-bit computer simulator that can train small neural networks from assembly code, demonstrating machine learning at the bare-metal level.
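
For readers who want a concrete picture of the perceptron entry above (weights, bias, and learning), here is a minimal sketch in Python. It is not the linked tutorial's code: the function names, the toy AND dataset, the learning rate, and the epoch count are all assumptions chosen for illustration, and the update shown is the classic perceptron rule.

```python
# Minimal perceptron sketch (illustrative only; not taken from the tutorial).

def predict(weights, bias, x):
    """Return 1 if the weighted sum of inputs plus bias is positive, else 0."""
    total = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if total > 0 else 0


def train(samples, labels, lr=1.0, epochs=20):
    """Fit weights and bias with the classic perceptron update rule."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            # error is +1, 0, or -1; a zero error leaves the parameters unchanged.
            error = y - predict(weights, bias, x)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


# Hypothetical toy data: the logical AND of two binary inputs.
AND_INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
AND_LABELS = [0, 0, 0, 1]

weights, bias = train(AND_INPUTS, AND_LABELS)
predictions = [predict(weights, bias, x) for x in AND_INPUTS]
print(weights, bias, predictions)
```

AND is linearly separable, so the perceptron rule settles on a separating line after a few passes; a dataset such as XOR would not converge with a single perceptron.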