AI directly in DRAM: The Float Detox – How Pure Logic Unleashes the Future of Learning

Reddit r/artificial Papers

Summary

BIN16 replaces all floating-point operations with boolean operations (XNOR+popcount) for neural network training and inference, enabling direct computation in off-the-shelf DRAM with zero floats, gradients, or hyperparameter tuning. It achieves 82% accuracy on MNIST in a single epoch, using only 220 lines of C.
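The XNOR+popcount primitive named above can be sketched in a few lines of C. This is a minimal illustration, not the author's code: the names `xnor16`, `match16`, and `match_vec` are hypothetical, and the portable bit-counting loop stands in for whatever popcount the original 220-line implementation uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the set bits in a 16-bit word (portable, no compiler builtins). */
static unsigned popcount16(uint16_t x)
{
    unsigned n = 0;
    while (x) {
        x &= (uint16_t)(x - 1);   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Bitwise XNOR: a result bit is 1 wherever a and b agree. */
static uint16_t xnor16(uint16_t a, uint16_t b)
{
    return (uint16_t)~(a ^ b);
}

/* Number of agreeing bit positions in one 16-bit container (0..16). */
static unsigned match16(uint16_t a, uint16_t b)
{
    return popcount16(xnor16(a, b));
}

/* Hypothetical helper: agreement count over vectors of n 16-bit words. */
static unsigned match_vec(const uint16_t *a, const uint16_t *b, size_t n)
{
    unsigned total = 0;
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++)
        total += match16(a[i], b[i]);
    return total;
}
```

The score is an integer between 0 and the number of bits compared, so it can play the role a dot product plays in a float network without any floating-point arithmetic.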

Float32 was the true enemy, not backpropagation and not the architecture. **BIN16 replaces every floating-point operation with a single boolean operation: popcount16(XNOR16(a,b)).** The result is 82% MNIST accuracy at H=512 with zero floats, zero gradients, zero AdamW, and zero learning-rate tuning. Training converges in epoch 1, without warm-up, decay, or hyperparameter search.

**Both layers use identical XNOR+popcount operations, so training and inference run directly in off-the-shelf DRAM with only 5 transistors per cell.** This is the only neural architecture where the same hardware performs both training and inference without modification. The remaining 18% gap to 100% is the bit-mass limit, not a training deficit.

The breakthrough came when we stopped fighting float and embraced pure boolean computation. Every complexity (AdamW, backprop, LR schedules, BLAS) dissolved once floating-point numbers were removed from the architecture.

**Three groundbreaking insights changed everything.**

- Float was the true enemy: backpropagation, AdamW, and momentum were never the problem. Float32 introduced numerical noise and instability.
- Bitwise centroids converge instantly: a running bitwise majority vote per class reaches final accuracy in a single epoch.
- Random projection is entirely sufficient: W0 does not need to be trained, because a random boolean projection provides adequate separation.

**The entire training consists of only four steps and 220 lines of C, without a learning rate, a GPU, or any conventional optimization.**

This architecture opens the door to a future in which neural networks compute directly in memory. No more expensive GPUs, no endless hyperparameter tuning marathons. Instead, pure, efficient logic that is ready for use immediately and everywhere. Imagine AI systems that train and infer in off-the-shelf DRAM: energy-efficient, lightning-fast, and accessible to everyone.
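To make the second and third insights concrete, here is a rough sketch of a fixed random boolean projection followed by a running bitwise majority vote per class. It is an assumption-laden illustration, not the author's four-step procedure, which the post does not spell out. The tiny sizes, the `Bin16Sketch` struct, the majority threshold in `project`, the xorshift32 generator, and every function name are hypothetical choices made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
    IN_WORDS  = 4,               /* 64 input bits (hypothetical toy size) */
    HID_BITS  = 32,              /* hidden width H (the post reports H=512) */
    HID_WORDS = HID_BITS / 16,
    CLASSES   = 2
};

typedef struct {
    uint16_t w0[HID_BITS][IN_WORDS];        /* random projection, never trained */
    int32_t  votes[CLASSES][HID_BITS];      /* running per-bit vote counters */
    uint16_t centroid[CLASSES][HID_WORDS];  /* current majority bit per class */
} Bin16Sketch;

static unsigned popcount16(uint16_t x)
{
    unsigned n = 0;
    while (x) { x &= (uint16_t)(x - 1); n++; }
    return n;
}

/* Agreeing bits between two vectors: sum of popcount16(XNOR16(a, b)). */
static unsigned match_vec(const uint16_t *a, const uint16_t *b, size_t n)
{
    unsigned total = 0;
    for (size_t i = 0; i < n; i++)
        total += popcount16((uint16_t)~(a[i] ^ b[i]));
    return total;
}

/* Small integer PRNG (xorshift32), an assumed stand-in for any bit source. */
static uint32_t xorshift32(uint32_t *s)
{
    uint32_t x = *s;
    x ^= x << 13; x ^= x >> 17; x ^= x << 5;
    return *s = x;
}

static void sketch_init(Bin16Sketch *m, uint32_t seed)
{
    assert(seed != 0);           /* xorshift32 is stuck at 0 for a zero seed */
    memset(m, 0, sizeof *m);
    for (int j = 0; j < HID_BITS; j++)
        for (int i = 0; i < IN_WORDS; i++)
            m->w0[j][i] = (uint16_t)(xorshift32(&seed) >> 16);
}

/* Layer 1: hidden bit j is 1 when more than half the input bits match row j. */
static void project(const Bin16Sketch *m, const uint16_t *x, uint16_t *h)
{
    memset(h, 0, HID_WORDS * sizeof *h);
    for (int j = 0; j < HID_BITS; j++)
        if (match_vec(m->w0[j], x, IN_WORDS) > IN_WORDS * 16 / 2)
            h[j / 16] |= (uint16_t)(1u << (j % 16));
}

/* Training: one pass per sample updates the label's bitwise majority vote. */
static void train_one(Bin16Sketch *m, const uint16_t *x, int label)
{
    uint16_t h[HID_WORDS];
    assert(label >= 0 && label < CLASSES);
    project(m, x, h);
    for (int j = 0; j < HID_BITS; j++) {
        uint16_t mask = (uint16_t)(1u << (j % 16));
        m->votes[label][j] += (h[j / 16] & mask) ? 1 : -1;
        if (m->votes[label][j] > 0)
            m->centroid[label][j / 16] |= mask;
        else
            m->centroid[label][j / 16] &= (uint16_t)~mask;
    }
}

/* Layer 2: pick the class whose centroid agrees with the most hidden bits. */
static int predict(const Bin16Sketch *m, const uint16_t *x)
{
    uint16_t h[HID_WORDS];
    int best = 0;
    unsigned best_score = 0;
    project(m, x, h);
    for (int c = 0; c < CLASSES; c++) {
        unsigned s = match_vec(h, m->centroid[c], HID_WORDS);
        if (c == 0 || s > best_score) { best = c; best_score = s; }
    }
    return best;
}
```

Both `project` and `predict` reduce to the same XNOR+popcount scoring, which mirrors the post's claim that both layers share one operation. The sketch makes no attempt to reproduce the reported MNIST result.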
**BIN16 is the first step into this new era.**

- Identical operations for training and inference
- 16-bit containers as minimal, efficient storage
- Random projection as the perfect feature extractor

The future of machine learning begins now, with pure logic instead of float.

📎 Source 1: https://forward-prop.nhi1.de/
