mnist

#mnist

Bit-Mass Theory – The Container Principle

Reddit r/artificial · 2026-05-31

The Bit-Mass Theory proposes that model accuracy is determined by the total number of weight bits (the "bit-mass") rather than by the computation format. Its experiments on MNIST report equivalent performance between binary and floating-point networks at the same bit-mass.
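
As a rough illustration of what "same bit-mass" could mean, the sketch below assumes bit-mass is simply the number of weights multiplied by the bits stored per weight, with biases ignored. The layer sizes and the helper names `dense_weight_count` and `bit_mass` are hypothetical and are not taken from the post.

```python
def dense_weight_count(layer_sizes):
    """Count weights in a fully connected network (biases ignored, by assumption)."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))


def bit_mass(num_weights, bits_per_weight):
    """Assumed definition: total weight bits = weight count * bits per weight."""
    return num_weights * bits_per_weight


# Hypothetical MNIST-sized networks: 784 inputs, 10 outputs.
fp32_layers = [784, 32, 10]      # narrow network, 32-bit float weights
binary_layers = [784, 1024, 10]  # 32x wider hidden layer, 1-bit weights

fp32_mass = bit_mass(dense_weight_count(fp32_layers), 32)
binary_mass = bit_mass(dense_weight_count(binary_layers), 1)

print(fp32_mass, binary_mass)
```

Under these assumptions both networks hold 813,056 weight bits, so they would be the kind of pair the theory predicts to reach similar accuracy. Whether they actually do is an empirical question that this arithmetic does not answer.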


#mnist

Are Flat Minima an Illusion?

arXiv cs.LG · 2026-05-08

This paper challenges the common belief that flat minima cause better generalization in neural networks. It argues that the true driver is "weakness", a reparameterization-invariant measure of function simplicity. Empirical results on MNIST and Fashion-MNIST show that weakness predicts generalization while sharpness is anticorrelated with it, and that the large-batch generalization advantage vanishes as the amount of training data increases.
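
To see why reparameterization invariance matters here, the sketch below rescales the two layers of a small ReLU network by `alpha` and `1 / alpha`. This is a standard observation about ReLU networks, not the paper's weakness measure, and every name and size in it is hypothetical. The function computed by the network stays the same while the parameters change, so a quantity defined on parameters, such as sharpness, can differ between two descriptions of the same function.

```python
import random


def relu(v):
    return [max(0.0, x) for x in v]


def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def forward(W1, W2, x):
    """Two-layer ReLU network without biases: W2 @ relu(W1 @ x)."""
    return matvec(W2, relu(matvec(W1, x)))


def sq_norm(W):
    return sum(w * w for row in W for w in row)


rng = random.Random(0)
W1 = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
W2 = [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(3)]
x = [rng.uniform(-1, 1) for _ in range(4)]

# Rescale: ReLU is positively homogeneous, so relu(alpha * z) == alpha * relu(z)
# for alpha > 0, and dividing the next layer by alpha cancels the factor.
alpha = 8.0
W1_scaled = [[alpha * w for w in row] for row in W1]
W2_scaled = [[w / alpha for w in row] for row in W2]

out_original = forward(W1, W2, x)
out_scaled = forward(W1_scaled, W2_scaled, x)
max_diff = max(abs(a - b) for a, b in zip(out_original, out_scaled))

print("max output difference:", max_diff)
print("first-layer squared norm:", sq_norm(W1), "->", sq_norm(W1_scaled))
```

The outputs agree up to floating-point error even though the first-layer weights are 8 times larger, which is the kind of ambiguity a reparameterization-invariant measure is meant to avoid.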

