A developer reverse-engineered Apple's private APIs to train neural networks directly on the Apple Neural Engine (ANE) in M4 Macs and iPhones, bypassing both Core ML and the GPU. The project demonstrates that the ANE hardware is capable of training, though with limitations such as low utilization and CPU fallbacks for some operations.
This work introduces KV-Compression Aware Training (KV-CAT), a method that encourages transformers to learn compressible key-value caches during training, improving memory efficiency on long-context tasks without sacrificing performance.