numeric-precision


Revisiting Padded Transformer Expressivity: Which Architectural Choices Matter and Which Don't

arXiv cs.LG · 6d ago

This theoretical paper analyzes the expressivity of padded transformers and shows that attention type, width, and uniformity matter little compared with numeric precision and model depth. It establishes equivalences between transformer variants and circuit complexity classes such as AC0 and TC0, giving a characterization that holds across these architectural choices.
