Tag
A thread from Ai2 compares a transformer model (Olmo 3) with a hybrid model (Olmo Hybrid). It finds that transformers excel at copying, while RNNs better model meaning-bearing words, a result that highlights the growing viability of hybrid architectures.