Researchers introduce Raven, a sequence model that combines the efficiency of state space models with a selective slot-updating mechanism inspired by sliding window attention, aiming to improve long-context retrieval. The authors position the approach as a more principled alternative to existing linear-time models.
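The summary doesn't spell out Raven's update rule; below is a minimal NumPy sketch, under stated assumptions, of what a selective slot-updating recurrence could look like: a fixed bank of memory slots where each incoming token overwrites only the slot it scores highest against, keeping per-step cost constant like an SSM while retaining addressable memory reminiscent of a sliding window. All function names, weights, and shapes are hypothetical, not Raven's actual design.

```python
import numpy as np

def selective_slot_update(slots, x, W_key, W_val):
    """One step of a hypothetical selective slot update.

    `slots` is a (num_slots, d) memory bank. The input x scores each
    slot, and only the best-matching slot is rewritten, so per-step
    cost is O(num_slots * d) regardless of sequence length.
    """
    scores = slots @ (W_key @ x)       # relevance of each slot to x
    idx = int(np.argmax(scores))       # select exactly one slot
    slots = slots.copy()
    slots[idx] = np.tanh(W_val @ x)    # overwrite the selected slot
    return slots

rng = np.random.default_rng(0)
d, num_slots = 8, 4
slots = np.zeros((num_slots, d))
W_key = rng.normal(size=(d, d)) / np.sqrt(d)
W_val = rng.normal(size=(d, d)) / np.sqrt(d)
for _ in range(10):                    # stream of inputs
    slots = selective_slot_update(slots, rng.normal(size=d), W_key, W_val)
```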
CTNet is a novel neural architecture in which computation is framed as the evolution of a persistent state rather than a sequence of successive rewrites, incorporating re-entrant memory, multi-scale coherence, and projective output.
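The summary doesn't define CTNet's components, but "evolution of a persistent state rather than successive rewrites" is suggestive of residual, ODE-like dynamics. The sketch below contrasts the two styles under that assumption; `rewrite_step`, `evolve_step`, and the step size `dt` are illustrative names and choices, not CTNet's actual API.

```python
import numpy as np

def rewrite_step(h, x, W):
    """Rewrite-style layer: the new state fully replaces the old one."""
    return np.tanh(W @ np.concatenate([h, x]))

def evolve_step(h, x, W, dt=0.1):
    """Evolution-style layer: the persistent state is nudged rather
    than replaced, resembling one Euler step of an ODE. Illustrative
    only; CTNet's actual update rule is not given in the summary."""
    return h + dt * np.tanh(W @ np.concatenate([h, x]))

rng = np.random.default_rng(0)
d = 8
W = rng.normal(size=(d, 2 * d)) / np.sqrt(2 * d)
h = np.zeros(d)
for _ in range(5):
    h = evolve_step(h, rng.normal(size=d), W)  # state persists across steps
```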
OpenAI introduces the Sparse Transformer, a deep neural network that reduces attention complexity from O(N²) to O(N√N), enabling modeling of sequences 30x longer than previously possible across text, images, and audio. The model uses sparse attention patterns and gradient checkpointing to train networks up to 128 layers deep, achieving state-of-the-art performance across multiple domains.
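The strided sparsity pattern from the Sparse Transformer paper is straightforward to illustrate: each position attends to a local window of the last l positions plus every l-th earlier position, so with l ≈ √N each row of the attention matrix has O(√N) nonzeros, giving the O(N√N) total cost. A minimal NumPy sketch of the mask follows; the function name and toy sizes are assumptions for illustration, not OpenAI's code.

```python
import numpy as np

def strided_sparse_mask(n: int, stride: int) -> np.ndarray:
    """Boolean mask for a strided sparse attention pattern.

    Position i may attend to position j (j <= i) if j is among the
    last `stride` positions (local window) or if (i - j) is a multiple
    of `stride` (strided connections). With stride ~ sqrt(n), each row
    has O(sqrt(n)) nonzeros instead of O(n).
    """
    i = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    causal = j <= i
    local = (i - j) < stride           # recent positions
    strided = (i - j) % stride == 0    # every stride-th earlier position
    return causal & (local | strided)

mask = strided_sparse_mask(n=16, stride=4)
print(mask.astype(int))               # 1 = attend, 0 = masked out
```

In the paper this pattern is split across two attention heads (one local, one strided) rather than applied as a single mask, but the connectivity it encodes is the same.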