mamba

#mamba

Agentic Discovery of Neural Architectures: AIRA-Compose and AIRA-Design

Hugging Face Daily Papers ↗ · 3d ago Cached

This paper introduces AIRA-Compose and AIRA-Design, two frameworks in which AI agents autonomously discover neural architectures that outperform standard Transformers and scale efficiently.

#mamba

Prediction Bottlenecks Don't Discover Causal Structure (But Here's What They Actually Do)

Hugging Face Daily Papers ↗ · 2026-05-09 Cached

This paper challenges the claim that prediction bottlenecks in models like Mamba recover causal structure, demonstrating through a new benchmark that gains are largely due to confounds and robustness artifacts rather than true causal discovery.

#mamba

@no_stp_on_snek: first receipts: triattention v3 evicts safely with longctx. ✓HIT every rung 32k → 256k on qwen3.5-2b-4bit (hybrid mamba…

X AI KOLs Following ↗ · 2026-05-08

Introduces triattention v3, a new attention mechanism that enables safe eviction without recall loss for long-context inference, demonstrated on a hybrid mamba+attention model up to 256k tokens.
