causal-steering


@Propriocetive: New preprint: Mathematics is All You Need 2 — Sign-Stabilized Behavioral Fibers in Transformer Residual Streams. Headli…

X AI KOLs Following · 2026-05-10

A new preprint, 'Mathematics is All You Need 2', presents a 'Two-Channel theorem' stating that behavioral fibers in transformer residual streams are sign-stabilized and causally steerable across architectures (Qwen to Llama). The authors claim high reproducibility and report that the behavioral substrate is near-one-dimensional, which separates generation from latent structure.


SLAM: Structural Linguistic Activation Marking for Language Models

arXiv cs.CL · 2026-05-08

SLAM is a white-box watermarking scheme that uses sparse autoencoders to embed marks in the structural geometry of LLM residual streams. It reports 100% detection accuracy with minimal quality loss on Gemma-2 models, and it avoids the token-distribution biasing that prior methods rely on.
