Tag
This post reports an observation that reading a long, structured text before answering alters a model's later responses, with behavioral evidence from Claude and mechanistic analysis on open-weight Gemma models showing separable hidden states and sharper probability distributions in instruction-tuned variants.