Tag
This paper uses sparse autoencoders to decompose LLM representations into interpretable features and shows that semantic features explain brain alignment with cortical semantic topography, a result that generalizes across English, Chinese, and French.
This paper investigates brain-LLM alignment across English, Chinese, and French using fMRI data and multiple LLMs, finding that training-language dominance and typological distance, not an inherent English advantage, drive alignment patterns.