Tag
This paper investigates how sentence encoders represent concepts, viewed through the lens of compositional semantics, and identifies four principles: fine-tuning recalibrates latent geometry, semantic signal concentrates in the final layer, hard negatives improve discrimination but not ranking, and supervision effectiveness depends on composition type.
Dwarkesh Patel tweets about Sergey Levine's argument that emergent capabilities in LLMs arise from compositionality, not just from training data.
This paper reframes model collapse in LLMs as a cultural transmission phenomenon, showing that iterated learning theory predicts a non-monotonic trajectory of compositionality under self-training, confirmed across multiple languages and models.
This paper proposes a Polar Probe that linearly recovers semantic structures from LLM activations by representing entity relations through distance and direction in a learned subspace. Tests across arithmetic, visual scenes, family trees, metro maps, and social interactions show that this relational code emerges in middle layers, generalizes to new entities, and causally influences model predictions.