Tag
The author shares pitfalls from building a shared decision log for AI agent teams, including race conditions exposed by faster models, unreliable contradiction detection with cosine similarity, and challenges in testing multi-agent promises.
This paper demonstrates that cosine similarity is a poor proxy for assessing layer importance in LLMs, and proposes using the actual accuracy drop from layer removal as a more robust metric.
This paper demonstrates that mean-pooled cosine similarity is not length-invariant under anisotropic representations: similarity is artificially inflated as sequence length grows. It argues for using Centered Kernel Alignment (CKA) as a default metric to correct biases in cross-lingual and cross-representation analysis.
A model on Replicate that outputs CLIP ViT-L/14 features for text and images, allowing similarity computation between inputs.
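As a rough illustration of the similarity computation mentioned in the last entry, here is a minimal sketch of cosine similarity between two feature vectors. The helper name `cosine_similarity` and the toy vectors are assumptions for illustration only; they are not real CLIP ViT-L/14 outputs and not part of the Replicate model's API.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy stand-ins for a text feature and an image feature (hypothetical values).
text_features = [0.2, 0.7, 0.1]
image_features = [0.25, 0.6, 0.05]
score = cosine_similarity(text_features, image_features)
```

The score lies in [-1, 1]. As the other entries in this list caution, a high value is not by itself evidence of semantic agreement.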