high-frequency-tokens


Your UnEmbedding Matrix is Secretly a Feature Lens for Text Embeddings

Hugging Face Daily Papers · 2026-06-05

The paper finds that LLM text embeddings over-express high-frequency, uninformative tokens. It proposes EmbedFilter, a linear transformation that filters out the subspace associated with those tokens, which improves the semantic representations and also enables dimensionality reduction.
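
As a rough illustration of what "filtering out a subspace with a linear transformation" can look like, the sketch below builds an orthogonal projector in NumPy. It is not the paper's EmbedFilter implementation: the random `unembed` matrix, the `frequent_ids` list, and the `build_filter` helper are all hypothetical, and using the unembedding rows of frequent tokens as the directions to remove is an assumption suggested by the title.

```python
import numpy as np


def build_filter(directions: np.ndarray, tol: float = 1e-10):
    """Build a linear filter that removes the span of `directions`.

    directions: (k, d) array, one row per direction to suppress.
    Returns (projector, kept):
      projector: (d, d) symmetric matrix that zeroes the unwanted subspace.
      kept:      (d - r, d) orthonormal basis of the remaining subspace,
                 where r is the numerical rank of `directions`.
    """
    # Full SVD: the rows of vt form an orthonormal basis of R^d.
    # The first r rows span the row space of `directions`.
    _, s, vt = np.linalg.svd(directions, full_matrices=True)
    rank = int(np.sum(s > tol * s.max())) if s.size else 0
    kept = vt[rank:]            # basis orthogonal to every direction
    projector = kept.T @ kept   # same as I - B^T B for the removed basis B
    return projector, kept


# Toy data; every value here is made up for illustration.
rng = np.random.default_rng(0)
d = 8
unembed = rng.normal(size=(50, d))   # hypothetical (vocab, d) unembedding matrix
frequent_ids = [0, 1, 2]             # hypothetical high-frequency token ids
directions = unembed[frequent_ids]   # assumed directions to filter out

projector, kept = build_filter(directions)

embeddings = rng.normal(size=(4, d))  # stand-in text embeddings
filtered = embeddings @ projector     # same dimension, subspace removed
reduced = embeddings @ kept.T         # lower-dimensional coordinates
```

Here `filtered` keeps the original dimensionality with the chosen directions zeroed out, while `reduced` expresses the same information in `d - r` coordinates, which is one simple way a filter of this kind can double as dimensionality reduction.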
