frequency-bias

#frequency-bias

@vintcessun: Turns out LLM text embeddings are hijacked by high-frequency tokens (periods, articles)! The unembedding matrix implicitly defines a low-rank subspace dominated by these uninformative expressions. This is the root cause of LLMs' poor performance as universal embeddings, and the contamination is subtle. EmbedFilter…

X AI KOLs Timeline · 3d ago

This study finds that LLM text embeddings are hijacked by high-frequency tokens (e.g., periods, articles) and proposes EmbedFilter, which performs SVD on the unembedding matrix and subtracts each embedding's projection onto the resulting subspace to recover the underlying semantics. The method requires no training and is reported to yield dimensionality reduction and retrieval efficiency gains.
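The projection-removal idea in this summary can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `remove_subspace`, the choice of the top `k` right singular vectors as the subspace to remove, the value of `k`, and the random stand-in matrices are all assumptions made for illustration only.

```python
import numpy as np


def remove_subspace(embeddings, unembedding, k):
    """Remove from each embedding its projection onto the span of the
    top-k right singular vectors of the unembedding matrix.

    Assumption: the subspace to filter out is spanned by the top-k
    singular directions; the summary does not state how it is chosen.
    """
    # unembedding has shape (vocab_size, hidden_dim).
    _, _, vt = np.linalg.svd(unembedding, full_matrices=False)
    basis = vt[:k]  # (k, hidden_dim), rows are orthonormal
    # Projection onto the subspace, then subtraction.
    projection = (embeddings @ basis.T) @ basis
    return embeddings - projection


# Random stand-ins for a real model's matrices (hypothetical sizes).
rng = np.random.default_rng(0)
W = rng.normal(size=(50, 8))  # stand-in unembedding matrix
E = rng.normal(size=(4, 8))   # stand-in text embeddings
filtered = remove_subspace(E, W, k=2)
```

After filtering, the embeddings have no component left along the removed directions, so applying the function a second time changes nothing. Any retrieval benefit would depend on the real model's unembedding matrix and on how the subspace is selected, neither of which this sketch reproduces.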

#frequency-bias

Frequency Bias and OOD Generalization in Neural Operators under a Variable-Coefficient Wave Equation

Hugging Face Daily Papers · 2026-05-13

This paper investigates how Fourier Neural Operators (FNO) and Deep Operator Networks (DeepONet) generalize under distribution shifts in a variable-coefficient wave equation. It finds that FNO struggles with high-frequency inputs, while DeepONet shows milder degradation.
