Tag
Analyzes the 64-dimensional embedding manifold of Google AlphaEarth across 12.1M U.S. samples, finds non-Euclidean structure and poor support for vector arithmetic, then builds an agentic system with geometry-aware tools that outperforms parametric baselines on environmental queries.
Systematic study shows that LLM-based dense retrievers outperform BERT baselines under typos and poisoning but remain vulnerable to semantic perturbations, and that embedding geometry predicts robustness.