efficient-retrieval

#efficient-retrieval

Do All Visual Tokens Matter Equally? Object-Evidence Preserving Token Merging for Vision-Language Retrieval

Hugging Face Daily Papers · 2026-07-06

Introduces SaMer, an object-aware token merging framework that compresses image-side tokens for vision-language retrieval while preserving object-level evidence, achieving significant storage reduction and improved retrieval performance.


@cosimorulli1: Happy to share that our recent work, TACHIOM, got integrated into the PyLate ecosystem! https://arxiv.org/pdf/2604.2814…

X AI KOLs Following · 2026-06-11

TACHIOM, a multivector retrieval system with token-aware clustering and hierarchical indexing, has been integrated into the PyLate ecosystem. It achieves up to 247x faster clustering and 9.8x retrieval speedup over state-of-the-art systems while maintaining comparable effectiveness.


@vintcessun: Compressing 10 million vectors from 31GB to 4GB, with search even faster than FAISS — sounds crazy, but Turbovec actually did it. The core is Google's TurboQuant data-independent quantization: no training, no parameter tuning, just add vectors and index. Handwritten NEON/AVX-512 implementations are genuinely 12-20% faster, supporting filtered search by ID, saving a ton of post-processing hassle. Rust under the hood + pip install, minimal maintenance cost.

X AI KOLs Timeline · 2026-06-07

Turbovec, based on Google's TurboQuant algorithm, compresses 10 million vectors from 31GB to 4GB, searches 12-20% faster than FAISS, supports filtered search by ID, and ships as a Rust implementation with a Python package.
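As a rough sanity check on the figures quoted above, the per-vector storage can be worked out directly. This is only an illustrative sketch: the totals (10 million vectors, 31GB, 4GB) come from the summary, while the 768-dimension float32 layout is an assumption made here to see whether the numbers are plausible, not something the source states.

```python
# Figures quoted in the summary above.
NUM_VECTORS = 10_000_000
RAW_GB = 31
COMPRESSED_GB = 4


def bytes_per_vector(total_gb: float, n: int) -> float:
    """Average bytes per vector, taking 1 GB = 10**9 bytes."""
    return total_gb * 1e9 / n


raw_bpv = bytes_per_vector(RAW_GB, NUM_VECTORS)                # 3100.0 bytes
compressed_bpv = bytes_per_vector(COMPRESSED_GB, NUM_VECTORS)  # 400.0 bytes
ratio = RAW_GB / COMPRESSED_GB                                 # 7.75x smaller

# ASSUMPTION (not from the source): 768-dimensional float32 embeddings.
ASSUMED_DIM = 768
assumed_raw_gb = NUM_VECTORS * ASSUMED_DIM * 4 / 1e9     # 4 bytes per float32
assumed_bits_per_dim = compressed_bpv * 8 / ASSUMED_DIM  # compressed bits per dimension

print(f"raw: {raw_bpv:.0f} B/vector, compressed: {compressed_bpv:.0f} B/vector")
print(f"compression ratio: {ratio:.2f}x")
print(f"768-d float32 would need {assumed_raw_gb:.2f} GB raw")
print(f"implied budget: {assumed_bits_per_dim:.2f} bits per dimension")
```

Under that assumed layout the raw size comes out close to the quoted 31GB, and the compressed size corresponds to roughly 4 bits per dimension including any overhead. The actual dimensionality and bit width used in the benchmark may differ.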


@lateinteraction: Late-interaction sparse retrieval? With neuron-level inverted indexing, on top of unsupervised sparse autoencoders. Wor…

X AI KOLs Timeline · 2026-05-30

This paper presents a single-stage sparse coding method that uses unsupervised sparse autoencoders and neuron-level inverted indexing to accelerate multi-vector retrieval, outperforming traditional k-means-based approaches.


SimpleMem: Efficient Lifelong Memory for LLM Agents

Papers with Code Trending · 2026-01-05

Introduces SimpleMem, an efficient memory framework for LLM agents that uses semantic lossless compression to improve accuracy and reduce token consumption, achieving a 26.4% F1 improvement and up to a 30x reduction in inference-time token usage.
