efficient-retrieval

#efficient-retrieval

@vintcessun: Compressing 10 million vectors from 31GB to 4GB while searching faster than FAISS sounds crazy, but Turbovec actually did it. The core is Google's TurboQuant, a data-independent quantization scheme: no training and no parameter tuning, just add vectors and index. Its hand-written NEON/AVX-512 kernels are 12-20% faster than FAISS, and it supports filtered search by ID, which saves a lot of post-processing hassle. Rust under the hood plus pip install keeps maintenance cost minimal.

X AI KOLs Timeline ↗ · 2d ago Cached

Turbovec, based on Google's TurboQuant algorithm, compresses 10 million vectors from 31GB to 4GB, with search speed 12-20% faster than FAISS, supports filtered search, and offers a Rust implementation with a Python package.
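
As a rough sanity check on the 31GB to 4GB figure, the arithmetic can be sketched as below. The vector dimension (768) and the quantized width (4 bits per dimension) are assumptions chosen for illustration; neither is stated in the post, and real indexes carry some extra per-vector overhead.

```python
def index_size_gb(num_vectors: int, dim: int, bits_per_dim: int) -> float:
    """Raw storage for the vector payload only, in decimal gigabytes."""
    total_bits = num_vectors * dim * bits_per_dim
    return total_bits / 8 / 1e9


NUM_VECTORS = 10_000_000
DIM = 768            # assumed embedding dimension (not given in the post)
FLOAT32_BITS = 32    # uncompressed float32 storage
QUANTIZED_BITS = 4   # assumed quantized width (not given in the post)

full_gb = index_size_gb(NUM_VECTORS, DIM, FLOAT32_BITS)
quantized_gb = index_size_gb(NUM_VECTORS, DIM, QUANTIZED_BITS)
ratio = full_gb / quantized_gb

print(f"float32: {full_gb:.2f} GB, quantized: {quantized_gb:.2f} GB, ratio: {ratio:.0f}x")
```

Under these assumptions the payload shrinks from about 30.7 GB to about 3.8 GB, which is in line with the rounded figures quoted above.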


#efficient-retrieval

@lateinteraction: Late-interaction sparse retrieval? With neuron-level inverted indexing, on top of unsupervised sparse autoencoders. Wor…

X AI KOLs Timeline ↗ · 2026-05-30 Cached

This paper presents a single-stage sparse coding method using unsupervised sparse autoencoders and natural inverted indexing to accelerate multi-vector retrieval, outperforming traditional k-means based approaches.
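
To make the idea of inverted indexing over sparse codes concrete, here is a toy sketch of late-interaction (MaxSim) scoring. It is not the paper's implementation: the data layout, the function names `build_index` and `search`, and the example weights are all hypothetical, and the sparse codes are assumed to be non-negative (as ReLU-style autoencoder activations would be).

```python
from collections import defaultdict


def build_index(docs):
    """docs maps doc_id -> list of per-token sparse codes {neuron_id: weight}.

    Returns an inverted index: neuron_id -> [(doc_id, token_pos, weight), ...].
    """
    index = defaultdict(list)
    for doc_id, tokens in docs.items():
        for tok_pos, code in enumerate(tokens):
            for neuron, weight in code.items():
                index[neuron].append((doc_id, tok_pos, weight))
    return index


def search(index, query_tokens, top_k=3):
    """Score docs by summing, over query tokens, the best-matching doc token."""
    scores = defaultdict(float)
    for q_code in query_tokens:
        # Sparse dot products with every doc token that shares an active neuron.
        dots = defaultdict(float)
        for neuron, q_weight in q_code.items():
            for doc_id, tok_pos, d_weight in index.get(neuron, ()):
                dots[(doc_id, tok_pos)] += q_weight * d_weight
        # MaxSim: keep only the best doc token per document (weights assumed >= 0).
        best = {}
        for (doc_id, _), dot in dots.items():
            if dot > best.get(doc_id, 0.0):
                best[doc_id] = dot
        for doc_id, value in best.items():
            scores[doc_id] += value
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


# Hypothetical toy corpus: two documents, two tokens each.
docs = {
    "a": [{1: 1.0, 4: 0.5}, {2: 2.0}],
    "b": [{4: 1.0}, {7: 1.0}],
}
query = [{1: 2.0}, {4: 2.0}]

results = search(build_index(docs), query)
print(results)
```

Only postings for neurons active in the query are touched, which is what lets an inverted index skip most of the corpus.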


#efficient-retrieval

SimpleMem: Efficient Lifelong Memory for LLM Agents

Papers with Code Trending ↗ · 2026-01-05 Cached

Introduces SimpleMem, an efficient memory framework for LLM agents that uses semantic lossless compression to improve accuracy and reduce token consumption, achieving 26.4% F1 improvement and up to 30x reduction in inference-time token usage.

