Garry Tan highlights retrieval as the key bottleneck in voice AI and introduces Moss, an open-source tool that achieves sub-10ms vector search, alongside a hackathon at the YC office on June 6-7.
The author argues that most agent RAG failures stem from retrieval problems rather than the LLM, specifically chunking errors, missing freshness signals, and reliance on pure vector search. The recommended fixes are structural chunking, decay-based ranking, and hybrid BM25+vector search.
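To make the ranking part of that recommendation concrete, here is a minimal sketch of hybrid BM25+vector scoring with a freshness decay. It is an illustration under stated assumptions, not the author's implementation: the function names, the 50/50 weighting (`alpha`), the 30-day half-life, and the toy two-dimensional embeddings are all hypothetical, and structural chunking is not covered.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_rank(query_tokens, query_vec, docs, alpha=0.5, half_life_days=30.0):
    """Rank documents by a weighted lexical+semantic score times a freshness decay.

    alpha and half_life_days are illustrative defaults, not tuned values.
    """
    lexical = bm25_scores(query_tokens, [d["tokens"] for d in docs])
    top = max(lexical) or 1.0  # avoid dividing by zero when nothing matches
    ranked = []
    for d, lex in zip(docs, lexical):
        relevance = alpha * (lex / top) + (1 - alpha) * cosine(query_vec, d["vec"])
        decay = 0.5 ** (d["age_days"] / half_life_days)  # halves every half-life
        ranked.append((relevance * decay, d["id"]))
    return [doc_id for _, doc_id in sorted(ranked, reverse=True)]


docs = [
    {"id": "old", "tokens": ["reset", "password"], "vec": [1.0, 0.0], "age_days": 60},
    {"id": "new", "tokens": ["reset", "password"], "vec": [1.0, 0.0], "age_days": 1},
    {"id": "other", "tokens": ["billing", "invoice"], "vec": [0.0, 1.0], "age_days": 1},
]
print(hybrid_rank(["reset", "password"], [1.0, 0.0], docs))
# prints ['new', 'old', 'other']
```

The two matching documents are identical apart from age, so the decay term alone puts the fresher one first, while the unrelated document stays last despite being recent. A multiplicative decay is only one option; an additive freshness bonus would penalize old but authoritative documents less.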
This repository reports compressing 201GB of text down to 6GB with no accuracy loss, a footprint it describes as 97% smaller than vector databases. It runs locally, offers a drop-in MCP for Claude, and is fully open source and private.