Tag
An article investigates platform support for GNU extensions to Basic Regular Expressions (BRE), specifically the `\+` operator, and finds that it works on FreeBSD, macOS, and musl-based distributions like Chimera Linux.
The article discusses the challenges of regex portability across tools like sed, awk, grep, and Emacs, and provides a subset of regex features that work reliably across these environments.
LlamaIndex introduces agentic retrieval with LlamaParse Index, combining semantic search and grep for flexible agent harnesses. A webinar on June 30th will demonstrate these tools.
A Twitter thread argues that the AI industry's expensive investment in vector retrieval systems for RAG may be unnecessary, as the 52-year-old terminal command 'grep' outperforms modern semantic search for exact matches in AI agent contexts.
Researchers report that the classic grep command outperforms modern vector databases in retrieval tasks for autonomous AI agents, challenging the prevailing RAG infrastructure approach.
The post describes improving agentic memory search by adding grep-based exact matching alongside vector embeddings, an approach inspired by a paper, and reports significant recall gains in the author's memory layer.
This empirical study compares grep and vector retrieval strategies in LLM agent workflows, finding that grep generally yields higher accuracy across different agent harnesses and tool-calling styles, with performance heavily dependent on harness choice and context engineering.
BohuTANG found through quantitative analysis that adding dedicated grep and glob functions to the pi tool reduces detours through LLM-generated bash commands and improves efficiency.
GrepSeek trains LLM search agents to interact directly with a text corpus through shell commands like grep. Its two-stage training pipeline (cold-start dataset construction and GRPO refinement) achieves strong F1 and Exact Match scores on open-domain QA benchmarks.
Jerry Liu asks about the current state-of-the-art for file search and retrieval, listing options from grep to hybrid search over a database.
A recent paper investigates whether grep outperforms vector search for agentic retrieval, finding that grep yields higher accuracy in conversational memory tests, though limitations around enterprise document corpora are noted.
A research paper from PwC finds that grep-style text search, when properly integrated into agent harnesses, can match or beat embedding-based retrieval for coding-agent tasks, suggesting vector databases may not be essential for many use cases.
DCI (Direct Corpus Interaction) proposes using simple terminal tools like grep and bash for agentic search, and reports outperforming traditional retrieval methods without embeddings or vector indexes.