@akshay_pachaar: Naive RAG vs. Blockify! There's a new RAG approach that: - cuts corpus size by 40x. - reduces tokens per query by 3x. - improves vector search relevance by 2.3x. Blockify GitHub: https://github.com/iternal-technologies-partners/blockify-agentic-data-optimization
Summary
Blockify is a new open-source RAG framework that replaces naive chunking with a patented 'IdeaBlocks' pipeline, claiming 40x corpus size reduction, 3x token efficiency, and 2.3x vector search accuracy improvements. It transforms enterprise documents into structured XML knowledge units for more coherent LLM retrieval.
Source: https://github.com/iternal-technologies-partners/blockify-agentic-data-optimization
Transform messy enterprise content into compact, validated knowledge units optimized for AI
Patented data ingestion, distillation, and governance pipeline. IdeaBlocks replace naive chunking with structured, deduplicated, LLM-ready knowledge.
78X Aggregate Performance · 2.29X Vector Search Accuracy · 29.93X Distillation · 3.09X Token Efficiency · 40X Size Reduction
What is Blockify?
Traditional Retrieval-Augmented Generation (RAG) pipelines split documents into fixed-size chunks, then hope that vector similarity will surface the right context. It rarely does. Chunks break mid-sentence, duplicate content inflates token bills, and hallucinations slip through because the LLM is reasoning over fragments rather than facts.
Blockify replaces naive chunking with a patented ingestion and distillation pipeline that transforms raw enterprise content into IdeaBlocks — structured, semantically complete XML knowledge units. Every IdeaBlock carries its own question, trusted answer, tags, entities, and keywords. Similar blocks are deduplicated and merged so the knowledge base stays compact, coherent, and governable.
An IdeaBlock is the smallest unit of curated knowledge: a self-contained XML unit with a name, critical_question, trusted_answer, tags, entity, and keywords. Unlike fixed-size chunks, IdeaBlocks preserve full semantic coherence.
How It Works
- Ingest — Documents (SharePoint, Confluence, Git, local docs) are parsed and transformed into structured IdeaBlocks via the Blockify API
- Distill — Similar blocks are clustered with embeddings + LSH, then merged by the distill LLM to eliminate duplicates while preserving distinct facts
- Retrieve — Optimized blocks are stored in vector databases (ChromaDB, Pinecone, Cloudflare Vectorize, Neo4j) for high-accuracy RAG retrieval
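To make the Distill step concrete, here is a minimal Python sketch of similarity-based deduplication. It is illustrative only: toy hand-written vectors and plain cosine similarity stand in for real embeddings and LSH, and grouping into clusters stands in for the LLM merge.

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def cluster_blocks(blocks, embeddings, threshold=0.9):
    """Greedy single-pass clustering: each block joins the first
    cluster whose representative vector is within `threshold`."""
    clusters = []  # list of (representative_embedding, [block, ...])
    for block, emb in zip(blocks, embeddings):
        for rep, members in clusters:
            if cosine(rep, emb) >= threshold:
                members.append(block)
                break
        else:
            clusters.append((emb, [block]))
    return [members for _, members in clusters]

# Toy data: two near-duplicate blocks and one distinct block.
blocks = ["Blockify reduces corpus size.",
          "Blockify shrinks the corpus.",
          "Helm chart deploys the service."]
embeddings = [[0.9, 0.1, 0.0], [0.88, 0.12, 0.0], [0.0, 0.1, 0.95]]

clusters = cluster_blocks(blocks, embeddings)
print(len(clusters))  # the two near-duplicates collapse into one cluster
```

In the real pipeline, each resulting cluster would then be passed to the distill model, which synthesizes one canonical IdeaBlock per cluster.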
Features
- Semantic Ingestion — Transforms raw text into structured IdeaBlocks with question/answer alignment, entities, and tags
- Intelligent Distillation — Deduplicates and merges similar blocks using embeddings, LSH clustering, and LLM synthesis
- 40X Compression — Reduces enterprise datasets to ~2.5% of original size while preserving 99%+ information fidelity
- 2.29X Search Accuracy — IdeaBlocks dramatically outperform naive chunks in vector similarity retrieval
- Production-Ready Service — Docker, Helm, Prometheus metrics, OpenTelemetry tracing, health checks
- Claude Code Skill — First-class integration as a Claude Code skill for developer workstations
- Pluggable Storage — SQLite, PostgreSQL, Redis, or filesystem backends for the distillation service
- Benchmark Suite — Built-in benchmarking with HTML reports to quantify ROI on your own data
Repository Structure
This repository contains two deployable components plus comprehensive technical documentation:
| Component | Description | Path |
|---|---|---|
| Distillation Service | FastAPI microservice for IdeaBlock deduplication and merging | blockify-distillation-service/ |
| Claude Code Skill | Skill package for document ingestion, distillation, semantic search, and benchmarks | blockify-skill-for-claude-code/ |
| Documentation | Technical guides covering architecture, API, setup, research, and 12 platform integrations | documentation/ |
Full directory tree
blockify-agentic-data-optimization/
├── blockify-distillation-service/ FastAPI microservice
│ ├── app/ Source (api, service, dedupe, llm, db)
│ ├── tests/ Pytest suite
│ ├── helm/ Kubernetes Helm chart
│ ├── Dockerfile
│ ├── docker-compose.yml
│ └── requirements.txt
├── blockify-skill-for-claude-code/ Claude Code skill package
│ └── skills/blockify-integration/
│ ├── SKILL.md Skill definition
│ ├── scripts/ Ingest, distill, search, benchmark
│ ├── references/ API, schema, distillation docs
│ └── tests/
├── documentation/ Technical guides + platform integrations
│ ├── BLOCKIFY-DEEP-DIVE.md
│ ├── BLOCKIFY-API-REFERENCE.md
│ ├── ARCHITECTURE-END-TO-END.md
│ ├── IDEABLOCK-STRUCTURE.md
│ ├── GETTING-STARTED-GUIDE.md
│ ├── LOCAL-VECTOR-DATABASE-SETUP.md
│ ├── DISTILLATION-SERVICE.md
│ ├── CLAUDE-CODE-BLOCKIFY-SKILL.md
│ ├── OPENCLAW-RAG-INTEGRATION.md
│ ├── RAG-AGENTIC-SEARCH-RESEARCH.md
│ └── integrations/ 12 platform integration guides
│ ├── BLOCKIFY-OBSIDIAN.md
│ ├── BLOCKIFY-LLAMAINDEX.md
│ ├── BLOCKIFY-LANGCHAIN.md
│ ├── BLOCKIFY-N8N.md
│ ├── BLOCKIFY-ELASTIC.md
│ ├── BLOCKIFY-SUPABASE.md
│ ├── BLOCKIFY-STARBURST.md
│ ├── BLOCKIFY-KIBANA.md
│ ├── BLOCKIFY-CLOUDFLARE.md
│ ├── BLOCKIFY-MILVUS.md
│ ├── BLOCKIFY-ZILLIZ.md
│ └── BLOCKIFY-UNSTRUCTURED.md
├── assets/images/ README visuals + AI image prompts
├── CONTRIBUTING.md
├── SECURITY.md
├── LICENSE
└── README.md
Quick Start
New to Blockify? Sign up for free to get $1,000 in API credits, then pick the path below that fits your workflow.
Path 1 — Test the API in 30 seconds
curl --location 'https://api.blockify.ai/v1/chat/completions' \
--header 'Authorization: Bearer YOUR_API_KEY' \
--header 'Content-Type: application/json' \
--data '{
"model": "ingest",
"messages": [{"role": "user", "content": "Your text to process here"}],
"max_tokens": 8000,
"temperature": 0.5
}'
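The same request in Python, as a standard-library-only sketch. It builds the request object with the exact payload from the curl call above but does not send it; substitute a real key for YOUR_API_KEY before calling urlopen.

```python
import json
import urllib.request

API_URL = "https://api.blockify.ai/v1/chat/completions"

def build_ingest_request(text, api_key):
    """Build (but do not send) the ingest request shown above."""
    payload = {
        "model": "ingest",
        "messages": [{"role": "user", "content": text}],
        "max_tokens": 8000,
        "temperature": 0.5,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_ingest_request("Your text to process here", "YOUR_API_KEY")
# To send: response = urllib.request.urlopen(req)  (requires a valid API key)
print(req.full_url)
```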
Path 2 — Claude Code Skill (5 minutes)
git clone https://github.com/iternal-technologies-partners/blockify-agentic-data-optimization.git
cd blockify-agentic-data-optimization/blockify-skill-for-claude-code/skills/blockify-integration
pip install -r requirements.txt
python3 scripts/setup_check.py
# Ingest your docs, distill, and run semantic search
python3 scripts/run_full_pipeline.py --source ./my-docs
See blockify-skill-for-claude-code/ for full skill documentation.
Path 3 — Distillation Service (Docker)
cd blockify-distillation-service
cp .env.example .env
# Edit .env with your BLOCKIFY_API_KEY and OPENAI_API_KEY
docker-compose up -d
# Verify
curl http://localhost:8315/healthz
Path 4 — Deploy with Helm (Kubernetes)
cd blockify-distillation-service/helm/blockify-distillation
# Edit values.yaml with your config
helm install blockify-distill . --namespace blockify --create-namespace
kubectl get pods -n blockify
kubectl port-forward svc/blockify-distill 8315:8315 -n blockify
Supports Prometheus ServiceMonitor, PVC for SQLite persistence, Ingress, and Secret management out of the box.
Enterprise Edition
The open-source components in this repository are a fully capable starting point. For production workloads at enterprise scale, Blockify Enterprise provides productionized containers with significantly expanded capabilities.
What Enterprise Adds on Top of Open Source
| Capability | Open Source | Enterprise |
|---|---|---|
| IdeaBlock ingestion & distillation API | Yes | Yes |
| Self-hosted distillation microservice | Yes | Yes (hardened, pre-built containers) |
| Advanced distillation algorithms (hierarchical, multi-pass, domain-tuned) | — | Yes |
| Automated ingestion pipelines (scheduled connectors for SharePoint, Confluence, Drive, S3, Git) | — | Yes |
| Enterprise connectors & parsers (PDF, DOCX, PPTX, HTML, Markdown, structured data) | Basic | Full suite |
| Role-based access control & audit logging | — | Yes |
| Governance dashboard & content lifecycle management | — | Yes |
| Priority support & SLAs | — | Yes |
| Air-gapped / on-prem deployment (AirgapAI) | — | Yes |
| Professional services & implementation support | — | Yes |
Who is Enterprise for? Teams ingesting millions of documents, running regulated workloads, needing automated refresh pipelines, or deploying in air-gapped environments.
API Models
The Blockify API exposes three models via an OpenAI-compatible chat completions endpoint:
| Model | API Name | Use Case |
|---|---|---|
| Blockify Ingest | ingest | Convert raw text to IdeaBlocks |
| Blockify Distill | distill | Merge and deduplicate similar IdeaBlocks |
| Technical Manual Ingest | technical-ingest | Ordered content (manuals, procedures, runbooks) |
See BLOCKIFY-API-REFERENCE.md for full endpoint documentation.
The IdeaBlock Format
Every IdeaBlock is a self-contained XML knowledge unit:
<ideablock>
<name>Blockify Overview</name>
<critical_question>What is Blockify?</critical_question>
<trusted_answer>Blockify is an agentic data optimization pipeline that converts unstructured enterprise content into compact, deduplicated XML IdeaBlocks to improve retrieval accuracy and reduce token usage in RAG and LLM workflows.</trusted_answer>
<tags>RAG, DATA_OPTIMIZATION, KNOWLEDGE_MANAGEMENT</tags>
<entity>
<entity_name>BLOCKIFY</entity_name>
<entity_type>PRODUCT</entity_type>
</entity>
<keywords>blockify, ideablock, RAG, distillation, deduplication, enterprise AI</keywords>
</ideablock>
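Because IdeaBlocks are plain XML, they parse with standard tooling. A minimal sketch using Python's xml.etree on a shortened version of the example above:

```python
import xml.etree.ElementTree as ET

IDEABLOCK_XML = """\
<ideablock>
  <name>Blockify Overview</name>
  <critical_question>What is Blockify?</critical_question>
  <trusted_answer>Blockify is an agentic data optimization pipeline.</trusted_answer>
  <tags>RAG, DATA_OPTIMIZATION, KNOWLEDGE_MANAGEMENT</tags>
  <entity>
    <entity_name>BLOCKIFY</entity_name>
    <entity_type>PRODUCT</entity_type>
  </entity>
  <keywords>blockify, ideablock, RAG</keywords>
</ideablock>"""

def parse_ideablock(xml_text):
    """Extract the required IdeaBlock fields into a flat dict."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.findtext("name"),
        "critical_question": root.findtext("critical_question"),
        "trusted_answer": root.findtext("trusted_answer"),
        "tags": [t.strip() for t in root.findtext("tags").split(",")],
        "entity_name": root.findtext("entity/entity_name"),
        "entity_type": root.findtext("entity/entity_type"),
        "keywords": [k.strip() for k in root.findtext("keywords").split(",")],
    }

block = parse_ideablock(IDEABLOCK_XML)
print(block["critical_question"])  # "What is Blockify?"
```

A dict like this maps directly onto vector-DB metadata fields, with trusted_answer as the embedded text.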
Full field specification
| Field | Required | Purpose |
|---|---|---|
| name | Yes | Short human-readable title |
| critical_question | Yes | The question this block answers |
| trusted_answer | Yes | Verified answer content |
| tags | Yes | Comma-separated topical tags |
| entity / entity_name | Yes | Primary entity (product, person, concept) |
| entity / entity_type | Yes | Entity classification |
| keywords | Yes | Search keywords for retrieval |
Full schema in IDEABLOCK-STRUCTURE.md.
Performance
| Metric | Improvement | What It Means |
|---|---|---|
| Aggregate Enterprise Performance | 78X | Combined effect across the full pipeline |
| Vector Search Accuracy | 2.29X | Measurably more relevant results, fewer false matches |
| Information Distillation | 29.93X | Enterprise-wide deduplication factor |
| Token Efficiency | 3.09X | Substantial cost savings at scale |
| Dataset Size Reduction | 40X | From 100% down to ~2.5% of original |
Prove the numbers on your own data: run the built-in benchmark suite from the Claude Code skill — python3 scripts/run_benchmark.py --company "Your Company" — and get an HTML report comparing IdeaBlocks vs. traditional chunking.
Documentation
| Document | Description | Audience |
|---|---|---|
| Getting Started Guide | Step-by-step setup for any skill level | Everyone |
| Blockify Deep Dive | Complete technical understanding | All Engineers |
| IdeaBlock Structure | XML format specification | Data Engineers |
| API Reference | API endpoints and examples | Backend Engineers |
| Architecture (End-to-End) | Complete integration architecture | Architects |
| Distillation Service | Deduplication algorithm reference | Platform Engineers |
| Local Vector DB Setup | ChromaDB setup for 100k+ blocks | DevOps |
| Claude Code Skill Guide | Skill installation and usage | Claude Code users |
| OpenClaw RAG Integration | Chatbot + Blockify implementation | Full-Stack Engineers |
| RAG & Agentic Search Research | Architecture patterns research | All Engineers |
| Platform Integrations | 12 integration guides (Obsidian, LlamaIndex, LangChain, n8n, Elastic, Supabase, Starburst, Kibana, Cloudflare, Milvus, Zilliz, Unstructured.io) | All Engineers |
Integrations
Blockify sits between your document source and your retrieval / storage layer. It plugs into every major RAG framework, vector database, data platform, and workflow engine. Each guide below covers the problem Blockify solves on that stack, an architecture diagram, quick-start code, advanced patterns, and a side-by-side comparison with the platform’s default behavior.
RAG Frameworks
| Platform | Use Case | Guide |
|---|---|---|
| LlamaIndex | Drop-in NodeParser producing deduplicated TextNodes | Blockify + LlamaIndex |
| LangChain | BaseDocumentTransformer for any RAG chain or LangGraph agent | Blockify + LangChain |
Knowledge & Workflow
| Platform | Use Case | Guide |
|---|---|---|
| Obsidian | Turn a personal or team vault into a high-accuracy RAG knowledge base | Blockify + Obsidian |
| n8n | No-code HTTP node for AI workflow automation | Blockify + n8n |
Vector & Search Databases
| Platform | Use Case | Guide |
|---|---|---|
| Milvus | Self-hosted billion-scale vector DB with hybrid dense + BM25 retrieval | Blockify + Milvus |
| Zilliz Cloud | Managed Milvus with autoscaling and serverless pricing | Blockify + Zilliz Cloud |
| Elastic | Hybrid BM25 + ELSER + dense retrieval on deduplicated IdeaBlocks | Blockify + Elastic |
| Supabase | Postgres + pgvector with row-level security on IdeaBlock tags | Blockify + Supabase |
| Cloudflare | Edge-native RAG on Workers + Vectorize + R2 + Workers AI | Blockify + Cloudflare |
Data Platform & Observability
| Platform | Use Case | Guide |
|---|---|---|
| Starburst | Federated IdeaBlock generation across data-lake catalogs (Trino / Iceberg) | Blockify + Starburst |
| Kibana | Governance dashboards for knowledge-base coverage, drift, and retrieval | Blockify + Kibana |
Document Parsing
| Platform | Use Case | Guide |
|---|---|---|
| Unstructured.io | Parse PDF, DOCX, PPTX, HTML, email, images — then Blockify | Blockify + Unstructured.io |
See the full integrations index for pattern references. Don’t see your stack? Blockify exposes an OpenAI-compatible API — if your platform can POST HTTP, it can use Blockify.
Generic Pattern
Documents -> Parser -> Blockify (Ingest + Distill) -> Embeddings -> Vector DB -> LLM / Agent
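The generic pattern above can be wired up end to end as a small sketch. Every function here is an illustrative stand-in, not the real integration: a hash-derived toy vector replaces a real embedding model, a dict replaces a real vector database, and blockify() stands in for the Ingest + Distill API calls.

```python
import hashlib

def parse(raw: str) -> str:
    """Stand-in parser; real pipelines would use Unstructured.io or similar."""
    return raw.strip()

def blockify(text: str) -> dict:
    """Stand-in for the Blockify Ingest + Distill API calls."""
    return {"critical_question": f"What does this describe? ({text[:20]}...)",
            "trusted_answer": text}

def embed(text: str) -> list:
    """Toy deterministic vector derived from a hash (not semantic)."""
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in digest[:8]]

vector_db = {}  # id -> (embedding, ideablock); stand-in for a real vector DB

for i, doc in enumerate(["Blockify replaces naive chunking.",
                         "IdeaBlocks are XML knowledge units."]):
    block = blockify(parse(doc))
    vector_db[f"block-{i}"] = (embed(block["trusted_answer"]), block)

print(len(vector_db))  # 2 blocks stored
```

Swapping each stub for its real counterpart (parser, Blockify API, embedding model, vector DB) yields the production pipeline without changing the control flow.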
Claude Code — Install the skill (see Path 2) to let Claude Code ingest project documentation into local ChromaDB and perform high-accuracy semantic retrieval during development work.
Chatbot & Customer Support — Blockify-processed knowledge reduces hallucination risk and improves answer quality in production chatbots. See OPENCLAW-RAG-INTEGRATION.md for a Cloudflare Workers example.
Frequently Asked Questions
What is Blockify and how is it different from naive RAG chunking?
Blockify is a patented ingestion and distillation pipeline that replaces fixed-size text chunking with IdeaBlocks — structured XML knowledge units containing a name, critical question, trusted answer, tags, entity, and keywords. Unlike RecursiveCharacterTextSplitter (LangChain) or SentenceSplitter (LlamaIndex), Blockify is semantically aware, deduplicates across the corpus, and produces ~2.5% of the original dataset size while preserving 99%+ information fidelity.
How does Blockify improve vector search accuracy?
On real enterprise corpora, Blockify delivers 2.29X improvement in vector search precision (average-distance-to-best-match from 0.3624 to 0.1585). Because duplicates are collapsed before vectorization, the top-K neighbors are semantically distinct rather than near-duplicates of the same boilerplate.
Does Blockify work with my existing vector database?
Yes. Blockify is embedding-model and vector-database agnostic. There are dedicated integration guides for Milvus, Zilliz Cloud, Elastic, Supabase, and Cloudflare Vectorize. Pinecone, ChromaDB, Qdrant, Weaviate, and pgvector work through the LangChain and LlamaIndex adapters.
Does Blockify replace LlamaIndex or LangChain?
No — Blockify composes with LlamaIndex and LangChain. It replaces the chunking stage (NodeParser / TextSplitter) with a higher-quality transformer that produces IdeaBlock-backed nodes or documents. See Blockify + LlamaIndex and Blockify + LangChain.
How does Blockify reduce LLM token costs?
Blockify delivers 3.09X token efficiency — ~98 tokens per retrieved block vs. ~303 per traditional chunk. On 1B queries/year this translates to ~$738,000 in token cost savings. Cost reduction comes from two compounding sources: (1) 40X fewer embeddings to generate and store, and (2) denser retrieved context means fewer tokens per LLM call.
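As a sanity check on that arithmetic: the figures above are mutually consistent if one assumes a blended rate of roughly $3.60 per million tokens (an inference from the numbers, not a published price).

```python
queries_per_year = 1_000_000_000
tokens_per_chunk = 303      # tokens retrieved per query, traditional chunking
tokens_per_block = 98       # tokens retrieved per query, IdeaBlocks
price_per_million = 3.60    # assumed blended $/1M tokens (inferred, not official)

tokens_saved = queries_per_year * (tokens_per_chunk - tokens_per_block)
savings = tokens_saved / 1_000_000 * price_per_million
efficiency = tokens_per_chunk / tokens_per_block
print(f"{efficiency:.2f}x efficiency, ${savings:,.0f} saved")  # ≈ 3.09x, ≈ $738,000
```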
What’s the difference between Blockify Ingest and Blockify Distill?
Ingest converts raw text into draft IdeaBlocks. Distill clusters similar IdeaBlocks (embeddings + LSH + LLM synthesis) and merges them into canonical blocks, eliminating duplicates. A typical pipeline runs Ingest per-document then Distill across the corpus. Both are exposed as models via the same OpenAI-compatible API.
Can Blockify run air-gapped / offline?
Yes. The open-source distillation service runs fully offline with a local LLM runtime (vLLM, NVIDIA NIM, Intel OpenVINO). Blockify Enterprise ships air-gapped deployment (AirgapAI) with hardened containers for classified environments.
Is Blockify open source?
The Claude Code skill, distillation microservice, Helm chart, benchmark suite, and 12 integration guides in this repository are open source under the Blockify EULA. Blockify Enterprise adds hardened containers, scheduled connectors, RBAC, governance dashboards, and professional services — see Enterprise Edition.
How do I migrate from naive chunking to Blockify?
Three paths: (1) run the Claude Code skill against your docs for immediate local results, (2) add a single HTTP call to api.blockify.ai/v1/chat/completions in your existing pipeline, or (3) self-host the distillation service via Docker or Helm. The Getting Started Guide walks through all three.
What document formats does Blockify support?
Via Unstructured.io or native parsers: DOCX, PDF, PPTX, PNG/JPG (OCR), Markdown, HTML, email. See Blockify + Unstructured.io for the recommended parsing pipeline.
Contributing
Contributions are welcome. See CONTRIBUTING.md for development setup, code style, and the pull request process.
This project is governed by the Blockify Community License. By contributing, you agree that your contributions will be subject to these terms, including the Contribution License (Section 2.4).
Security issues should be reported privately — see SECURITY.md.
Support & Community
| Channel | Link |
|---|---|
| Enterprise Sales (productionized containers) | [email protected] |
| Technical Support | [email protected] |
| Website | iternal.ai/blockify |
| API Console | console.blockify.ai |
| GitHub Issues | Open an issue |
License
This project is licensed under the Blockify Community License. Free for developers, researchers, and companies under $1M annual revenue. Organizations exceeding $1M annual revenue require an enterprise license — contact [email protected]. See ENTERPRISE.md for details on enterprise capabilities.
Blockify, IdeaBlock, and AirgapAI are trademarks of Iternal Technologies, Inc.