@AYi_AInotes: Wow, Alibaba has directly open-sourced the vector database it has been using internally for years. The capability that Pinecone charges $70/month for, you can get for free with a single pip command. Billion-level vector recall in milliseconds without needing a separate service. From now on, those doing RAG and AI search no longer need to pay Pinecone $70 each month! The vector database that Alibaba has been running internally for years is open-sourced...
Summary
Alibaba has open-sourced Zvec, a vector database it has used internally for years. According to the post, it retrieves from billion-scale vector collections in milliseconds, embeds directly into the application process with no separate service, and is free to use. The author positions it as a replacement for paid services such as Pinecone.
Cached at: 06/21/26, 04:33 AM
Wow, Alibaba just open-sourced the vector database it uses internally. With a single pip install you get, for free, what the author equates to Pinecone's $70/month plan: billion-scale vector search in milliseconds, with no separate service to run.
From now on, anyone building RAG or AI search can skip paying Pinecone $70 every month.
The database is called Zvec. It has run inside Alibaba for years, and one pip install is all it takes to start using it at no cost.
Its three standout features:
- Billion-scale vector retrieval in milliseconds with no separate service required: it embeds directly into your application process.
- Runs everywhere: servers, desktops, even a Raspberry Pi.
- Official SDKs for all major languages. v0.5.0 adds native full-text and hybrid search, so a single query can combine vector similarity with keyword filters.
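To make "in-process" and "vector + keyword in one query" concrete, here is a minimal pure-Python sketch. It is not Zvec's API: `TinyIndex`, `Doc`, and `search` are hypothetical names invented for illustration, and the linear scan stands in for the approximate-nearest-neighbor index a real engine would use. It only shows the shape of the idea: the index is an ordinary object inside your program, and one call applies a keyword filter and ranks by vector similarity.

```python
import math
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    text: str
    vector: list


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinyIndex:
    """Hypothetical in-process index: all data lives in this Python process."""

    def __init__(self):
        self.docs = []

    def add(self, doc):
        self.docs.append(doc)

    def search(self, query_vector, keyword=None, top_k=3):
        # Keyword filter first (case-insensitive substring match),
        # then rank the survivors by cosine similarity.
        candidates = [
            d for d in self.docs
            if keyword is None or keyword.lower() in d.text.lower()
        ]
        scored = [(cosine(query_vector, d.vector), d.doc_id) for d in candidates]
        scored.sort(reverse=True)
        return scored[:top_k]


index = TinyIndex()
index.add(Doc("a", "vector database for RAG", [1.0, 0.0]))
index.add(Doc("b", "keyword search engine", [0.9, 0.1]))
index.add(Doc("c", "fried egg recipe", [0.0, 1.0]))

# Vector-only query: ranks every document by similarity.
vector_only = index.search([1.0, 0.0], top_k=2)

# Hybrid query: only documents containing "search" are ranked.
hybrid = index.search([1.0, 0.0], keyword="search", top_k=2)

print(vector_only)
print(hybrid)
```

There is no network hop and no server to deploy, which is the appeal of an embedded database. The sketch scans every document on each query, so it would not scale to billions of vectors. That is the job of the indexing inside an engine like Zvec.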
I think Alibaba just took its own production-grade tooling and gave it away to the entire industry. The infrastructure layer for AI applications now has another free, reliable option.
pip install zvec
AYi (@AYi_AInotes): Humanity still can't write down the physical equations for frying an egg. Drop an egg into a pan of hot oil: how it coagulates, spreads, and browns at the edges is something no formula describes. The physical world is full of examples like this.
And that’s exactly the ceiling of the current general AI paradigm: video generation and VLA models all learn statistical correlations at the pixel level.
Similar Articles
@HowToPrompt__: China open-sourced a vector database that destroys Pinecone, Chroma, and Weaviate. It's called Zvec, an in-process vect…
China open-sourced Zvec, an in-process vector database that runs inside apps without servers, supporting billions of vector searches in milliseconds and battle-tested at Alibaba scale.
@vintcessun: Compressing 10 million vectors from 31GB to 4GB, with search even faster than FAISS — sounds crazy, but Turbovec actually did it. The core is Google's TurboQuant data-independent quantization: no training, no parameter tuning, just add vectors and index. Handwritten NEON/AVX-512 implementations are genuinely 12-20% faster, supporting filtered search by ID, saving a ton of post-processing hassle. Rust under the hood + pip install, minimal maintenance cost.
Turbovec, based on Google's TurboQuant algorithm, compresses 10 million vectors from 31GB to 4GB, with search speed 12-20% faster than FAISS, supports filtered search, and offers a Rust implementation with a Python package.
alibaba/zvec
Alibaba releases Zvec v0.5.0, an open-source in-process vector database with new features including full-text search, hybrid retrieval, DiskANN index, and new SDKs for Go, Rust, plus a visual tool.
@MaxForAI: http://Z.ai and this ZCube paper from Tsinghua—worth a read for anyone in Infra. Many people's first reaction when talking about AI infra is still GPU, memory, quantization, and inference frameworks. But once you get into long context and Prefill-Decode separation, the network is no longer just a 'supporting role' in the data center. Every...
ZCube is a new network architecture that flattens the topology and mixes single/multi-rail access to optimize KV Cache transmission in long-context and PD separation scenarios. In the GLM-5.1 production cluster, it achieved a 33% reduction in switch/optical module costs, a 15% increase in GPU inference throughput, and a 40.6% decrease in TTFT P99.
@oragnes: Recently discovered a hardcore open-source project from Harness: pi (recently moved under earendil-works from badlogic). It is an all-in-one AI Agent infrastructure suite plus a terminal programming assistant CLI designed to backstop developers. Stop reinventing the wheel: it provides a ready-made…
Pi is an open-source AI Agent infrastructure suite and terminal programming assistant CLI. It offers a unified API to bridge differences between multiple models, supports concurrent tool calling to reduce latency, and allows developers to control the thinking budget.