Show HN: Streambed – Stream Postgres to Iceberg on S3, Supports Postgres Wire
Summary
Streambed is an open-source CDC engine that streams Postgres WAL changes to Iceberg tables on S3, with a built-in query server using DuckDB that speaks the Postgres wire protocol.
View Cached Full Text
Cached at: 05/31/26, 07:36 PM
viggy28/streambed
Source: https://github.com/viggy28/streambed
Streambed
Postgres-to-Iceberg CDC engine. Offload analytical queries from your production database without changing your application.
streambed streams WAL changes via logical replication, writes Parquet files to S3, and commits Iceberg metadata. Query the result with any Iceberg-compatible engine – or use the built-in query server, which speaks the Postgres wire protocol so you can connect with psql.
See It In Action
Same analytical query on pgbench (1M accounts, 500K history rows). Postgres on the left, Streambed on the right.

No ETL. No Spark. Just Postgres + S3.
Quick Start
# Start Postgres + MinIO locally
docker compose up -d
# Build
go build -o streambed ./cmd/streambed
# Start syncing + query server on :5433
./streambed sync \
--source-url="postgres://postgres:test@localhost:5432/postgres" \
--s3-bucket="streambed" \
--s3-endpoint="http://localhost:9000" \
--s3-prefix="test" \
--query-addr=:5433
# Query your Postgres tables via Iceberg
psql -h localhost -p 5433 -U postgres -d postgres
Run streambed sync --help for all configuration options. All flags support environment variables with STREAMBED_ prefix (e.g. STREAMBED_SOURCE_URL).
Architecture
How It Works
Postgres WAL ──▶ Decode ──▶ Buffer ──▶ Parquet ──▶ S3 ──▶ Iceberg Commit
│
DuckDB ◀──┘ (query server)
Streambed connects to Postgres as a logical replication subscriber. It decodes WAL messages (inserts, updates, deletes), buffers rows per table, and periodically flushes them as Parquet files to S3 with Iceberg metadata commits. Updates and deletes use copy-on-write merging against existing Parquet data.
A query server exposes Iceberg tables over the Postgres wire protocol using embedded DuckDB, so you can query with psql or any Postgres client.
Commands
| Command | What it does |
|---|---|
streambed sync | Main daemon. Streams WAL, writes Iceberg, optionally serves queries. |
streambed resync --table=public.users | One-shot backfill via COPY under a consistent snapshot. |
streambed query | Standalone query server (no sync). Points at existing Iceberg tables. |
streambed cleanup --table=public.users | Deletes S3 objects and state for a table. Useful before resync. |
Development
Requires Go 1.22+ and CGO (for go-duckdb and go-sqlite3).
# Build
go build -o streambed ./cmd/streambed
# Unit tests
go test ./internal/... ./config/...
# Integration tests (requires Docker)
./scripts/test-integration.sh
Integration tests use the integration build tag and run against Postgres (port 5434) and MinIO (port 9002) from test/integration/docker-compose.yml.
Similar Articles
Made a Tool to Streams Changes from Microsoft SQL Server to Apache Kafka
Athena is a lightweight Change Data Capture (CDC) tool that streams changes from Microsoft SQL Server to Apache Kafka. Built in Golang, it offers simpler setup than Debezium and automatically manages CDC configurations.
Snowflake Postgres, Lakebase, HorizonDB: Picking the Lock-In You Want
An analysis of three new Postgres-compatible cloud databases—Snowflake Postgres, Databricks Lakebase, and Azure HorizonDB—highlighting their distinct architectures and the vendor lock-in implications for enterprise data platforms.
Show HN: Honker – Postgres NOTIFY/LISTEN Semantics for SQLite
Honker is a SQLite extension that adds PostgreSQL-style NOTIFY/LISTEN semantics, enabling durable pub/sub and task queues without external brokers or polling.
Launch HN: Ardent (YC P26) – Postgres sandboxes in seconds with zero migration
Ardent is a YC-backed startup that provides Postgres database sandboxes in seconds with zero migration, allowing coding agents to test code on production-like data without risk.
5x perf increase on writes with FPW disabled in Postgres
This article explains how Databricks' Lakebase architecture achieves a 5x improvement in Postgres write throughput by disabling Full Page Writes (FPW) and leveraging stateless compute with distributed storage.