PrismML just released Binary and Ternary Bonsai Image 4B: 1-bit/ternary text-to-image diffusion transformers that can even run 100% locally in your browser on WebGPU.

Reddit r/LocalLLaMA Models

Summary

PrismML released Bonsai Image 4B in binary and ternary quantized versions. At only about 3GB, the models can run text-to-image generation locally in a browser via WebGPU, and they are released under the Apache-2.0 license.

The PrismML team really cooked with these models. They're only ~3GB in size (compared to FLUX.2 Klein 4B, which is ~16GB). Apache-2.0!

Official collection on HF: [https://huggingface.co/collections/prism-ml/bonsai-image](https://huggingface.co/collections/prism-ml/bonsai-image)

Link to demo: [https://huggingface.co/spaces/webml-community/bonsai-image-webgpu](https://huggingface.co/spaces/webml-community/bonsai-image-webgpu)

Similar Articles

prism-ml/bonsai-image-ternary-4B-gemlite-2bit

Hugging Face Models Trending

Prism ML releases Bonsai Image, a 1.21 GB text-to-image diffusion transformer with ternary (1.58-bit) weights for NVIDIA GPUs. It generates a 1024×1024 image in 4.5 s on an RTX 3080 and is much smaller than the FP16 version.

1-Bit Bonsai Image 4B Image Generation for Local Devices

Hacker News Top

PrismML releases Bonsai Image 4B, a family of compact image generation models using 1-bit and ternary weights, enabling high-quality diffusion inference on local devices like laptops and iPhones with significantly reduced memory footprint.

@hank_aibtc: WTF? Image generation has completely changed! PrismML just released Bonsai Image 4B — a 1-bit binary and ternary quantized diffusion model! - Model is only ~3GB (1-bit version even compressed to 0.93GB), while the same-parameter FLUX.2 Klein 4B requires...

X AI KOLs Timeline

PrismML has released Bonsai Image 4B, a 1-bit binary and ternary quantized diffusion model. It is only 3GB in size (0.93GB for the 1-bit version), described as over 8x compression relative to the 16GB FLUX.2 Klein 4B with the same parameter count, and it runs fully locally in the browser.

prunaai/z-image-turbo

Replicate Explore

Alibaba’s 6B-parameter Z-Image-Turbo text-to-image model, further compressed by PrunaAI, generates 1024×1024 photorealistic images with bilingual text in under 1 s using 8 diffusion steps.

Ternary Bonsai: Top Intelligence at 1.58 Bits

Hacker News Top

A highly efficient AI model architecture using ternary weights (-1, 0, 1) that achieves competitive performance while requiring only 1.58 bits per parameter, enabling deployment on extremely constrained devices.
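
The "1.58 bits per parameter" figure follows from the fact that a weight with three possible values carries log2(3) ≈ 1.585 bits of information. The sketch below illustrates this and shows one generic way to get close to that bound in storage: packing five ternary weights into one byte (3^5 = 243 fits in 256), which costs 1.6 bits per weight. The function names `pack_trits` and `unpack_trits` are hypothetical, and this base-3 packing is an illustrative assumption, not PrismML's actual storage format.

```python
import math

# Information content of one ternary weight: log2(3), roughly 1.585 bits.
BITS_PER_TRIT = math.log2(3)

TRITS_PER_BYTE = 5  # 3**5 = 243 <= 256, so five ternary digits fit in one byte


def pack_trits(weights):
    """Pack ternary weights (-1, 0, 1) into bytes, five weights per byte.

    Illustrative base-3 packing only; not the format used by any real model.
    """
    out = bytearray()
    for i in range(0, len(weights), TRITS_PER_BYTE):
        group = weights[i:i + TRITS_PER_BYTE]
        value = 0
        # Build a base-3 number; the first weight ends up as the lowest digit.
        for w in reversed(group):
            if w not in (-1, 0, 1):
                raise ValueError(f"not a ternary weight: {w!r}")
            value = value * 3 + (w + 1)  # map -1, 0, 1 to digits 0, 1, 2
        out.append(value)
    return bytes(out)


def unpack_trits(data, count):
    """Recover the first `count` ternary weights from packed bytes."""
    weights = []
    for byte in data:
        for _ in range(TRITS_PER_BYTE):
            weights.append(byte % 3 - 1)  # lowest base-3 digit first
            byte //= 3
    return weights[:count]


if __name__ == "__main__":
    sample = [-1, 0, 1, 1, 0, -1, 1]
    packed = pack_trits(sample)
    assert unpack_trits(packed, len(sample)) == sample
    print(f"theoretical bits per weight: {BITS_PER_TRIT:.3f}")
    print(f"packed bits per weight:      {8 / TRITS_PER_BYTE:.1f}")
```

Real ternary kernels often trade a little of this density for speed, for example by using 2 bits per weight so that values can be decoded with shifts and masks instead of division.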