Tag
This guest post explores the proposed Cross-Origin Storage API to improve caching of AI model resources in Transformers.js, enabling efficient reuse across origins while maintaining privacy and integrity for in-browser inference.
A deep dive video explaining how AI models run from JavaScript using Transformers.js, covering tensors, ONNX, quantization, WebGPU/WASM, and more.
Demonstrates running Gemma 4 offline in the browser using WebGPU and Transformers.js to control a Reachy Mini robot via WebSerial.
Opus 4.7 auto-generated a custom WebGPU kernel that accelerates Qwen3.5 inference up to 13× via fused LinearAttention, now shipping in Transformers.js v4.2.0.
This article provides a technical guide on integrating Transformers.js into a Chrome extension using Manifest V3, detailing architecture for background service workers, model caching, and agent loops.