browser-inference

#browser-inference

Trained a prompt injection detector using ml-intern and DeepSeek V4 Flash; runs in the browser

Reddit r/LocalLLaMA · 2026-05-22

A prompt injection classifier trained using ml-intern and DeepSeek V4 Flash. The DistilBERT-based model reaches 99% F1, is exported to ONNX with int8 quantization (~65 MB), and can be deployed in the browser via Transformers.js v3.

#browser-inference

@ggerganov: Highlighting the new WebGPU backend in llama.cpp/ggml. The work to bring full-fledged WebGPU support in llama.cpp starte…

X AI KOLs Following · 2026-05-22

The post highlights the new WebGPU backend in llama.cpp/ggml, which enables GPU-accelerated local model inference in the browser. It was developed over the past year and a half by Reese Levine and team at UCSC.

#browser-inference

@ngxson: Qwen3.6-27B running 100% on WebGPU. Not the best speed but still

X AI KOLs Following · 2026-05-17

A developer demonstrates Qwen3.6-27B running entirely on WebGPU in the browser, though not at the best speed.
