Tag
Opus 4.7 auto-generated a custom WebGPU kernel that accelerates Qwen3.5 inference up to 13× via fused LinearAttention, now shipping in Transformers.js v4.2.0.
AccelOpt is a self-improving LLM agentic system that autonomously optimizes AI accelerator kernels through iterative generation and optimization memory, achieving 49-61% peak throughput improvements on AWS Trainium while being 26x cheaper than Claude Sonnet 4.