Tag
An AMD engineer contributed an ONNX Runtime backend to FFmpeg's DNN filter, enabling AI model inference on GPU and NPU platforms for tasks like upscaling and object detection, and notably making the Ryzen AI NPU useful for FFmpeg.
A benchmark comparing ONNX Runtime, HF Transformers, and GGUF for the Parakeet TDT 0.6B ASR model on CPU-only hardware shows that ONNX Runtime runs inference 37% faster than HF Transformers in bfloat16, while GGUF prioritizes memory efficiency.
Supertone released Supertonic 3, an open-source TTS model with 99M parameters that runs faster on CPU than a 2B model on an A100. It supports 31 languages and runs on ONNX Runtime for fully local inference.
A blog post details how to detect silent NPU fallback on Snapdragon in CI, with methods that include running on real hardware, gating on the coefficient of variation of latency, and parsing ORT profiling JSON to identify the ops that fell back.
Supertonic is an open-source, on-device text-to-speech system designed for local inference with minimal overhead. Version 3 supports 31 languages and improves accuracy.
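As a rough illustration of two of the checks named in the Snapdragon NPU fallback item above, the sketch below computes a coefficient of variation over latency samples and scans ONNX Runtime profiling events for nodes that ran on an unexpected execution provider. It is not taken from the blog post. It assumes the profiling file is a JSON list of trace events in which node events carry `cat`, `args.op_name`, and `args.provider`; the function names, the sample events, and the expected provider string `QNNExecutionProvider` are illustrative assumptions, and any pass/fail threshold would need to be chosen per device.

```python
import statistics
from collections import Counter


def coefficient_of_variation(latencies_ms):
    """Sample standard deviation divided by the mean of the latency samples."""
    return statistics.stdev(latencies_ms) / statistics.mean(latencies_ms)


def fallen_back_ops(events, expected_provider="QNNExecutionProvider"):
    """Count (op_name, provider) pairs for node events not on the expected provider.

    `events` is assumed to be the parsed contents of an ORT profiling file,
    for example json.load(open(profile_path)). Events without a provider
    field (session events, fences) are skipped.
    """
    counts = Counter()
    for event in events:
        if event.get("cat") != "Node":
            continue
        args = event.get("args", {})
        provider = args.get("provider")
        if provider and provider != expected_provider:
            counts[(args.get("op_name", "?"), provider)] += 1
    return counts


# Hypothetical events shaped like ORT profiling output (not real measurements).
sample_events = [
    {"cat": "Session", "name": "model_run", "args": {}},
    {"cat": "Node", "name": "conv1_kernel_time",
     "args": {"op_name": "Conv", "provider": "QNNExecutionProvider"}},
    {"cat": "Node", "name": "softmax_kernel_time",
     "args": {"op_name": "Softmax", "provider": "CPUExecutionProvider"}},
    {"cat": "Node", "name": "conv1_fence_before", "args": {"op_name": "Conv"}},
]

fallbacks = fallen_back_ops(sample_events)
cv = coefficient_of_variation([10.0, 12.0, 8.0, 10.0])
```

In a CI job, a non-empty `fallbacks` counter or a `cv` above a device-specific threshold could be used to fail the build; both checks only make sense when the run happens on the real target hardware.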