AMD's Lemonade SDK for local AI adds NVIDIA CUDA support
Summary
AMD's Lemonade SDK for local AI adds NVIDIA CUDA support in version 10.7, enabling the same local AI server experience on competitor GPUs. The release also introduces lemonade bench for cross-backend LLM benchmarking and broader Vulkan support.
View Cached Full Text
Cached at: 06/11/26, 02:07 PM
AMD’s Lemonade SDK For Local AI Adds NVIDIA CUDA Support
Source: https://www.phoronix.com/news/AMD-Lemonade-10.7-Released

Lemonade, the local AI server solution developed by AMD that is designed to work across their CPUs, GPUs, and NPUs, is out with a new version today that also adds NVIDIA CUDA support.
THe Lemonade SDK provides local AI server capabilities in an API-compliant manner with OpenAI, Anthropic, and Ollama APIs. Lemonade builds off FastFlowLM, vLLM, Llama.cpp, and other components for a rich, open-source local AI experience. Beyond supporting their own wares of AMD Ryzen AI NPUs, Radeon/Instinct GPU accelerators, and x86_64 CPUs, they have also supported Apple Metal GPUs and AArch64 CPU support too. Interestingly, with Lemonade 10.7 they have also now added NVIDIA CUDA support for allowing the same local AI server experience on their competitor’s GPUs.
Lemonae 10.7 now properly integrates Llama.cpp’s CUDA back-end on Windows and Linux with proper NVIDIA GPU detection in Lemonade and other integration bits. The stable-diffusion.cpp CUDA back-end is also added for Linux. Additionally, this release brings stable-diffusion.cpp Vulkan support on both Windows and Linux for broader cross-vendor GPU support.
Lemonade 10.7 also adds support for LMX-Omni models, a native Prometheus end-point for real time stats monitoring, and other enhancements.
Exciting me with Lemonade 10.7 is adding thelemonade benchcommand that is focused on apples-to-apples LLM benchmarking across Llama.cpp, FastFlowLM, vLLM, and Ryzen AI software. I’ll be checking out the lemonade bench to see how its benchmarking works out and hopefully using it in future articles on Phoronix.
Lemonade 10.7 downloads and more details on this open-source feature release viaGitHub.
Similar Articles
Lemonade v10.7 release and project organization update
Lemonade v10.7 release introduces LMX-Omni virtual models for omni-modal chat, a bench CLI tool for LLM performance comparison across backends, and expanded GPU support on AMD, Apple Silicon, Nvidia, and Intel systems.
NVIDIA Levels Up Local AI Agents Across RTX PCs and DGX Spark
NVIDIA announced RTX Spark PCs and a wave of updates to enable local AI agents across RTX and DGX ecosystems, including the OpenShell runtime coming to Windows, NemoClaw expansion, performance improvements, and integrations with Adobe and H Company.
AMD Powers Next-Generation Agent Computers with New Ryzen AI Halo Developer Platform and Ryzen AI Max PRO 400 Series Processors
AMD announces the Ryzen AI Halo developer platform for local AI development, capable of running models up to 200B parameters, and the Ryzen AI Max PRO 400 Series processors for commercial AI PCs, enabling agentic AI workloads locally.
AMD's tiny AI PC points to a more local future for model inference
AMD's Ryzen AI Max platform with 128GB unified memory enables local inference of large models up to 200 billion parameters, aiming to shift AI workloads from cloud to compact personal hardware.
AMD AI ENGAGE
The article discusses the AMD AI Engage Program, a community initiative for AI developers offering prizes, credits, and networking opportunities for building LLM apps and GenAI workflows.