@Modular: Modular is live on @ArtificialAnlys with 3x faster image generation than the competition. MAX inference serving @bfl_ai…
Summary
Modular's MAX inference serving achieves 3x faster image generation for FLUX.2-dev than competitors, as per Artificial Analysis benchmarks.
View Cached Full Text
Cached at: 07/02/26, 06:26 PM
Modular is live on @ArtificialAnlys with 3x faster image generation than the competition. MAX inference serving @bfl_ai’s FLUX.2-dev achieves state of the art latency per AA’s new benchmarking: https://artificialanalysis.ai/image/providers/flux_flux-2–dev…
FLUX.2 [dev] API Provider Benchmarking and Analysis | Artificial Analysis
Source: https://artificialanalysis.ai/image/providers/flux_flux-2–dev
API Generation Time
Generation time: seconds to generate 1 image · Lower is better
Median time the provider takes to generate an image over the past 14 days of measurements. This includes downloading the image from the provider where a URL is provided rather than an image response.
API Price
Price: USD per 1000 image generations · Lower is better
Price per 1k images generated by the model. For detail on how we calculate price per image for providers price based on inference time or steps, see ourmethodology page.
API Generation Time vs. API Price
Generation time: seconds to generate 1 image · Price: USD per 1000 image generations
Median time the provider takes to generate an image over the past 14 days of measurements. This includes downloading the image from the provider where a URL is provided rather than an image response.
Price per 1k images generated by the model. For detail on how we calculate price per image for providers price based on inference time or steps, see ourmethodology page.
Recent Image Generations
Matched image prompt/seed generations for each selected provider
Tapeworm Infection and Plant Bug in Kinshasa, DR Congo in the style of expressionist
Jul 2, 3:08 PM5providers, fastest on left

FLUX.2 [dev], Modular2.1s

FLUX.2 [dev], DeepInfra6.3s

FLUX.2 [dev], Replicate6.8s

FLUX.2 [dev], fal.ai7.7s

FLUX.2 [dev], Runware11.4s
Bassarisk and Skipjack in Kampala, Uganda in the style of cubist
Jul 2, 7:40 AM5providers, fastest on left

FLUX.2 [dev], Modular2.1s

FLUX.2 [dev], fal.ai6.1s

FLUX.2 [dev], DeepInfra6.2s

FLUX.2 [dev], Runware8.6s

FLUX.2 [dev], Replicate10.1s
Ribbonfish and Tautog in Manila, Philippines in the style of rococo
Jul 2, 4:02 AM5providers, fastest on left

FLUX.2 [dev], Modular2.4s

FLUX.2 [dev], DeepInfra6.2s

FLUX.2 [dev], fal.ai8.4s

FLUX.2 [dev], Runware10.3s

FLUX.2 [dev], Replicate41.2s
Pike and Dugong in Khartoum, Sudan in the style of art deco
Jul 1, 7:08 PM5providers, fastest on left

FLUX.2 [dev], Modular2.1s

FLUX.2 [dev], fal.ai6.0s

FLUX.2 [dev], DeepInfra6.2s

FLUX.2 [dev], Runware9.7s

FLUX.2 [dev], Replicate17.4s
Chine and Feeler in Moscow, Russia in the style of modernist
Jul 1, 1:09 PM4providers, fastest on left

FLUX.2 [dev], Modular2.1s

FLUX.2 [dev], Replicate6.9s

FLUX.2 [dev], fal.ai10.1s

FLUX.2 [dev], DeepInfra12.2s
Speed Analysis
API Generation Time, Variance
Generation time: seconds to generate 1 image · Results by percentile · Lower is better
Median time the provider takes to generate an image over the past 14 days of measurements. This includes downloading the image from the provider where a URL is provided rather than an image response.

API Generation Time, Over Time
Generation time: seconds to generate 1 image · Lower is better
Median time the provider takes to generate an image over the past 14 days of measurements. This includes downloading the image from the provider where a URL is provided rather than an image response.
Each point is the median of a 14-day window of measurements, based on 1 measurement each day at different times.
Similar Articles
@Modular: Day two of AI Engineer World's Fair @aiDotEngineer! Meet the team at booth U-G28 to talk all things Mojo, MAX, and Modu…
Modular is at the AI Engineer World's Fair demonstrating Mojo, MAX, and Modular Cloud with fast image generation using FLUX.2 on DGX Spark, and real-time video generation.
@Modular: .@hippocraticai runs 400B+ parameter models for real-time patient conversations, tens of thousands per day. When they b…
Hippocratic AI partners with Modular to use MAX framework for inference on large language models, achieving sub-500ms TTFT, ~30% faster P99 latency and ~22% faster mean latency at scale on NVIDIA B300 GPUs, with portability to AMD.
@Modular: Our kernel team has been deep in MiniMax M3 all week. The 1M-token context and native multimodality make it a hard mode…
Modular's kernel team is optimizing serving for MiniMax M3's 1M-token context and native multimodality, with open weights dropping soon for immediate deployment on Modular.
@Modular: .@zai_org open-sourced GLM 5.2 today, and Modular is a Day Zero launch partner. GLM 5.2 is their new flagship for codin…
Zhipu AI (zai_org) has open-sourced GLM 5.2, a flagship model for coding and long-horizon agentic tasks with a usable 1M-token context. Modular is a Day Zero launch partner, offering optimized serving on Modular Cloud.
Moebius: 0.2B Lightweight Image Inpainting Framework with 10B-Level Performance
Moebius is a 0.22B parameter image inpainting framework that rivals 10B-level models like FLUX.1-Fill-Dev, achieving over 15x faster inference through novel local-global interaction blocks and adaptive distillation strategies.