@Modular: Modular is live on @ArtificialAnlys with 3x faster image generation than the competition. MAX inference serving @bfl_ai…

X AI KOLs Following 07/02/26, 06:16 PM Tools

inference-serving image-generation benchmarking performance flux modular

Summary

Modular's MAX inference serving achieves 3x faster image generation for FLUX.2-dev than competitors, as per Artificial Analysis benchmarks.

Modular is live on @ArtificialAnlys with 3x faster image generation than the competition. MAX inference serving @bfl_ai’s FLUX.2-dev achieves state of the art latency per AA’s new benchmarking: https://artificialanalysis.ai/image/providers/flux_flux-2--dev…

Original Article

View Cached Full Text

Cached at: 07/02/26, 06:26 PM

Modular is live on @ArtificialAnlys with 3x faster image generation than the competition. MAX inference serving @bfl_ai’s FLUX.2-dev achieves state of the art latency per AA’s new benchmarking: https://artificialanalysis.ai/image/providers/flux_flux-2–dev…

FLUX.2 [dev] API Provider Benchmarking and Analysis | Artificial Analysis

Source: https://artificialanalysis.ai/image/providers/flux_flux-2–dev

API Generation Time

Generation time: seconds to generate 1 image · Lower is better

Median time the provider takes to generate an image over the past 14 days of measurements. This includes downloading the image from the provider where a URL is provided rather than an image response.

API Price

Price: USD per 1000 image generations · Lower is better

Price per 1k images generated by the model. For detail on how we calculate price per image for providers price based on inference time or steps, see ourmethodology page.

API Generation Time vs. API Price

Generation time: seconds to generate 1 image · Price: USD per 1000 image generations

Median time the provider takes to generate an image over the past 14 days of measurements. This includes downloading the image from the provider where a URL is provided rather than an image response.

Price per 1k images generated by the model. For detail on how we calculate price per image for providers price based on inference time or steps, see ourmethodology page.

Recent Image Generations

Matched image prompt/seed generations for each selected provider

Tapeworm Infection and Plant Bug in Kinshasa, DR Congo in the style of expressionist

Jul 2, 3:08 PM5providers, fastest on left

Tapeworm Infection and Plant Bug in Kinshasa, DR Congo in the style of expressionist

Modular FLUX.2 [dev], Modular2.1s

Tapeworm Infection and Plant Bug in Kinshasa, DR Congo in the style of expressionist

FLUX.2 [dev], DeepInfra6.3s

Tapeworm Infection and Plant Bug in Kinshasa, DR Congo in the style of expressionist

FLUX.2 [dev], Replicate6.8s

Tapeworm Infection and Plant Bug in Kinshasa, DR Congo in the style of expressionist

fal.ai FLUX.2 [dev], fal.ai7.7s

Tapeworm Infection and Plant Bug in Kinshasa, DR Congo in the style of expressionist

Runware FLUX.2 [dev], Runware11.4s

Bassarisk and Skipjack in Kampala, Uganda in the style of cubist

Jul 2, 7:40 AM5providers, fastest on left

Bassarisk and Skipjack in Kampala, Uganda in the style of cubist

Modular FLUX.2 [dev], Modular2.1s

Bassarisk and Skipjack in Kampala, Uganda in the style of cubist

fal.ai FLUX.2 [dev], fal.ai6.1s

Bassarisk and Skipjack in Kampala, Uganda in the style of cubist

FLUX.2 [dev], DeepInfra6.2s

Bassarisk and Skipjack in Kampala, Uganda in the style of cubist

Runware FLUX.2 [dev], Runware8.6s

Bassarisk and Skipjack in Kampala, Uganda in the style of cubist

FLUX.2 [dev], Replicate10.1s

Ribbonfish and Tautog in Manila, Philippines in the style of rococo

Jul 2, 4:02 AM5providers, fastest on left

Ribbonfish and Tautog in Manila, Philippines in the style of rococo

Modular FLUX.2 [dev], Modular2.4s

Ribbonfish and Tautog in Manila, Philippines in the style of rococo

FLUX.2 [dev], DeepInfra6.2s

Ribbonfish and Tautog in Manila, Philippines in the style of rococo

fal.ai FLUX.2 [dev], fal.ai8.4s

Ribbonfish and Tautog in Manila, Philippines in the style of rococo

Runware FLUX.2 [dev], Runware10.3s

Ribbonfish and Tautog in Manila, Philippines in the style of rococo

FLUX.2 [dev], Replicate41.2s

Pike and Dugong in Khartoum, Sudan in the style of art deco

Jul 1, 7:08 PM5providers, fastest on left

Pike and Dugong in Khartoum, Sudan in the style of art deco

Modular FLUX.2 [dev], Modular2.1s

Pike and Dugong in Khartoum, Sudan in the style of art deco

fal.ai FLUX.2 [dev], fal.ai6.0s

Pike and Dugong in Khartoum, Sudan in the style of art deco

FLUX.2 [dev], DeepInfra6.2s

Pike and Dugong in Khartoum, Sudan in the style of art deco

Runware FLUX.2 [dev], Runware9.7s

Pike and Dugong in Khartoum, Sudan in the style of art deco

FLUX.2 [dev], Replicate17.4s

Chine and Feeler in Moscow, Russia in the style of modernist

Jul 1, 1:09 PM4providers, fastest on left

Chine and Feeler in Moscow, Russia in the style of modernist

Modular FLUX.2 [dev], Modular2.1s

Chine and Feeler in Moscow, Russia in the style of modernist

FLUX.2 [dev], Replicate6.9s

Chine and Feeler in Moscow, Russia in the style of modernist

fal.ai FLUX.2 [dev], fal.ai10.1s

Chine and Feeler in Moscow, Russia in the style of modernist

FLUX.2 [dev], DeepInfra12.2s

Speed Analysis

API Generation Time, Variance

Generation time: seconds to generate 1 image · Results by percentile · Lower is better

Median time the provider takes to generate an image over the past 14 days of measurements. This includes downloading the image from the provider where a URL is provided rather than an image response.

Picture of the author

API Generation Time, Over Time

Generation time: seconds to generate 1 image · Lower is better

Median time the provider takes to generate an image over the past 14 days of measurements. This includes downloading the image from the provider where a URL is provided rather than an image response.

Each point is the median of a 14-day window of measurements, based on 1 measurement each day at different times.

Similar Articles

@Modular: Day two of AI Engineer World's Fair @aiDotEngineer! Meet the team at booth U-G28 to talk all things Mojo, MAX, and Modu…

X AI KOLs Timeline

Modular is at the AI Engineer World's Fair demonstrating Mojo, MAX, and Modular Cloud with fast image generation using FLUX.2 on DGX Spark, and real-time video generation.

@Modular: .@hippocraticai runs 400B+ parameter models for real-time patient conversations, tens of thousands per day. When they b…

X AI KOLs Following

Hippocratic AI partners with Modular to use MAX framework for inference on large language models, achieving sub-500ms TTFT, ~30% faster P99 latency and ~22% faster mean latency at scale on NVIDIA B300 GPUs, with portability to AMD.

@Modular: Our kernel team has been deep in MiniMax M3 all week. The 1M-token context and native multimodality make it a hard mode…

X AI KOLs Following

Modular's kernel team is optimizing serving for MiniMax M3's 1M-token context and native multimodality, with open weights dropping soon for immediate deployment on Modular.

@Modular: .@zai_org open-sourced GLM 5.2 today, and Modular is a Day Zero launch partner. GLM 5.2 is their new flagship for codin…

X AI KOLs Following

Zhipu AI (zai_org) has open-sourced GLM 5.2, a flagship model for coding and long-horizon agentic tasks with a usable 1M-token context. Modular is a Day Zero launch partner, offering optimized serving on Modular Cloud.

Moebius: 0.2B Lightweight Image Inpainting Framework with 10B-Level Performance

Hugging Face Daily Papers

Moebius is a 0.22B parameter image inpainting framework that rivals 10B-level models like FLUX.1-Fill-Dev, achieving over 15x faster inference through novel local-global interaction blocks and adaptive distillation strategies.

Submit Feedback