nvidia-b300

#nvidia-b300

@ollama: GLM 5.2 on Ollama's cloud just doubled GPU capacity to handle the volume of usage! This is all US based, and running on…

X AI KOLs Following · 2d ago

Ollama has doubled the GPU capacity serving GLM 5.2 on its US-based cloud to handle usage volume. The service runs on NVIDIA B300 Blackwell GPUs, and the announcement emphasizes privacy and open models.


#nvidia-b300

@Modular: .@hippocraticai runs 400B+ parameter models for real-time patient conversations, tens of thousands per day. When they b…

X AI KOLs Following · 2026-06-11

Hippocratic AI partners with Modular to run large language model inference on the MAX framework, achieving sub-500 ms time to first token (TTFT), roughly 30% faster P99 latency, and roughly 22% faster mean latency at scale on NVIDIA B300 GPUs, with portability to AMD hardware.

