high-throughput

#high-throughput

What I learned building low latency and high throughput AI agents

Reddit r/AI_Agents · 2026-06-05

The article shares practical lessons for building low-latency, high-throughput AI agents, including workload estimation, token reduction, parallelism, microservices, and handling LLM failures.


@HotAisle: This is awesome. I wonder whose MI300X they used... ;-)

X AI KOLs Following · 2026-05-29 Cached

Kog announces real-time LLM inference achieving 3000+ output tokens per second per request on standard datacenter GPUs, bringing high-speed inference previously limited to custom silicon to production hardware.


@arcinstitute: Because PerturbSpace uses standard single-cell sequencing, it's compatible with any single-cell readout. In one day, th…

X AI KOLs Timeline · 2026-05-26 Cached

Arc Institute's PerturbSpace enables high-throughput single-cell profiling of transcriptome, location, CRISPR guides, clonal relationships, and surface proteins from many samples in one day, using standard single-cell sequencing.


Holotron-12B - High Throughput Computer Use Agent

Hugging Face Blog · 2026-03-17 Cached

H Company releases Holotron-12B, a multimodal computer-use agent optimized for high-throughput inference using a hybrid SSM architecture. The model, post-trained on NVIDIA Nemotron, demonstrates superior efficiency and scalability for interactive agentic workloads.


Gemini 3.1 Flash-Lite: Built for intelligence at scale

Google DeepMind Blog · 2026-03-03 Cached

Google introduces Gemini 3.1 Flash-Lite, a high-speed, cost-efficient AI model available in preview via Google AI Studio and Vertex AI, designed for high-volume developer workloads.
