realtime

Tag

Cards List
#realtime

Realtime voice models compound on cost (and forget) - "Flowcat" fixed both (4x cheaper, 7x more context)

Reddit r/AI_Agents · 6d ago

Flowcat addresses the high cost and limited context of realtime voice models, achieving 4x lower cost and 7x more context.


@VikParuchuri: We're launching turbo mode data extraction - 5x faster, 5x cheaper, and 7% more accurate than Azure Content Understandi…

X AI KOLs Following · 2026-06-17 Cached

VikParuchuri announces the launch of turbo mode data extraction, claiming it is 5x faster, 5x cheaper, and 7% more accurate than Azure Content Understanding, with latency competitive enough for real-time workflows.


@mudler_it: parakeet.cpp now runs NVIDIA Parakeet behind the OpenAI API. Point any OpenAI client at a local server, send an audio, …

X AI KOLs Timeline · 2026-06-17 Cached

parakeet.cpp enables running NVIDIA Parakeet ASR behind the OpenAI API locally with prebuilt Docker images, supporting CPU and CUDA (including arm64) for real-time transcription with word timestamps.


OpenAI WebRTC Audio Session, now with document context

Simon Willison's Blog · 2026-06-12 Cached

Simon Willison updates his OpenAI WebRTC Audio Session tool to support the new GPT-Realtime-2 model and adds document context for conversational audio discussions.


@kwindla: https://x.com/kwindla/status/2062544580105359686

X AI KOLs Timeline · 2026-06-04 Cached

NVIDIA released Nemotron 3.5 ASR, an open-source multilingual speech-to-text model with the lowest latency tested, available in multilingual and English-only variants, ideal for voice agents and self-hosted deployments.


@hwwaanng: What I'm posting today is different from before — it's very interesting and experimental. A few friends and I are huge fans of Vibe Coding. One day, we thought: what if you could talk to Codex in real time, then you could also Vibe Code while driving? So...

X AI KOLs Timeline · 2026-05-31 Cached

The author introduces Hey Codex, an experimental project that lets users talk to Codex in real time, so they can Vibe Code by voice in situations such as driving.


@0xCortexl: SOMEONE FROM TOKYO IS MAPPING BIRD LANGUAGE INTO REAL DATA PATTERNS AND THE VISUALIZATION LOOKS LIKE A NEURAL NETWORK D…

X AI KOLs Timeline · 2026-05-24 Cached

An AI system maps bird vocalizations into 3D visualizations, converting frequency and modulation data into colored point clusters in real time, with potential applications in industrial and medical anomaly detection.


Rewriting model inference with CUDA kernels: the bottleneck was not just GEMM [P]

Reddit r/MachineLearning · 2026-05-18

Author describes building FlashRT, a CUDA-first inference runtime that rewrites model inference paths with C++/CUDA kernels to address bottlenecks beyond GEMM for small-batch/realtime workloads, achieving significant latency improvements on Jetson Thor and RTX 5090. The article discusses lessons on precision (FP8 helpful, FP4 mixed) and the need to bypass generic runtimes for realtime inference.


@dgwyer: You can now edit SolveIt messages via realtime conversational voice, and have the diff edits optionally tracked.

X AI KOLs Following · 2026-05-15 Cached

SolveIt now supports editing messages via conversational voice with optional diff tracking.


Fully Realtime Interaction Models

Reddit r/LocalLLaMA · 2026-05-14

Discussion of an upcoming fully realtime interaction model that will be released via API, with plans to create distillation data from it.


@seclink: OpenAI Launches GPT-Realtime-2, Its Most Intelligent Voice Model to Date. The model features GPT-5-level reasoning, a 128,000 token context window, and supports adjusting 'effort level' for more natural conversation. It can pair with GPT-R…

X AI KOLs Following · 2026-05-08

OpenAI released the GPT-Realtime-2 voice model, featuring GPT-5-level reasoning capabilities and a 128,000 token context window. It supports real-time translation from over 70 input languages to 13 output languages, achieving 96.6% accuracy on the Big Bench Audio Intelligence benchmark. Greg Brockman called it a milestone in voice translation.


@kwindla: OpenAI shipped a new speech-to-speech model today: gpt-realtime-2 This is the first speech-to-speech model good enough …

X AI KOLs Following · 2026-05-07

OpenAI has released gpt-realtime-2, a new speech-to-speech model optimized for real-time voice agent interactions with low-latency tool calling.


A type-safe, realtime collaborative Graph Database in a CRDT

Hacker News Top · 2026-04-21 Cached

Codemix open-sources @codemix/graph, a type-safe, CRDT-backed graph database with TypeScript-native schema validation and realtime offline-first sync via Yjs.


blakeblackshear/frigate

GitHub Trending (daily) · 2026-05-24 Cached

Frigate is an open-source NVR designed for Home Assistant that performs real-time AI object detection on IP camera feeds locally using OpenCV and TensorFlow. It features tight Home Assistant integration, motion-based detection, and efficient resource usage.


We’re introducing three audio models in the API

YouTube AI Channels · 2026-05-08 Cached

OpenAI has launched three real-time audio models in the API, including GPT Realtime Translate, a real-time translation model that supports 70 languages, and GPT Realtime 2, a voice agent with reasoning capabilities, so that developers can build more natural voice interfaces.
