Yann LeCun's team releases LeWorldModel, a tiny 15M-parameter physics model trained on a single GPU in hours that outperforms billion-dollar foundation models in planning speed and physical plausibility, challenging the dominant scaling paradigm.
HiDream-ai has open-sourced HiDream-O1-Image (8B), a unified image generative foundation model built on a Pixel-level Unified Transformer (UiT) that natively handles text-to-image, image editing, and subject-driven personalization at up to 2048×2048 resolution without external VAEs or disjoint text encoders. It debuted at #8 in the Artificial Analysis Text to Image Arena and is positioned as a leading open-weights text-to-image model.
OpenAI has launched three new real-time audio models to enable continuous, multitasking voice interactions that prioritize long-context reasoning, live translation, and seamless tool use.
Liquid AI announces liquid-audio, an open-source repository for its end-to-end speech-to-speech LFM models (LFM2-Audio-1.5B and LFM2.5-Audio-1.5B), with interleaved and sequential generation modes and fine-tuning support.
MemReranker is a reasoning-aware reranking model family (0.6B/4B) designed for agent memory retrieval, addressing limitations in semantic similarity by incorporating LLM knowledge distillation for better temporal and causal reasoning.
OpenAI has launched GPT-Realtime-2, integrating GPT-5-level reasoning into the real-time voice API, enabling voice assistants to think and solve problems in real time during conversations.
GPT-5.5-Cyber is now in limited preview for defenders, providing a model focused on securing critical infrastructure.
OpenAI released the GPT-Realtime-2 voice model, featuring GPT-5-level reasoning capabilities and a 128,000 token context window. It supports real-time translation from over 70 input languages to 13 output languages, achieving 96.6% accuracy on the Big Bench Audio Intelligence benchmark. Greg Brockman called it a milestone in voice translation.
Shanghai Jiao Tong University has open-sourced the F5-TTS speech generation model, trained on 100,000 hours of data, supporting bilingual synthesis in Chinese and English and zero-shot voice cloning, and allowing commercial use.
A reviewer tests a quantized and fine-tuned version of the Qwen3.6-35B model optimized for Apple Silicon via MLX, praising its speed, intelligence, and lack of safety disclaimers.
Claude agents have gained a 'Dreaming' feature that enables self-optimization by reviewing past conversations and extracting patterns. Combined with multi-agent parallel orchestration and quality assessment, this marks a step toward self-evolving AI agents.
A mixed-bit quantized build of the MiniMax M2.7 model has been released, shrunk to 74 GB for efficient local inference on Apple Silicon devices.
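Quantization like this trades precision for memory: weights are stored as low-bit integers plus a per-tensor (or per-group) scale, and "mixed-bit" schemes spend more bits only on sensitive layers. This is not MiniMax's or MLX's actual quantizer — just a minimal, self-contained sketch of symmetric k-bit quantization showing the bit-width/error trade-off:

```python
import numpy as np

def quantize(weights: np.ndarray, bits: int):
    """Symmetric k-bit quantization: map floats to signed integers
    plus one float scale factor."""
    qmax = 2 ** (bits - 1) - 1                  # e.g. 7 for 4-bit, 127 for 8-bit
    scale = np.abs(weights).max() / qmax
    q = np.clip(np.round(weights / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.array([0.12, -0.53, 0.91, -0.07], dtype=np.float32)
q4, s4 = quantize(w, bits=4)   # coarser grid: larger reconstruction error
q8, s8 = quantize(w, bits=8)   # finer grid: smaller error, 2x the storage
err4 = np.abs(dequantize(q4, s4) - w).max()
err8 = np.abs(dequantize(q8, s8) - w).max()
assert err8 <= err4            # more bits ⇒ no worse reconstruction
```

Real mixed-bit schemes apply this per weight group and pick the bit-width per layer, which is how a model can land at a fixed memory budget like 74 GB.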
Zyphra releases ZAYA1-74B-Preview, a 74-billion parameter base model trained on AMD hardware, highlighting strong pre-RL reasoning capabilities and agentic performance signals.
OpenAI has released gpt-realtime-2, a new speech-to-speech model optimized for real-time voice agent interactions with low-latency tool calling.
A developer trained a 350M-parameter model capable of navigating spreadsheets better than Anthropic's Opus 4.6.
The authors present TOPAS, a recursive AI architecture achieving 11.67% on ARC-AGI-2 using a single RTX 4090, aiming to demonstrate that architectural efficiency can outweigh raw compute power.
Google's Gemma 4 achieves up to 3x faster inference speeds through speculative decoding and multi-token prediction, enabling efficient on-device deployment.
Satya Nadella announced the integration of GPT-5.5 Instant into M365 Copilot, Copilot Studio, and Foundry, highlighting faster and more accurate responses.
Sam Altman announces the release of GPT-Realtime-2 in the API, calling it a significant advance in voice interaction with AI, particularly in handling complex context.
Google released Multi Token Prediction drafters for Gemma 4 to accelerate inference via speculative decoding, but support for MLX is currently unconfirmed or unavailable.