Articles from Reddit
This article outlines essential best practices for deploying and monitoring AI agent teams, stressing precise job definitions, continuous oversight, and stable cloud infrastructure. It evaluates several agent runtimes and hosting platforms while comparing their operational costs to traditional human roles.
The article presents Joscha Bach's argument that replicating the physical wiring of the brain cannot produce human-like consciousness, emphasizing that mental states arise from information processing rather than mere anatomical mapping.
A community member shares their hands-on experience generating a track using Google's Lyria 3 Pro via its API, noting the minimal cost and initial quality of the output.
A developer shares local inference benchmarks and systemd configurations for running the Qwen3.6-27B model on an NVIDIA RTX Pro 4500 Blackwell GPU using llama.cpp. The post requests optimization tips for throughput and explores potential use cases for larger models.
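The post's systemd configuration isn't reproduced here; a minimal sketch of what such a unit might look like follows. All paths, the model filename, the service user, and the context size are placeholder assumptions, though the `llama-server` flags (`-m`, `-c`, `-ngl`, `--host`, `--port`) are real llama.cpp options:

```ini
# /etc/systemd/system/llama-server.service (hypothetical unit file)
[Unit]
Description=llama.cpp server for a local GGUF model
After=network.target

[Service]
# Paths and flags are illustrative; -ngl 99 offloads all layers to the GPU.
ExecStart=/usr/local/bin/llama-server \
    -m /models/qwen-27b-q4_k_m.gguf \
    -c 32768 -ngl 99 --host 127.0.0.1 --port 8080
Restart=on-failure
User=llama

[Install]
WantedBy=multi-user.target
```

With a unit like this, the server survives reboots and can be managed with `systemctl restart llama-server`.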
A developer shares their mixed experience running Gemma4 and Qwen locally for coding tasks, noting issues with tool integration, loop handling, and task completion while asking the community for better usage strategies.
Community release of Qwen3.6 35B A3B uncensored variant with full 19 MTP tensors preserved, available in multiple formats including Safetensors, GGUF, NVFP4 and GPTQ-Int4.
METR evaluated an early version of Claude Mythos Preview in March 2026 using their time-horizons task suite, estimating a 50%-time-horizon of at least 16 hours, indicating the model is at the upper end of what current benchmarks can measure, with caveats about stability at longer time ranges.
Ouster announces REV8, the first native color lidar sensor that fuses color and 3D data directly in silicon rather than in software, marking a hardware-level advancement in 3D sensing technology.
The article describes an AI-generated TV show styled to look as though it could plausibly have aired in the 1980s.

Joscha Bach discusses the technical and philosophical challenges that make mind uploading unlikely to be feasible, exploring the complexities of consciousness and substrate independence.
Developer built a Pipecat plugin integrating the Onairos preference model to preload user profiles before voice agent interactions, cutting time-to-useful from 3 minutes to 90 seconds by eliminating warm-up discovery questions.
A user benchmarked MTP (Multi-Token Prediction) on Gemma 4 with mlx-vlm on M4 Max Studio, finding it excellent for code generation (1.53x faster, 66% acceptance) but detrimental for JSON output (50% slower, only 8% acceptance) and neutral for long-form prose, suggesting MTP benefits vanish when acceptance drops below 50%.
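The acceptance-rate threshold the user observes matches a standard back-of-the-envelope model for speculative-style decoding. The sketch below is not the article's methodology; the draft length `k=4` and relative drafting cost are invented parameters, and the model assumes tokens are accepted independently:

```python
def expected_accepted(p: float, k: int) -> float:
    # Expected tokens emitted per verification step when each of k
    # drafted tokens is accepted independently with probability p;
    # the verifier always emits at least one token.
    return (1 - p ** (k + 1)) / (1 - p) if p < 1 else k + 1

def speedup(p: float, k: int, draft_cost: float = 0.1) -> float:
    # One verification pass costs ~1 base-model step; drafting k tokens
    # costs k * draft_cost in the same units (MTP heads are cheap but
    # not free).
    return expected_accepted(p, k) / (1 + k * draft_cost)

# High acceptance (code-like output) vs. low acceptance (JSON-like):
print(round(speedup(0.66, 4), 2))  # > 1: net win
print(round(speedup(0.08, 4), 2))  # < 1: net loss
```

Under these assumptions, 66% acceptance yields a net speedup while 8% acceptance makes generation slower than the baseline, consistent with the direction of the benchmark's findings.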
Developer created a new benchmark called continuity-benchmarks to test AI coding agents' ability to maintain consistency with project rules during active development, addressing gaps in existing memory benchmarks that focus on semantic recall rather than real-time architectural consistency and multi-session behavior.
A developer built a real-time AI character that watches YouTube videos and reacts using Meta's TRIBE v2 brain model to predict cortical responses, wrapping the neural signal into a voiced 3D avatar that comments on content.
A user benchmarks Qwen 35B-A3B (a 35B MoE model) on a 12GB RTX 3060, finding that 12GB VRAM is a practical sweet spot for running the model with 32k context, achieving ~47 t/s generation.
Developer achieved 80+ t/s inference on Qwen3.6-27B with 262K context on a single RTX 4090 by combining MTP (Multi-Token Prediction) with TurboQuant's lossless KV cache compression, sharing their implementation fork and technical details.
AI2 released EMO, a Mixture of Experts language model with 1B active parameters out of 14B total, trained on 1 trillion tokens and featuring document-level routing where experts cluster around domains.
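EMO's actual router is not described in this summary; the toy sketch below only illustrates the general idea of document-level routing, where one routing decision covers every token in a document so experts can specialize by domain. All embeddings, dimensions, and expert labels are invented:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route_document(doc_embedding, expert_embeddings, top_k=2):
    # Score each expert against the document embedding (dot product),
    # then pick the top_k experts; every token in the document is then
    # processed by the same experts, unlike per-token routing.
    scores = [sum(d * e for d, e in zip(doc_embedding, exp))
              for exp in expert_embeddings]
    weights = softmax(scores)
    ranked = sorted(range(len(weights)), key=lambda i: -weights[i])
    return [(i, weights[i]) for i in ranked[:top_k]]

# Hypothetical experts that have clustered around code / prose / mixed.
experts = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
print(route_document([0.9, 0.1], experts, top_k=2))
```

A code-heavy document embedding routes to the code-leaning experts, which is the clustering-by-domain behavior the release notes describe.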
The article explores the difficulty and cost of model distillation, using DeepSeek R1 distilled into Llama 3 8B and Qwen 2.5 7B as examples, and asks why distilled models are not more common.
A developer shares real-world experiences with AI orchestration frameworks (LangGraph, CrewAI, AutoGen), noting trade-offs between ease of prototyping and production reliability, and asks the community about handling failures, human-in-the-loop, and token costs.
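One of the failure-handling patterns such discussions converge on is a framework-agnostic retry wrapper around flaky agent or tool calls. The sketch below is not the API of LangGraph, CrewAI, or AutoGen; the function names and parameters are invented for illustration:

```python
import random
import time

def with_retries(fn, attempts=3, base_delay=0.5):
    # Call fn(), retrying on any exception with exponential backoff
    # plus jitter; re-raise after the final attempt so failures still
    # surface to the caller (e.g. for human-in-the-loop escalation).
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** i) * (1 + random.random()))
```

Keeping retries at this layer, outside the orchestration framework, makes the behavior portable if the framework is swapped out, one of the trade-offs the post raises.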
The author built HeurChain, a memory broker that gives AI agents persistent, per-agent memory storage that survives restarts and supports both structured and semantic retrieval.
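HeurChain's actual API is not shown in the summary; the sketch below only illustrates the broker pattern it describes, using SQLite for durability across restarts. Every class and method name here is invented, and naive substring matching stands in for real semantic retrieval:

```python
import sqlite3

class MemoryBroker:
    """Hypothetical per-agent persistent memory store."""

    def __init__(self, path="memory.db"):
        # A file-backed SQLite database survives process restarts;
        # ":memory:" can be used for testing.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS memories ("
            "agent TEXT, key TEXT, value TEXT, "
            "PRIMARY KEY (agent, key))"
        )

    def remember(self, agent, key, value):
        # Upsert a structured memory scoped to one agent.
        self.db.execute(
            "INSERT OR REPLACE INTO memories VALUES (?, ?, ?)",
            (agent, key, value),
        )
        self.db.commit()

    def recall(self, agent, key):
        # Structured retrieval by exact key.
        row = self.db.execute(
            "SELECT value FROM memories WHERE agent=? AND key=?",
            (agent, key),
        ).fetchone()
        return row[0] if row else None

    def search(self, agent, query):
        # Substring match as a stand-in for semantic (embedding) search.
        return [v for (v,) in self.db.execute(
            "SELECT value FROM memories WHERE agent=? AND value LIKE ?",
            (agent, f"%{query}%"),
        )]

broker = MemoryBroker(":memory:")
broker.remember("agent-1", "user_lang", "prefers Python examples")
print(broker.recall("agent-1", "user_lang"))
```

Scoping rows by an `agent` column is the simplest way to get the per-agent isolation the post emphasizes; a real implementation would add embeddings for the semantic path.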