audio

#audio

Introducing Gemma 4 12B: a unified, encoder-free multimodal model

Google DeepMind Blog ↗ · 13h ago

Google DeepMind announces Gemma 4 12B, an encoder-free multimodal model that integrates vision and audio directly into the LLM backbone. It delivers advanced reasoning and agentic capabilities on laptops with 16GB of RAM and is released under the Apache 2.0 license.

Teenage Engineering: Introducing APC-2

Hacker News Top ↗ · 2d ago

Teenage Engineering announces APC-2, a professional audio disc recording system for cutting vinyl records in real time, built in collaboration with SUPERSENSE.

@victormustar: Before the week ends, let's acknowledge one of the most INSANE week ever for open AI, with 25+ notable open-weight drop…

X AI KOLs Following ↗ · 4d ago

A recap of an extraordinary week in open AI, featuring over 25 open-weight model releases across LLMs, image generation, audio/speech, vision, and video/3D, with notable contributions from NVIDIA, Google, and others.

New Texas Instruments 5532 chips are not the 5532s we’ve used for decades

Hacker News Top ↗ · 6d ago

Texas Instruments has released new 5532 chips that differ from the classic versions used for decades, potentially impacting audio applications.

4 Best Alexa Speakers (2026): Echo Dot Max, Echo Dot, Echo Show 11

Wired ↗ · 2026-06-02

Wired reviews the four best Alexa speakers and smart displays for 2026, naming the Echo Show 11 the top smart display and the Echo Show 8 (3rd gen) the best affordable option, and notes trade-offs around ads and speaker quality.

@Xiaomi: Xiaomi Sound Play. Small in your palm. Big in every moment. 18W powerful sound. Colorful lighting effects. 14-hour batt…

X AI KOLs Following ↗ · 2026-05-28

Xiaomi announces Sound Play, a compact portable speaker with 18W output, colorful lighting, 14-hour battery, and IP68 durability.

@Xiaomi: That's #XiaomiBuds6. A comfortable semi-in-ear fit, richer sound, clearer calls, and smarter everyday convenience.

X AI KOLs Following ↗ · 2026-05-28

Xiaomi announces the Buds 6, featuring a comfortable semi-in-ear fit, richer sound, clearer calls, and smarter everyday convenience.

ChildVox: A Speech, Audio, and Large Audio-Language Model Benchmark in Understanding and Characterizing Sound across Childhood

Hugging Face Daily Papers ↗ · 2026-05-28

ChildVox presents a comprehensive benchmark for analyzing children's acoustic communication across developmental stages, integrating over 20 sub-tasks from 17 child-centered audio and speech datasets.

Cearvol’s Wave Design Fights Off Hearing Loss and Aging Stigma

Wired ↗ · 2026-05-26

The Cearvol Wave Lite earbuds offer moderate hearing assistance but fall short in audio quality, especially for conversation and movie-watching, though they are reasonably priced for the hearing aid market.

Show HN: Audiomass – a free, open-source multitrack audio editor for the web

Hacker News Top ↗ · 2026-05-24

Audiomass is a free, open-source multitrack audio editor that runs entirely in the web browser.

The mysterious XF86AudioPlay issue

Lobsters Hottest ↗ · 2026-05-22

A blog post on debugging a recurring XF86AudioPlay key event in Emacs, which the author traced to a headphone device driver with the help of libinput and evtest.

Marshall brings ANC back to its smaller on-ear wireless headphones

The Verge ↗ · 2026-05-19

Marshall announces the Milton A.N.C., a new pair of on-ear wireless headphones with active noise cancellation, available for $229.99. It offers up to 80 hours of playtime without ANC, Bluetooth 6.0, spatial audio, and a replaceable battery.

loopmaster – Livecoding Music IDE

Hacker News Top ↗ · 2026-05-18

loopmaster is an IDE for livecoding music, enabling real-time algorithmic music composition.

Jokes aside this just looks and sounds way too well done

Reddit r/ArtificialInteligence ↗ · 2026-05-18

A comment praising a product or demo for its high-quality appearance and sound.

Leaked images reveal Sony’s 10th anniversary ‘ColleXion’ headphones

The Verge ↗ · 2026-05-18

Leaked images and details reveal Sony's upcoming 10th anniversary ColleXion headphones, featuring premium design, updated audio drivers, and a $649 price tag, expected to launch May 19th.

AudioMosaic: Contrastive Masked Audio Representation Learning

arXiv cs.LG ↗ · 2026-05-15

AudioMosaic introduces a contrastive learning-based audio encoder that uses structured time-frequency masking on spectrogram patches for efficient large-batch training, achieving state-of-the-art performance on audio benchmarks and improving audio-language models.

Testing an agent skill that turns prompts into audio courses and lets you publish to Spotify

Reddit r/AI_Agents ↗ · 2026-05-13

The author describes testing an agent workflow that converts prompts into audio courses for publishing to Spotify, with potential uses like meeting briefings, team updates, and study notes.

@OpenAI: Listen to the OpenAI Podcast on— Spotify https://open.spotify.com/show/0zojMEDizKMh3aTxnGLENP… Apple https://podcasts.a…

X AI KOLs ↗ · 2026-04-17

OpenAI announces the availability of their podcast on major streaming platforms including Spotify, Apple Podcasts, and YouTube.

Socrati

Product Hunt ↗ · 2026-04-14

Socrati, newly launched on Product Hunt, generates personal knowledge podcasts from various sources.

OmniGUI: Benchmarking GUI Agents in Omni-Modal Smartphone Environments

Hugging Face Daily Papers ↗ · 2026-04-03

OmniGUI introduces a step-level benchmark for GUI agents that integrates static images, synchronous audio, and video clips to simulate real smartphone interactions. Evaluation shows current models struggle with temporal and auditory inputs, highlighting the need for omni-modal capabilities.
