@Prince_Canuma: Quick update on the water situation M3 Ultra and Titan (RTX6000 Pro) seem to have recovered with little to no visible d…

X AI KOLs Timeline 05/18/26, 07:15 PM News

water-damage local-ai mlx vlm autocomplete git zed-ide

Summary

Personal update on hardware water damage recovery, showcasing MLX-VLM serving Qwen3-4B-Instruct locally on an RTX6000 Pro at ~300 tok/s for autocomplete and git commit generation via Zed IDE.

Cached at: 05/19/26, 02:37 AM

Quick update on the water situation

M3 Ultra and Titan (RTX6000 Pro) seem to have recovered with little to no visible damage.

The main issues are with my MacBook, which is in service, and Titan CPU temperatures being above average when idling (58°C, up from 35°C prior to the water incident).

Anyways, here is a video of MLX-VLM serving Qwen3-4B-Instruct on Titan (~300 tok/s) to do autocomplete and git commit message generation completely locally via Zed IDE.

Similar Articles

@Prince_Canuma: My home compute for MLX and research: • M3 Ultra — 512GB (sponsored by community + @wai_protocol) • RTX PRO 6000 — 96GB…

X AI KOLs Timeline

A researcher shares their home compute setup for MLX and AI research, featuring M3 Ultra with 512GB, RTX PRO 6000 with 96GB, and M3 Max with 96GB for model porting and stress testing.

@TeksEdge: Solved! Qwen3.6-27B-FP8 is now running on Intel Arc Pro B70! LocalMaxxing shows a working 4× Arc Pro B70 32GB run at ~5…

X AI KOLs Following

Qwen3.6-27B-FP8 model is now running on Intel Arc Pro B70 GPUs at ~50 tok/s with a vLLM bug fix, marking a significant milestone for Intel GPU local AI inference.

@Snixtp: https://x.com/Snixtp/status/2055734339346768225

X AI KOLs Timeline

A user benchmarks the MTP variant of Qwen3.6 27B against the normal version on a single RTX 3090 using llama.cpp, finding MTP offers up to 2.37x faster generation at long contexts (32k-64k) but with slower prefill and no concurrency support yet.

@tunguz: After seeing these tweets, I decided to try it out on my own old Ubuntu computer with RTX 1070 GPU (the one that I just…

X AI KOLs Following

A user reports successfully running Qwen3 8B locally on an older GTX 1070 GPU, demonstrating that modern LLMs can run on decade-old hardware with decent performance.

Found a way to cool the DGX

Reddit r/LocalLLaMA

A user reports successfully using tap water to cool a DGX server while running the Qwen3.5-122b model at high GPU utilization, maintaining safe temperatures.