@pcuenq: My data point: working on two projects in parallel with Pi + llama.cpp + Qwen-3.6-35B-A3B (I prefer the MoE) This work…
Summary
A user reports successfully running parallel projects using Pi and llama.cpp with the Qwen-3.6-35B-A3B model on an older M1 Max machine, showing the setup holds up for real day-to-day work on older hardware.
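For readers who want to reproduce this kind of parallel-session setup, a minimal sketch: llama-server can serve multiple concurrent sessions through its --parallel flag. The model path, context size, and port below are placeholder assumptions, not pcuenq's actual configuration.

```sh
# Hypothetical invocation serving two concurrent agent sessions.
# -c   total context window, divided across parallel slots
# -np  number of server slots, i.e. concurrent sessions
# -ngl layers to offload to the GPU (Metal on Apple Silicon)
llama-server -m ~/models/qwen3.6-35b-a3b-q4_k_m.gguf \
  -c 65536 -np 2 -ngl 99 --port 8080
```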
Similar Articles
Running Qwen3.6-35B-A3B Locally for Coding Agent: My Setup & Working Config
A detailed guide for running the 35B-parameter Qwen3.6 model locally on Apple Silicon with llama.cpp to power the pi coding agent, including optimized configuration flags and sampling parameters.
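As a rough illustration of what such a config tends to look like, llama-server exposes sampling knobs like temperature, top-k, top-p, and min-p as command-line flags. The GGUF path and the specific values here are generic placeholders, not the article's tuned settings.

```sh
# Sketch of a llama-server launch with explicit sampling parameters.
# --jinja applies the model's chat template; values are illustrative.
llama-server -m ~/models/qwen3.6-35b-a3b-q4_k_m.gguf \
  -ngl 99 -c 32768 --jinja \
  --temp 0.7 --top-k 20 --top-p 0.8 --min-p 0.05 \
  --port 8080
```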
@mitsuhiko: If you don't have a 128GB mac, I also have a pi-llamacpp extension that just configures 4 versions of Qwen 3.6. https:/…
mitsuhiko releases a pi-llamacpp extension that automates the setup and management of local LLM inference using llama.cpp, specifically supporting various quantized versions of the Qwen 3.6 model.
@_lewtun: You can now have an AI researcher running on your laptop 24/7 for free! Running Qwen3-35B-A3B with llama.cpp and a 4-bi…
The article highlights the ability to run Qwen3-35B-A3B locally on a laptop for free using llama.cpp and Unsloth 4-bit quantization.
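A minimal way to try this yourself: llama.cpp can pull GGUF quantizations directly from Hugging Face with the -hf flag. The repo name below is an assumption for illustration; check Unsloth's Hugging Face page for the actual one.

```sh
# Fetch a 4-bit Unsloth GGUF from Hugging Face and chat locally.
# Repo name and quant tag are illustrative assumptions.
llama-cli -hf unsloth/Qwen3-35B-A3B-GGUF:Q4_K_M -ngl 99 -c 16384
```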
More Qwen3.6-27B MTP success but on dual Mi50s
The article benchmarks the Qwen3.6-27B model in llama.cpp with Multi-Token Prediction (MTP) and tensor parallelism across dual Mi50 GPUs, demonstrating significant decode speedups.
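For the dual-GPU part, llama.cpp splits a model across cards with --tensor-split. A minimal sketch is below; the model path and split ratio are placeholders, and the MTP-specific flags depend on the build, so they are omitted here.

```sh
# Split the model evenly across two GPUs (e.g. two Mi50s under ROCm).
# -sm row enables row-wise tensor parallelism; layer split is the default.
# Model path and split ratio are illustrative assumptions.
llama-server -m ~/models/qwen3.6-27b-q4_k_m.gguf \
  --tensor-split 1,1 -sm row -ngl 99 -c 16384
```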
@leopardracer: https://x.com/leopardracer/status/2055341758523883631
A user shares their experience setting up a dual-GPU local AI lab with an RTX 4080 Super and a 5060 Ti, running Qwen 3.6 models via llama.cpp and llama-swap to reduce API costs and enable unrestricted experimentation.
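llama-swap sits in front of llama.cpp as a proxy and starts or stops model servers on demand. A config sketch along the lines of its documented YAML format is below; the model names, paths, and flags are placeholder assumptions, not the user's actual setup.

```yaml
# llama-swap config sketch: one entry per model; llama-swap launches the
# cmd when that model is requested and swaps it out for the next one.
# Names, paths, and flags are illustrative assumptions.
models:
  "qwen3.6-35b-a3b":
    cmd: llama-server --port ${PORT} -m /models/qwen3.6-35b-a3b-q4_k_m.gguf -ngl 99
  "qwen3.6-coder":
    cmd: llama-server --port ${PORT} -m /models/qwen3.6-coder-q4_k_m.gguf -ngl 99
```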