@pcuenq: My data point: working on two projects in parallel with Pi + llama.cpp + Qwen-3.6-35B-A3B (I prefer the MoE)


Summary

A user reports working on two projects in parallel with Pi, llama.cpp, and the Qwen-3.6-35B-A3B MoE model on a 4.5-year-old M1 Max (64 GB), demonstrating that the setup is practically usable for real work, not just demos.

My data point: working on two projects in parallel with Pi + llama.cpp + Qwen-3.6-35B-A3B (I prefer the MoE). This works on my M1 Max (64 GB), which I bought 4.5 years ago. "Works" as in "you can get work done", not just "runs for a demo". https://x.com/julien_c/status/2047647522173104145?s=20
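The post doesn't give the exact invocation, but a setup like this is typically served with llama.cpp's `llama-server`, which exposes an OpenAI-compatible HTTP API that tools such as Pi can point at. A minimal sketch, assuming a local GGUF build of the model (the file name below is hypothetical):

```shell
# Hedged sketch: serve a local GGUF model with llama.cpp.
# -ngl 99 offloads all layers to the GPU (Metal on Apple Silicon),
# -c sets the context window, --port picks the local API port.
llama-server -m ./qwen-3.6-35b-a3b.gguf -c 8192 -ngl 99 --port 8080
```

A client would then talk to `http://localhost:8080/v1/...` as it would to any OpenAI-compatible endpoint. The model path, context size, and port are illustrative choices, not the author's configuration.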

Similar Articles

More Qwen3.6-27B MTP success but on dual Mi50s

Reddit r/LocalLLaMA

The article benchmarks the Qwen3.6-27B model using Multi-Token Prediction (MTP) and tensor parallelism on dual Mi50 GPUs, demonstrating significant speedups via llama.cpp.