Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7

Simon Willison's Blog News

Summary

Simon Willison compares Qwen3.6-35B-A3B running locally on a MacBook Pro against Claude Opus 4.7, finding that Qwen produces better SVG illustrations of pelicans riding bicycles and flamingos on unicycles, though he notes this narrow benchmark doesn't reflect broader model capabilities.

No content available
Original Article
View Cached Full Text

Cached at: 04/20/26, 08:27 AM

# Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7 Source: https://simonwillison.net/2026/Apr/16/qwen-beats-opus/ 16th April 2026 For anyone who has been \(inadvisably\) taking mypelican riding a bicycle benchmark (https://simonwillison.net/tags/pelican-riding-a-bicycle/)seriously as a robust way to test models, here are pelicans from this morning’s two big model releases—Qwen3\.6\-35B\-A3B from Alibaba (https://qwen.ai/blog?id=qwen3.6-35b-a3b)andClaude Opus 4\.7 from Anthropic (https://www.anthropic.com/news/claude-opus-4-7)\. Here’s the Qwen 3\.6 pelican, generated usingthis 20\.9GB Qwen3\.6\-35B\-A3B\-UD\-Q4\_K\_S\.gguf (https://huggingface.co/unsloth/Qwen3.6-35B-A3B-GGUF/blob/main/Qwen3.6-35B-A3B-UD-Q4_K_S.gguf)quantized model by Unsloth, running on my MacBook Pro M5 viaLM Studio (https://lmstudio.ai/)\(and thellm\-lmstudio (https://github.com/agustif/llm-lmstudio)plugin\)—transcript here (https://gist.github.com/simonw/4389d355d8e162bc6e4547da214f7dd2): The bicycle frame is the correct shape. There are clouds in the sky. The pelican has a dorky looking pouch. A caption on the ground reads Pelican on a Bicycle! And here’s one I got from Anthropic’sbrand new Claude Opus 4\.7 (https://www.anthropic.com/news/claude-opus-4-7)\(transcript (https://gist.github.com/simonw/afcb19addf3f38eb1996e1ebe749c118)\): The bicycle frame is entirely the wrong shape. No clouds, a yellow sun. The pelican is looking behind itself, and has a less pronounced pouch than I would like. I’m giving this one to Qwen 3\.6\. Opus managed to mess up the bicycle frame\! I tried Opus a second time passing`thinking\_level: max`\. It didn’t do much better \(transcript (https://gist.github.com/simonw/7566e04a81accfb9affda83451c0f363)\): The bicycle frame is entirely the wrong shape but in a different way. Lines are more bold. Pelican looks a bit more like a pelican. #### I don’t think Qwen are cheating A lot of people areconvinced that the labs train for my stupid benchmark (https://simonwillison.net/2025/Nov/13/training-for-pelicans-riding-bicycles/)\. I don’t think they do, but honestly this result did give me a little glint of suspicion\. So I’m burning one of my secret backup tests—here’s what I got from Qwen3\.6\-35B\-A3B and Opus 4\.7 for “Generate an SVG of a flamingo riding a unicycle”: I’m giving this one to Qwen too, partly for the excellent`<\!\-\- Sunglasses on flamingo\! \-\-\>`SVG comment\. #### What can we learn from this? The pelican benchmark has always been meant as a joke—it’s mainly a statement on how obtuse and absurd the task of comparing these models is\. The weird thing about that joke is that, for the most part, there has been a direct correlation between the quality of the pelicans produced and the general usefulness of the models\. Thosefirst pelicans from October 2024 (https://simonwillison.net/2024/Oct/25/pelicans-on-a-bicycle/)were junk\. Themore recent entries (https://simonwillison.net/tags/pelican-riding-a-bicycle/)have generally been much, much better—to the point that Gemini 3\.1 Pro producesillustrations you could actually use somewhere (https://simonwillison.net/2026/Feb/19/gemini-31-pro/), provided you had a pressing need to illustrate a pelican riding a bicycle\. Today, even that loose connection to utility has been broken\. I have enormous respect for Qwen, but I very much doubt that a 21GB quantized version of their latest model is more powerful or useful than Anthropic’s latest proprietary release\. If the thing you need is an SVG illustration of a pelican riding a bicycle though, right now Qwen3\.6\-35B\-A3B running on a laptop is a better bet than Opus 4\.7\!

Similar Articles

Switching from Opus 4.7 to Qwen-35B-A3B

Reddit r/LocalLLaMA

Community discussion about switching from Claude Opus 4.7 to Qwen-35B-A3B for a coding agent use case, seeking user experiences and performance comparisons.

Qwen 3.6 35B A3B vs Qwen 3.5 122B A10B

Reddit r/LocalLLaMA

User reports Qwen 3.5 122B significantly outperforms Qwen 3.6 35B on multi-step tasks despite benchmark claims, questioning if quantization or setup issues are to blame.

Layman's comparison on Qwen3.6 35b-a3b and Gemma4 26b-a4b-it

Reddit r/LocalLLaMA

A user compares Qwen3.6 35B-A3B and Gemma 4 26B-A4B-IT running locally on a 16GB VRAM GPU via LM Studio, finding Qwen3.6 produces more detailed outputs while both run at comparable speeds. The post is an informal community comparison using quantized models.

The Qwen 3.6 35B A3B hype is real!!!

Reddit r/LocalLLaMA

The author benchmarks small local LLMs, highlighting Qwen 3.6 35B A3B for its superior ability to map academic code to research papers compared to models like Gemma 4 and Nemotron 3 Nano.

Qwen3.6-35B-A3B-Abliterated-Heretic-MLX-4bit

Reddit r/LocalLLaMA

The user reviews a quantized and fine-tuned version of the Qwen3.6-35B model optimized for Apple Silicon via MLX, praising its speed, intelligence, and lack of safety disclaimers.