Cached at:
04/20/26, 08:27 AM
# Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7
Source: https://simonwillison.net/2026/Apr/16/qwen-beats-opus/
16th April 2026
For anyone who has been \(inadvisably\) taking mypelican riding a bicycle benchmark (https://simonwillison.net/tags/pelican-riding-a-bicycle/)seriously as a robust way to test models, here are pelicans from this morning’s two big model releases—Qwen3\.6\-35B\-A3B from Alibaba (https://qwen.ai/blog?id=qwen3.6-35b-a3b)andClaude Opus 4\.7 from Anthropic (https://www.anthropic.com/news/claude-opus-4-7)\.
Here’s the Qwen 3\.6 pelican, generated usingthis 20\.9GB Qwen3\.6\-35B\-A3B\-UD\-Q4\_K\_S\.gguf (https://huggingface.co/unsloth/Qwen3.6-35B-A3B-GGUF/blob/main/Qwen3.6-35B-A3B-UD-Q4_K_S.gguf)quantized model by Unsloth, running on my MacBook Pro M5 viaLM Studio (https://lmstudio.ai/)\(and thellm\-lmstudio (https://github.com/agustif/llm-lmstudio)plugin\)—transcript here (https://gist.github.com/simonw/4389d355d8e162bc6e4547da214f7dd2):
The bicycle frame is the correct shape. There are clouds in the sky. The pelican has a dorky looking pouch. A caption on the ground reads Pelican on a Bicycle!
And here’s one I got from Anthropic’sbrand new Claude Opus 4\.7 (https://www.anthropic.com/news/claude-opus-4-7)\(transcript (https://gist.github.com/simonw/afcb19addf3f38eb1996e1ebe749c118)\):
The bicycle frame is entirely the wrong shape. No clouds, a yellow sun. The pelican is looking behind itself, and has a less pronounced pouch than I would like.
I’m giving this one to Qwen 3\.6\. Opus managed to mess up the bicycle frame\!
I tried Opus a second time passing`thinking\_level: max`\. It didn’t do much better \(transcript (https://gist.github.com/simonw/7566e04a81accfb9affda83451c0f363)\):
The bicycle frame is entirely the wrong shape but in a different way. Lines are more bold. Pelican looks a bit more like a pelican.
#### I don’t think Qwen are cheating
A lot of people areconvinced that the labs train for my stupid benchmark (https://simonwillison.net/2025/Nov/13/training-for-pelicans-riding-bicycles/)\. I don’t think they do, but honestly this result did give me a little glint of suspicion\. So I’m burning one of my secret backup tests—here’s what I got from Qwen3\.6\-35B\-A3B and Opus 4\.7 for “Generate an SVG of a flamingo riding a unicycle”:
I’m giving this one to Qwen too, partly for the excellent`<\!\-\- Sunglasses on flamingo\! \-\-\>`SVG comment\.
#### What can we learn from this?
The pelican benchmark has always been meant as a joke—it’s mainly a statement on how obtuse and absurd the task of comparing these models is\.
The weird thing about that joke is that, for the most part, there has been a direct correlation between the quality of the pelicans produced and the general usefulness of the models\. Thosefirst pelicans from October 2024 (https://simonwillison.net/2024/Oct/25/pelicans-on-a-bicycle/)were junk\. Themore recent entries (https://simonwillison.net/tags/pelican-riding-a-bicycle/)have generally been much, much better—to the point that Gemini 3\.1 Pro producesillustrations you could actually use somewhere (https://simonwillison.net/2026/Feb/19/gemini-31-pro/), provided you had a pressing need to illustrate a pelican riding a bicycle\.
Today, even that loose connection to utility has been broken\. I have enormous respect for Qwen, but I very much doubt that a 21GB quantized version of their latest model is more powerful or useful than Anthropic’s latest proprietary release\.
If the thing you need is an SVG illustration of a pelican riding a bicycle though, right now Qwen3\.6\-35B\-A3B running on a laptop is a better bet than Opus 4\.7\!