@stevibe: Qwen3.6 35B A3B can't fill out a paper form on its own. But give it NVIDIA's LocateAnything-3B — the #1 trending model …

X AI KOLs Timeline 06/02/26, 05:48 PM News

multi-model tool-use vision-language form-filling small-models nvidia qwen

Summary

A demonstration shows that Qwen3.6 35B A3B, paired with NVIDIA's LocateAnything-3B as a vision tool, can fill out a paper form accurately by detecting field positions. The author takes this as evidence that a combination of small models can do work that would otherwise call for a single large one.

Original Article

Cached at: 06/02/26, 09:37 PM

Qwen3.6 35B A3B can’t fill out a paper form on its own. But give it NVIDIA’s LocateAnything-3B — the #1 trending model on HuggingFace — as its eyes, and the two small models get it done together.

(The test: place each element at the right pixel position on a blank form image, not type into a field.)

Setup:

Qwen is the brain (main model), LocateAnything is the eyes (helper model acting as a tool). I gave Qwen a new tool: ask “where’s the email field?” and LocateAnything returns the exact x, y, width, height. The blue boxes on the screen are its detections. Look how tight they are — it nails every field.
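The setup above can be sketched as a single tool exposed to the main model. This is only an illustration: the tool name `locate`, its schema (written in the common JSON-schema function-calling style), and the `stub_locator` stand-in with invented coordinates are all assumptions, not the author's code or LocateAnything's real interface.

```python
import json
from dataclasses import dataclass, asdict
from typing import Callable, Optional


@dataclass
class Box:
    """Bounding box in image pixels: top-left corner plus size."""
    x: int
    y: int
    width: int
    height: int


# Hypothetical tool schema the main model would be shown.
LOCATE_TOOL = {
    "name": "locate",
    "description": "Find a form field in the image and return its bounding box.",
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}


def stub_locator(query: str) -> Optional[Box]:
    """Stand-in for the helper vision model; these coordinates are invented."""
    known = {
        "email field": Box(x=120, y=340, width=400, height=32),
        "phone field": Box(x=120, y=390, width=400, height=32),
    }
    return known.get(query.lower().strip())


def handle_tool_call(
    name: str,
    arguments_json: str,
    locator: Callable[[str], Optional[Box]] = stub_locator,
) -> str:
    """Run one tool call from the main model and return a JSON result string."""
    if name != LOCATE_TOOL["name"]:
        return json.dumps({"error": f"unknown tool: {name}"})
    args = json.loads(arguments_json)
    box = locator(args["query"])
    if box is None:
        return json.dumps({"error": "not found"})
    return json.dumps(asdict(box))
```

In a real run, `locator` would wrap a call to the helper model with the form image attached, and the JSON string would be fed back to the main model as the tool result for that turn.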

Result:

Qwen3.6 35B A3B + LocateAnything-3B: form completed, all info correct. Name, DOB, ID, gender, marital status, nationality, email, phone, address, postal code: all landed in the right field areas. Character-box alignment is still a touch loose, but every value is where it belongs. Run stats: 9m10s, 224.5k input tokens, 24.3k output tokens, 21 turns.
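The character-box remark presumably concerns fields drawn as a row of per-character cells. The post doesn't show the placement logic, so the following is only a sketch of one way to turn a field's box into text positions, assuming equal-width cells and text anchored at its center point. The function names and numbers are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def field_center(x: int, y: int, width: int, height: int) -> Point:
    """Center of a field box, usable as a middle-anchored text position."""
    return (x + width / 2, y + height / 2)


def char_cell_centers(
    x: int, y: int, width: int, height: int, n_cells: int, text: str
) -> List[Tuple[str, Point]]:
    """Pair each character with the center of its cell (equal-width cells assumed)."""
    if len(text) > n_cells:
        raise ValueError("text has more characters than the field has cells")
    cell_w = width / n_cells
    cy = y + height / 2
    return [(ch, (x + (i + 0.5) * cell_w, cy)) for i, ch in enumerate(text)]
```

With a drawing library such as Pillow, these points could be passed to `ImageDraw.text` with `anchor="mm"`, which centers the text on the given point (this needs a TrueType font). If the detected box is slightly off or the cells aren't equal width, the characters drift, which would be consistent with the "touch loose" alignment described above.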

Why it matters:

Qwen alone can’t finish this test. Bolt on a 3B model that does exactly one thing (locate) and suddenly it can. A combination of small models can do the work of a single large one.

Kept the vision, as it took screen captures to verify the results too.

Yeah, I actually started with the 9B, but its tool calling wasn’t great and its reasoning about which data goes in which field was a bit off, so I finally switched to the 35B A3B.

I’d say it was more stable. The 35B A3B still has some repeated tool calls sometimes, but the 27B was just solid. The only problem is it’s slower than the 35B A3B on the same hardware.

Similar Articles

@stevibe: I explored a further possibility with local models: Qwen3.6 35B A3B + NVIDIA LocateAnything-3B as a local Computer Use …

The Qwen 3.6 35B A3B hype is real!!!

nvidia/Qwen3.6-35B-A3B-NVFP4 · Hugging Face

Qwen3.6 35Ba3 has changed my workflows and even how I use my computer

Wow! Qwen 3.6:35b-a3b on a 3090... pretty amazing.
