lucataco/moondream2
Summary
moondream2 is a compact vision-language model designed for efficient inference on edge devices; benchmark results and usage instructions are provided.
Cached at: 05/22/26, 10:15 PM
Similar Articles
MiniCPM-V 4.6
MiniCPM-V 4.6 is an ultra-efficient 1.3B vision-language model optimized for mobile devices.
stepfun-ai/Step-3.7-Flash
Step 3.7 Flash is a 198B-parameter sparse MoE vision-language model with 11B active parameters per token, supporting 256K context and three reasoning levels, designed for high-throughput agentic workflows.
@AdinaYakup: Step-3.7-Flash, new VL model from @StepFun_ai: 198B / 11B active (MoE), 256K context, 3 reasoning levels, up to 400 tokens/sec
StepFun releases Step-3.7-Flash, a new large vision-language MoE model with 198B parameters (11B active), 256K context, and up to 400 tokens/sec inference speed.
SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion
SmolDocling is a compact 256M-parameter vision-language model designed for end-to-end multi-modal document conversion. It introduces DocTags, a new universal markup format that captures page elements together with their locations, and it competes with models 27 times larger.
Liquid AI reveals 8B-A1B MoE trained on 38T
Liquid AI released LFM2.5-8B-A1B, an edge MoE model trained on 38T tokens with a 128K context window, improved tool calling, and reasoning capabilities, available on Hugging Face.