Tag
A developer shares lessons from building a local document-to-JSON extractor using llama3.2 3B on Ollama, highlighting that deterministic post-processing and schema-constrained outputs matter more than model size, while seeking feedback on hallucination and context truncation issues with long documents.
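As a rough illustration of what "deterministic post-processing" can mean here, the sketch below pulls a JSON object out of raw model text and normalizes it against a small schema. The field names in `SCHEMA` and the helper names are hypothetical, chosen only for the example; they are not taken from the original post, and the model call itself is left out.

```python
import json

# Hypothetical schema for illustration: field name -> expected Python type.
SCHEMA = {"invoice_number": str, "total": float, "currency": str}


def extract_json_object(raw: str) -> dict:
    """Parse the first JSON object found in raw model text, ignoring chatter around it."""
    start = raw.find("{")
    if start == -1:
        raise ValueError("no JSON object found in model output")
    # raw_decode stops at the end of the first complete JSON value.
    obj, _ = json.JSONDecoder().raw_decode(raw[start:])
    if not isinstance(obj, dict):
        raise ValueError("expected a JSON object")
    return obj


def normalize(obj: dict, schema: dict = SCHEMA) -> dict:
    """Keep only schema fields, coerce types, and fail loudly on missing values."""
    out = {}
    for key, typ in schema.items():
        if obj.get(key) is None:
            raise ValueError(f"missing field: {key}")
        value = obj[key]
        if typ is float and isinstance(value, str):
            # Strip thousands separators before numeric conversion.
            value = value.replace(",", "").strip()
        value = typ(value)
        if typ is str:
            value = value.strip()
        out[key] = value
    return out


if __name__ == "__main__":
    raw = 'Here you go:\n{"invoice_number": " INV-7 ", "total": "1,234.50", "currency": "USD", "note": "x"}'
    print(normalize(extract_json_object(raw)))
```

Fields outside the schema are dropped rather than trusted, and a missing field raises instead of being filled in, so any value the model did not actually return never reaches downstream code.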
The author upgraded their Hermes agents with TencentDB Agent Memory, using a local Qwen3.5-4B model via llama-server for structured JSON extraction and multi-step tool use, and implementing a resilient layered memory pipeline with cursor-based checkpointing.
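Cursor-based checkpointing, in its generic form, can be sketched as follows. This is not the TencentDB Agent Memory API or the author's pipeline; the function names and the JSON cursor file are assumptions made for the example. The idea is to persist the index of the next unprocessed item after each successful step, so a crashed run resumes where it stopped.

```python
import json
import os
from pathlib import Path
from typing import Callable, Sequence


def load_cursor(path: Path) -> int:
    """Return the index of the next unprocessed item (0 if no checkpoint exists)."""
    if path.exists():
        return json.loads(path.read_text())["cursor"]
    return 0


def save_cursor(path: Path, cursor: int) -> None:
    """Write the cursor to a temp file, then swap it in so a crash cannot leave a half-written file."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps({"cursor": cursor}))
    os.replace(tmp, path)


def process_from_checkpoint(
    items: Sequence[str], handle: Callable[[str], None], path: Path
) -> int:
    """Process items starting at the saved cursor; return how many were handled in this run."""
    cursor = load_cursor(path)
    for index in range(cursor, len(items)):
        handle(items[index])
        # Advance the cursor only after the item succeeded.
        save_cursor(path, index + 1)
    return len(items) - cursor
```

Because the cursor advances only after `handle` returns, an item that fails midway is retried on the next run, so the handler should tolerate being called twice on the same item.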
NuMind released NuExtract3, a 4B open-weight vision-language model based on Qwen3.5-4B, designed for document image-to-Markdown conversion, OCR, and structured data extraction. It is Apache-2.0 licensed and self-hostable, with quantized versions for low-VRAM hardware.
NuExtract3 is a 4B vision-language reasoning model for document understanding, enabling structured extraction and image-to-Markdown conversion.