Tag
Perceptron Inc. released its flagship video analysis model, Mk1, claiming 80-90% lower cost than competitors along with strong performance on spatial and video reasoning benchmarks.
This paper introduces a neuro-symbolic pipeline that uses 2.5-D decomposition to improve the accuracy of LLM-based spatial construction. Vertical coordinate calculation is offloaded to a deterministic executor, and the paper reports high accuracy on benchmarks and on edge hardware.
Google DeepMind releases Gemini Robotics-ER 1.6, an upgraded model with enhanced visual and spatial understanding to enable robots to better reason about and interact with the physical world.
Google DeepMind introduces Gemini Robotics-ER 1.6, a specialized AI model enhancing embodied reasoning for robotics through improved spatial awareness, task planning, and instrument reading capabilities.