Tag
Vision Agents is an open-source Python framework for building multimodal AI agents that process video and audio in real time. It enables conversational agents to adapt their voice based on facial expressions and gaze using MediaPipe.