Tag
Vision Agents is an open-source Python framework for building multimodal AI agents that process video and audio in real time. It enables conversational agents to adapt their voice based on facial expressions and gaze using MediaPipe.
Adala is an open-source framework for autonomous data labeling agents that learn skills iteratively through interaction with ground truth datasets and LLM runtimes.