I trust-scored 171 open-source AI agents — most can't prove their supply chain
Summary
A developer built an independent trust registry covering 171 open-source AI agents and scored each one on verifiable trust signals such as supply chain security and maintenance. Only three agents earned a Grade A rating, and many popular agents lacked basic verification.
Similar Articles
If AI agents become everywhere, how do we know which ones to trust?
As AI agents become ubiquitous, the challenge shifts from comparing performance to establishing trust and reputation, requiring new discovery and verification systems.
Researchers at MIT documented 30 AI agents major labs are deploying. Only 4 had public docs saying what the agent does, what it can't do, and what happens if it breaks.
MIT researchers compiled the 2025 AI Agent Index, documenting 30 deployed AI agents from major labs. Only 4 had public documentation explaining what the agent does, what its limitations are, and how it fails, revealing major transparency gaps.
Built an Open-Source Tool That Finds Missing Validation, Retries, and Error Handling in AI Agent Systems
We released Trustabl Agent Analyzer, an open-source tool that scans AI agent repositories to find missing validation, retries, and error handling, generating a privacy-preserving local report.
I built a live ranking of every AI agent and foundation model (open source)
A developer launched AgentTape, a live ranking site that aggregates data from multiple sources (GitHub, Hugging Face, OpenRouter, etc.) to score and compare public AI agents and foundation models, aiming to provide a more holistic evaluation beyond benchmarks.
Towards trustworthy agentic AI: a comprehensive survey of safety, robustness, privacy, and system security
This survey provides a comprehensive examination of trustworthy agentic AI, focusing on safety, robustness, privacy, and system security. It clarifies key concepts, identifies risks along the agent workflow, summarizes mitigation strategies, and consolidates evaluation metrics and benchmarks, aiming to serve as a practical reference for deploying agentic AI in high-stakes environments.