I trust-scored 171 open-source AI agents — most can't prove their supply chain

Reddit r/AI_Agents Tools

Summary

A developer created an independent trust registry for 171 open-source AI agents, scoring them on verifiable trust signals like supply chain security and maintenance, finding that only three agents achieved a Grade A rating while many popular agents lacked basic verification.

I've been building an independent trust registry for open-source AI agents and the findings have been eye-opening.

The short version: I track 171 agents across 14 categories (coding agents, frameworks, browser agents, memory systems, etc.) and score them on verifiable trust signals, not stars or hype. The signals include OSSF Scorecard, build provenance (SLSA), signed commits, license transparency, and maintenance patterns.

**What surprised me:**

* Only 3 out of 171 agents have enough independent signal coverage to earn a Grade A (broad verifiable evidence across multiple dimensions)
* Some of the most-starred agents score poorly on trust because they have zero supply-chain verification: no scorecard, no provenance, no signed commits
* The agent with 166k GitHub stars ranked #108 on trust (partly a data bug I've since fixed, partly genuine: popularity ≠ verifiability)
* Agents that *do* publish provenance and pass OSSF checks are often mid-tier on stars but rank near the top on trust

**How the scoring works:**

The formula weights signals by how hard they are to fake:

* Safety/Integrity (30 pts): OSSF Scorecard, build provenance, signed commits
* Identity (20 pts): verified listing + provenance binding
* Transparency (20 pts): license + OSSF transparency checks
* Maintenance (20 pts): commit freshness + activity
* Adoption (10 pts): log-scaled, capped stars + downloads

Then the raw score gets multiplied by a confidence factor (how many signal types we actually have data for), so an agent we can't verify much about *can't* reach the top tier even if it's popular.

**Why I built this:**

With MCP and A2A taking off, agents are about to start calling other agents. There's currently no standardized way to answer "should Agent A trust Agent B?" before they interact. I'm trying to build toward that: the trust data is open (CC BY 4.0), machine-readable, and there's a compare tool with radar charts if you want to see how specific agents stack up.

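
To make the scoring shape above concrete, here's a rough Python sketch. Only the category point caps (30/20/20/20/10) and the "raw score times confidence factor" structure come from the post. Everything else is an assumption for illustration, not the registry's actual code: each category is collapsed into a single 0-to-1 fraction, the confidence factor is taken to be the share of categories that have any data, and `STAR_CAP`, `adoption_fraction`, and `trust_score` are hypothetical names.

```python
import math
from typing import Mapping, Optional

# Point caps per category, as listed in the post.
CATEGORY_CAPS = {
    "safety_integrity": 30,
    "identity": 20,
    "transparency": 20,
    "maintenance": 20,
    "adoption": 10,
}

STAR_CAP = 100_000  # assumed cap; the post does not state the real value


def adoption_fraction(stars: int, cap: int = STAR_CAP) -> float:
    """Log-scale a star count and cap it, returning a value in [0, 1]."""
    capped = min(max(stars, 0), cap)
    return math.log10(1 + capped) / math.log10(1 + cap)


def trust_score(signals: Mapping[str, Optional[float]]) -> float:
    """Compute raw points, then scale by a confidence factor.

    `signals` maps a category name to a fraction in [0, 1], or to None
    (or a missing key) when there is no data for that category.
    """
    raw = 0.0
    known = 0
    for category, cap in CATEGORY_CAPS.items():
        value = signals.get(category)
        if value is None:
            continue  # no data: contributes nothing and lowers confidence
        known += 1
        raw += cap * min(max(value, 0.0), 1.0)
    # Assumed confidence factor: share of categories with any data.
    confidence = known / len(CATEGORY_CAPS)
    return raw * confidence


# A very popular agent with no supply-chain, identity, or transparency data.
popular = trust_score({
    "maintenance": 0.9,
    "adoption": adoption_fraction(166_000),
})

# A mid-popularity agent with data in every category.
verified = trust_score({
    "safety_integrity": 0.8,
    "identity": 1.0,
    "transparency": 0.9,
    "maintenance": 0.8,
    "adoption": adoption_fraction(5_000),
})

print(f"popular={popular:.1f} verified={verified:.1f}")
```

Under these assumptions the popular agent is penalized twice: it earns no points in the categories it lacks, and the confidence factor shrinks what it does earn. That is one way to get the "popular but unverifiable can't reach the top tier" behavior described above.
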
Would love feedback on the methodology or agents you think are missing. The full leaderboard is at hvtracker and the methodology is published.