ClawHub Security Signals: When VirusTotal, Static Analysis, and SkillSpector Disagree
Summary
This paper investigates security scanner disagreement for AI agent skills, finding that VirusTotal, static analysis, and NVIDIA SkillSpector flag different skills with minimal overlap. It releases a sanitized dataset of over 67,000 skill versions to support further research on layered security governance.
Cached at: 06/03/26, 07:36 AM
Source: https://huggingface.co/papers/2606.01494
Abstract
Agent skills require layered security governance due to scanner disagreement, with findings showing varying detection rates across different scanner types and attack surfaces.
Agent skills extend AI agents with reusable instructions, tools, scripts, references, and workflows, establishing a security boundary distinct from both model safety and traditional package-malware detection. ClawHub Security Signals is a sanitized dataset of 67,453 latest public OpenClaw skill versions. Each row pairs redacted SKILL.md content and sanitized bundled files where present with a final ClawScan registry verdict and evidence from three scanner families: VirusTotal, static heuristic analysis, and NVIDIA SkillSpector. Rather than estimating malicious-skill prevalence, we study scanner disagreement. The three scanners rarely flag the same skills: any pair overlaps on at most 10.4% of their combined positives, only 0.69% of skills are flagged by all three, and 81.9% of flagged skills are identified by a single scanner. The disagreement is structured by attack surface. SkillSpector, which raises semantic agentic-risk advisories rather than malware-reputation signals, is positive for 19,209 of 25,504 suspicious rows (75.3%) but only 14 of 206 malicious rows (6.8%). The malicious-verdict region shows the inverse profile: 150 of 206 malicious rows (72.8%) are VirusTotal-positive, consistent with bundled-code malware evidence. These results show that agent-skill security requires layered governance, not single-scanner allow/block decisions. The corpus is released as a sanitized silver-standard dataset: labels are the registry's automated verdicts, not human-annotated ground truth, and the release represents an early, versioned snapshot intended to support the community while a human-annotated subset is developed. Further research is encouraged, including models tailored for skill-security triage.
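The overlap figures in the abstract can be illustrated with a small sketch. It assumes that "overlap on combined positives" means the intersection of two scanners' flagged sets divided by their union (a Jaccard index); the abstract does not spell out the formula, so this reading is an assumption. The scanner keys, the function names, and the toy skill IDs are hypothetical and are not taken from the released dataset. The last lines only re-derive the percentages quoted in the abstract from the counts it reports.

```python
from itertools import combinations


def pairwise_overlap(a: set, b: set) -> float:
    """Share of two scanners' combined positives that both flag (Jaccard index)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def disagreement_summary(flags: dict, total: int) -> dict:
    """Summarize agreement between scanners.

    flags maps a scanner name to the set of skill IDs it flags;
    total is the number of skills in the corpus.
    """
    names = sorted(flags)
    flagged_any = set().union(*flags.values())
    flagged_all = set.intersection(*flags.values())
    # Skills flagged by exactly one scanner.
    single = {s for s in flagged_any
              if sum(s in f for f in flags.values()) == 1}
    return {
        "pairwise": {(x, y): pairwise_overlap(flags[x], flags[y])
                     for x, y in combinations(names, 2)},
        "all_scanners_share_of_skills": len(flagged_all) / total,
        "single_scanner_share_of_flagged":
            len(single) / len(flagged_any) if flagged_any else 0.0,
    }


# Hypothetical toy data: 10 skills, IDs 1..10 (not from the dataset).
toy_flags = {
    "virustotal": {1, 2, 3},
    "static": {3, 4},
    "skillspector": {3, 5, 6, 7},
}
summary = disagreement_summary(toy_flags, total=10)

# Re-derive the percentages quoted in the abstract from its reported counts.
reported = {
    "skillspector_positive_among_suspicious": round(100 * 19209 / 25504, 1),
    "skillspector_positive_among_malicious": round(100 * 14 / 206, 1),
    "virustotal_positive_among_malicious": round(100 * 150 / 206, 1),
}
```

In the toy data, only skill 3 is flagged by all three scanners, so six of the seven flagged skills come from a single scanner. That is the same pattern of disagreement the abstract reports at corpus scale, though the toy numbers themselves carry no empirical meaning.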
Get this paper in your agent:
hf papers read 2606.01494
Don't have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash
Datasets citing this paper: 1
OpenClaw/clawhub-security-signals
Similar Articles
@dani_avila7: NVIDIA built exactly what I needed to secure agent skills https://github.com/nvidia/skillspector… Adding it as a GitHub…
NVIDIA released SkillSpector, an open-source security scanner for AI agent skills that detects vulnerabilities like prompt injection and data exfiltration before installation.
Security for your OpenClaw agent skill before they run
SecureSkill is a tool that performs 10-layer security analysis on OpenClaw agent skills before execution, detecting threats like credential harvesting, outbound calls, and shell scripts. It produces a signed audit report mapped to OWASP, MITRE, NIST, and EU AI Act standards.
Skill Inspector
Skill Inspector is a developer tool that audits AI agent skills to help prevent malware risks.
mukul975/Anthropic-Cybersecurity-Skills
An open-source repository containing 754 structured cybersecurity skills for AI agents, covering 26 security domains and mapped to multiple industry frameworks, enabling agents to perform expert-level security analysis.
I got paranoid about OpenClaw skills injecting crap into my system prompt, so I built a quarantine pipeline with two LLMs as reviewers (93.75% detection, zero false negatives)
A developer built a quarantine pipeline using two LLM reviewers (Claude and Codex) to detect injection attacks in OpenClaw skills, achieving 93.75% detection rate with zero false negatives. The system uses a dual mandate of checklist-based pattern matching and open analysis to catch both known and novel injection techniques.