@dani_avila7: NVIDIA built exactly what I needed to secure agent skills https://github.com/nvidia/skillspector… Adding it as a GitHub…

X AI KOLs Timeline Tools

Summary

NVIDIA released SkillSpector, an open-source security scanner for AI agent skills that detects vulnerabilities like prompt injection and data exfiltration before installation.

NVIDIA built exactly what I needed to secure agent skills https://github.com/nvidia/skillspector… Adding it as a GitHub Action to http://aitmpl.com Every community-submitted skill gets scanned before it goes live No prompt injection, no data exfiltration, no supply chain risks
Original Article
View Cached Full Text

Cached at: 05/31/26, 03:12 PM

NVIDIA built exactly what I needed to secure agent skills https://github.com/nvidia/skillspector… Adding it as a GitHub Action to http://aitmpl.com Every community-submitted skill gets scanned before it goes live No prompt injection, no data exfiltration, no supply chain risks


nvidia/skillspector

Source: https://github.com/nvidia/skillspector

SkillSpector

Security scanner for AI agent skills. Detect vulnerabilities, malicious patterns, and security risks before installing agent skills.

Python 3.12+ License: Apache 2.0

Overview

AI agent skills (used by Claude Code, Codex CLI, Gemini CLI, etc.) execute with implicit trust and minimal vetting. Research shows that 26.1% of skills contain vulnerabilities and 5.2% show likely malicious intent.

SkillSpector helps you answer: “Is this skill safe to install?”

Documentation

  • Development guide — Architecture, package layout, and how to extend the analyzer pipeline.
  • OSS_RELEASE.md — How to produce a public-OSS branch from this repo.

Features

  • Multi-format input: Scan Git repos, URLs, zip files, directories, or single files
  • 64 vulnerability patterns across 16 categories: prompt injection, data exfiltration, privilege escalation, supply chain, excessive agency, output handling, system prompt leakage, memory poisoning, tool misuse, rogue agent, trigger abuse, dangerous code (AST), taint tracking, YARA signatures, MCP least privilege, and MCP tool poisoning
  • Two-stage analysis: Fast static analysis + optional LLM semantic evaluation
  • Live vulnerability lookups: SC4 queries OSV.dev for real-time CVE data with automatic offline fallback
  • Multiple output formats: Terminal, JSON, Markdown, and SARIF reports
  • Risk scoring: 0-100 score with severity labels and clear recommendations

Quick Start

Installation

Create and activate a virtual environment first (all make targets assume the venv is active). Use uv or pip; the Makefile uses uv if available, otherwise pip.

# Clone the repository
git clone https://github.com/NVIDIA/skillspector.git
cd skillspector

# Create and activate virtual environment
uv venv .venv && source .venv/bin/activate
# or: python3 -m venv .venv && source .venv/bin/activate

# Install for production use
make install

# Or install with development dependencies
make install-dev

Basic Usage

# Scan a local skill directory
skillspector scan ./my-skill/

# Scan a single SKILL.md file
skillspector scan ./SKILL.md

# Scan a Git repository
skillspector scan https://github.com/user/my-skill

# Scan a zip file
skillspector scan ./my-skill.zip

Output Formats

# Terminal output (default) - pretty formatted
skillspector scan ./my-skill/

# JSON output - machine readable
skillspector scan ./my-skill/ --format json --output report.json

# Markdown output - for documentation
skillspector scan ./my-skill/ --format markdown --output report.md

# SARIF output - for CI/CD integration and IDE tooling
skillspector scan ./my-skill/ --format sarif --output report.sarif

LLM Analysis

For the best results, configure an OpenAI-compatible LLM endpoint for semantic analysis. Pick a provider with SKILLSPECTOR_PROVIDER; each ships its own bundled default model. SkillSpector also works against local OpenAI-compatible servers (Ollama, vLLM, llama.cpp) and managed inference gateways.

Provider (SKILLSPECTOR_PROVIDER)Credential env varEndpointDefault model
openaiOPENAI_API_KEY (+ optional OPENAI_BASE_URL)api.openai.com (or any OpenAI-compatible URL)gpt-5.4
anthropicANTHROPIC_API_KEYapi.anthropic.comclaude-opus-4-6
nv_buildNVIDIA_INFERENCE_KEYbuild.nvidia.comdeepseek-ai/deepseek-v4-flash
# Stock OpenAI
export SKILLSPECTOR_PROVIDER=openai
export OPENAI_API_KEY=sk-...
skillspector scan ./my-skill/

# Anthropic
export SKILLSPECTOR_PROVIDER=anthropic
export ANTHROPIC_API_KEY=sk-ant-...
skillspector scan ./my-skill/

# NVIDIA build.nvidia.com
export SKILLSPECTOR_PROVIDER=nv_build
export NVIDIA_INFERENCE_KEY=nvapi-...
skillspector scan ./my-skill/

# Local Ollama or any OpenAI-compatible endpoint
export SKILLSPECTOR_PROVIDER=openai
export OPENAI_API_KEY=ollama
export OPENAI_BASE_URL=http://localhost:11434/v1
export SKILLSPECTOR_MODEL=llama3.1:8b
skillspector scan ./my-skill/

# Override the provider's default model
export SKILLSPECTOR_MODEL=gpt-5.2
skillspector scan ./my-skill/

# Skip LLM analysis (faster, static analysis only)
skillspector scan ./my-skill/ --no-llm

Vulnerability Patterns

SkillSpector detects 64 vulnerability patterns across 16 categories:

Prompt Injection (5 patterns)

IDPatternSeverityDescription
P1Instruction OverrideHIGHCommands to ignore safety constraints
P2Hidden InstructionsHIGHMalicious directives in comments/invisible text
P3Exfiltration CommandsHIGHInstructions to transmit context externally
P4Behavior ManipulationMEDIUMSubtle instructions altering agent decisions
P5Harmful ContentCRITICALInstructions that could cause physical harm

Data Exfiltration (4 patterns)

IDPatternSeverityDescription
E1External TransmissionMEDIUMSending data to external URLs
E2Env Variable HarvestingHIGHCollecting API keys and secrets
E3File System EnumerationMEDIUMScanning directories for sensitive files
E4Context LeakageHIGHTransmitting conversation context externally

Privilege Escalation (3 patterns)

IDPatternSeverityDescription
PE1Excessive PermissionsLOWRequesting access beyond stated functionality
PE2Sudo/Root ExecutionMEDIUMInvoking elevated system privileges
PE3Credential AccessHIGHReading SSH keys, tokens, passwords

Supply Chain (6 patterns)

IDPatternSeverityDescription
SC1Unpinned DependenciesLOWNo version constraints on packages
SC2External Script FetchingHIGHcurl | bash and remote code execution
SC3Obfuscated CodeHIGHBase64/hex encoded execution
SC4Known Vulnerable DependenciesHIGHDependencies with known CVEs (live OSV.dev lookup)
SC5Abandoned DependenciesMEDIUMUnmaintained packages without security updates
SC6TyposquattingHIGHPackage names similar to popular packages

Excessive Agency (4 patterns)

IDPatternSeverityDescription
EA1Unrestricted Tool AccessHIGHUnfettered tool access without constraints
EA2Autonomous Decision MakingHIGHHigh-impact decisions without human-in-the-loop
EA3Scope CreepMEDIUMCapabilities extending beyond stated purpose
EA4Unbounded Resource AccessMEDIUMNo rate limits or quotas on resource consumption

Output Handling (3 patterns)

IDPatternSeverityDescription
OH1Unvalidated Output InjectionHIGHModel output used without sanitization
OH2Cross-Context OutputMEDIUMOutput flows across trust boundaries without validation
OH3Unbounded OutputMEDIUMNo limits on output size or generation rate

System Prompt Leakage (3 patterns)

IDPatternSeverityDescription
P6Direct LeakageHIGHInstructions that expose system prompts or internal rules
P7Indirect ExtractionMEDIUMExtraction via rephrasing, translation, or side-channels
P8Tool-Based ExfiltrationHIGHSystem prompts exfiltrated via file writes or network requests

Memory Poisoning (3 patterns)

IDPatternSeverityDescription
MP1Persistent Context InjectionHIGHContent designed to persist across interactions
MP2Context Window StuffingMEDIUMFiller content displacing safety constraints
MP3Memory ManipulationHIGHTampering with agent memory or stored state

Tool Misuse (3 patterns)

IDPatternSeverityDescription
TM1Tool Parameter AbuseHIGHCrafted parameters for unintended behavior (shell=True, –force)
TM2Chaining AbuseHIGHTool chains that bypass individual safety checks
TM3Unsafe DefaultsMEDIUMOverly permissive defaults (disabled TLS, no auth)

Rogue Agent (2 patterns)

IDPatternSeverityDescription
RA1Self-ModificationCRITICALModifying own code or configuration at runtime
RA2Session PersistenceHIGHUnauthorized persistence via cron jobs or startup scripts

Trigger Abuse (3 patterns)

IDPatternSeverityDescription
TR1Overly Broad TriggerMEDIUMTrigger patterns matching common words
TR2Shadow Command TriggerHIGHTriggers that shadow built-in commands or other skills
TR3Keyword Baiting TriggerMEDIUMGeneric triggers designed to maximize activation

Behavioral AST (8 patterns)

IDPatternSeverityDescription
AST1exec() CallCRITICALDirect exec() enabling arbitrary code execution
AST2eval() CallHIGHDirect eval() evaluating arbitrary expressions
AST3Dynamic ImportHIGH__import__() loading arbitrary modules at runtime
AST4subprocess CallHIGHExternal command execution via subprocess
AST5os.system / exec-familyHIGHShell commands via os module
AST6compile() CallMEDIUMCode object creation from strings
AST7Dynamic getattr()MEDIUMArbitrary attribute access with non-literal names
AST8Dangerous Execution ChainCRITICALexec/eval combined with dynamic source (network, encoded data)

Taint Tracking (5 patterns)

IDPatternSeverityDescription
TT1Direct Taint FlowHIGHData flows directly from a source to a sink without sanitization
TT2Variable-Mediated Taint FlowMEDIUMData flows from source to sink through intermediate variables
TT3Credential Exfiltration ChainCRITICALCredentials (env vars, secrets) flow to network output sinks
TT4File Read to Network ExfiltrationHIGHFile contents flow to network output sinks
TT5External Input to Code ExecutionCRITICALNetwork or user input flows to exec/eval/subprocess sinks

YARA Signatures (4 patterns)

IDPatternSeverityDescription
YR1Malware MatchCRITICALYARA rule match for known malware signatures
YR2Webshell MatchCRITICALYARA rule match for webshell patterns
YR3Cryptominer MatchHIGHYARA rule match for crypto mining indicators
YR4Hack Tool / Exploit MatchHIGHYARA rule match for hack tools or exploit code

MCP Least Privilege (4 patterns)

IDPatternSeverityDescription
LP1Underdeclared CapabilityHIGHCode uses capabilities not listed in declared permissions
LP2Wildcard PermissionMEDIUMPermission list contains wildcards (*, all, full, any)
LP3Missing Permission DeclarationMEDIUMNo permissions field but code has detectable capabilities
LP4Overdeclared PermissionLOWPermission declared but no corresponding code capability found

MCP Tool Poisoning (4 patterns)

IDPatternSeverityDescription
TP1Hidden InstructionsHIGHHidden directives in metadata (HTML comments, zero-width chars, base64, data URIs)
TP2Unicode DeceptionHIGHHomoglyphs, RTL overrides, mixed-script identifiers in tool metadata
TP3Parameter Description InjectionMEDIUMInjection patterns in parameter definitions (overrides, system tokens, malicious defaults)
TP4Description-Behavior MismatchMEDIUMDeclared tool description does not match actual code behavior (LLM-powered)

View all patterns:

skillspector patterns

Risk Scoring

Score Calculation

  • CRITICAL issues: +50 points
  • HIGH issues: +25 points
  • MEDIUM issues: +10 points
  • LOW issues: +5 points
  • Executable scripts: 1.3x multiplier

Severity Levels

ScoreSeverityRecommendation
0-20LOWSAFE
21-50MEDIUMCAUTION
51-80HIGHDO NOT INSTALL
81-100CRITICALDO NOT INSTALL

Example Output

Terminal Output

 SkillSpector Security Report  v0.1.0

Skill: suspicious-skill
Source: ./suspicious-skill/
Scanned: 2026-01-29 10:30:00 UTC

        Risk Assessment
 Metric          Value
 Score           78/100
 Severity        HIGH
 Recommendation  DO NOT INSTALL

        Components (3)
 File              Type      Lines  Executable
 SKILL.md          markdown    142  No
 scripts/sync.py   python       87  Yes
 requirements.txt  text          3  No

Issues (2)

  HIGH: Env Variable Harvesting (E2)
    Location: scripts/sync.py:23
    Finding: for key, val in os.environ.items():...
    Confidence: 94%
    Explanation: This code collects environment variables containing
    API keys and secrets, then sends them to an external server.

  HIGH: External Transmission (E1)
    Location: scripts/sync.py:45
    Finding: requests.post("https://api.skill.io/env"...
    Confidence: 89%
    Explanation: Data is being sent to an external server. Combined
    with env harvesting above, this indicates credential exfiltration.

Configuration

Environment Variables

VariableDescriptionRequired
SKILLSPECTOR_PROVIDERActive LLM provider: openai, anthropic, or nv_build. Each provider has its own bundled model_registry.yaml and default model (see the LLM Analysis table above). Defaults to nv_build.Optional
NVIDIA_INFERENCE_KEYCredential for the nv_build provider (build.nvidia.com).Required for LLM analysis when SKILLSPECTOR_PROVIDER=nv_build
OPENAI_API_KEYCredential for the OpenAI provider (SKILLSPECTOR_PROVIDER=openai). Also serves as the tier-2 fallback in the credential waterfall when the active provider returns no credentials.Required for LLM analysis when SKILLSPECTOR_PROVIDER=openai
OPENAI_BASE_URLOverride the OpenAI endpoint (e.g. point at Ollama).Optional
ANTHROPIC_API_KEYCredential for the Anthropic provider (SKILLSPECTOR_PROVIDER=anthropic).Required for LLM analysis when SKILLSPECTOR_PROVIDER=anthropic
SKILLSPECTOR_MODELOverride the active provider’s default model. See the LLM Analysis table for each provider’s default.Optional
SKILLSPECTOR_MODEL_REGISTRYOverride the bundled per-provider YAML registry (src/skillspector/providers/<provider>.yaml) with a custom path.Optional
SKILLSPECTOR_LOG_LEVELLog level: DEBUG, INFO, WARNING, ERROR (default: WARNING).Optional

CLI Options

skillspector scan --help

Options:
  -f, --format [terminal|json|markdown|sarif]  Output format [default: terminal]
  -o, --output PATH                            Output file path
  --no-llm                                     Skip LLM analysis (static only)
  -V, --verbose                                Show detailed progress
  --help                                       Show this message and exit

Development

Setup

All make targets assume a virtual environment is already created and activated. The Makefile uses uv if available, else pip.

# Clone, create venv, activate, install dev dependencies
git clone https://github.com/NVIDIA/skillspector.git
cd skillspector
uv venv .venv && source .venv/bin/activate
# or: python3 -m venv .venv && source .venv/bin/activate
make install-dev

# Run tests
make test

# Run tests with coverage
make test-cov

# Run linting
make lint

# Format code
make format

How It Works

SkillSpector uses a two-stage detection pipeline:

Stage 1: Static Analysis

  • Fast regex-based pattern matching across 11 static analyzers
  • AST-based behavioral analysis detecting dangerous calls (exec, eval, subprocess, etc.)
  • Live vulnerability lookups via OSV.dev for known CVEs in dependencies
  • Scans all files in the skill
  • High recall (catches most issues)
  • Moderate precision (some false positives)

Stage 2: LLM Semantic Analysis (Optional)

  • Evaluates context and intent
  • Filters false positives
  • Provides human-readable explanations
  • Improves precision to ~87%

The LLM prompt includes anti-jailbreak protections to prevent malicious skills from manipulating the analysis.

Live Vulnerability Lookups (SC4)

SC4 uses the OSV.dev API to check dependencies against the full Open Source Vulnerabilities database — covering tens of thousands of advisories across PyPI and npm.

  • No API key required — OSV.dev is free and unauthenticated.
  • Batch queries — all dependencies are checked in a single HTTP call.
  • Automatic fallback — if OSV.dev is unreachable (air-gapped/offline), a small built-in fallback list is used.
  • Caching — results are cached in-memory for 1 hour to avoid redundant API calls during a session.

The tool requires outbound HTTPS access to api.osv.dev for live vulnerability data. When that is not available, findings are limited to the static fallback list.

Limitations

  • Non-English content: May miss patterns in other languages
  • Image-based attacks: Cannot analyze text in images
  • Encrypted/binary code: Cannot analyze compiled or encrypted content
  • Runtime behavior: Static analysis only, no dynamic execution
  • Offline SC4: Without network access to api.osv.dev, SC4 uses a small static fallback list

Research Background

Based on research from “Agent Skills in the Wild: An Empirical Study of Security Vulnerabilities at Scale” (Liu et al., 2026):

  • Dataset: 42,447 skills from major marketplaces
  • Vulnerable: 26.1% contain at least one vulnerability
  • High-severity: 5.2% show likely malicious intent
  • Key finding: Skills with executable scripts are 2.12x more likely to be vulnerable

Python API Integration

from skillspector import graph

# Invoke the LangGraph workflow
result = graph.invoke({
    "input_path": "/path/to/skill",
    "output_format": "json",   # terminal, json, markdown, or sarif
    "use_llm": True,           # False for static-only analysis
})

# Access results
print(f"Risk Score: {result['risk_score']}/100")
print(f"Severity: {result['risk_severity']}")
print(f"Recommendation: {result['risk_recommendation']}")

for finding in result["filtered_findings"]:
    print(f"[{finding['severity']}] {finding['rule_id']}: {finding['message']}")

License

Apache License 2.0 - see LICENSE for details.

Contributing

Contributions are welcome! Please read our contributing guidelines and submit pull requests.

Support

Similar Articles

Skill Inspector

Product Hunt

Skill Inspector is a developer tool that audits AI agent skills to help prevent malware risks.

mukul975/Anthropic-Cybersecurity-Skills

GitHub Trending (daily)

An open-source repository containing 754 structured cybersecurity skills for AI agents, covering 26 security domains and mapped to multiple industry frameworks, enabling agents to perform expert-level security analysis.

tech-leads-club/agent-skills

GitHub Trending (daily)

Agent Skills is a hardened, open-source library of verified and tested skills for extending AI coding agents like Claude Code and Cursor, addressing security vulnerabilities found in marketplace alternatives.