@rauchg: Mythos / Sol cybersecurity capabilities are equally useful in an offensive as well as a defensive capacity. If adversaries…
Summary
deepsec is an agent-powered vulnerability scanner that uses frontier AI models to review large codebases, finding hard-to-detect security issues. It can be run locally or on your own infrastructure and supports parallel scanning, resume capability, and customizable matchers.
Cached at: 06/28/26, 05:56 AM
Mythos / Sol cybersecurity capabilities are equally useful in an offensive as well as a defensive capacity.
If adversaries get ahold of an equivalent offensive capability, it poses a serious threat to US companies that remain unaware of latent vulnerabilities.
In the meantime, I strongly recommend running deepsec[1] or similar harnesses with the available frontier models.
[1] https://github.com/vercel-labs/deepsec…
vercel-labs/deepsec
Source: https://github.com/vercel-labs/deepsec
deepsec
deepsec is an agent-powered vulnerability scanner that you can run in your own infrastructure, optimized to perform on-demand review of all code in existing
large-scale repos.
deepsec is designed to surface hard-to-find issues that have been lurking in applications for a long time. It is configured to use the best models at maximum thinking levels, so scans of large codebases can cost thousands or even tens of thousands of dollars. Our customers have found the cost worth it for how quickly they were able to patch vulnerabilities that would otherwise have gone unfixed.
For large codebases, work fans out across worker machines in parallel. If a run is interrupted or errors out partway through, just re-run the same command — deepsec picks up where it left off, skipping files it already analyzed and only investigating the rest.
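The resume behavior described above can be pictured as a simple set difference. This is a minimal sketch under stated assumptions, not deepsec's actual implementation: the `AnalysisRecord` type and `remainingFiles` function are hypothetical names invented for illustration.

```typescript
// Hypothetical sketch: not deepsec's actual implementation.
// A file counts as "done" once its analysis has been recorded.
type AnalysisRecord = { path: string; analyzed: boolean };

// Return the files that still need investigation on a re-run.
function remainingFiles(allFiles: string[], records: AnalysisRecord[]): string[] {
  const done = new Set(records.filter((r) => r.analyzed).map((r) => r.path));
  return allFiles.filter((file) => !done.has(file));
}

// The first run was interrupted after finishing two of four files.
const records: AnalysisRecord[] = [
  { path: "src/auth.ts", analyzed: true },
  { path: "src/db.ts", analyzed: true },
  { path: "src/api.ts", analyzed: false },
];
const todo = remainingFiles(
  ["src/auth.ts", "src/db.ts", "src/api.ts", "src/upload.ts"],
  records,
);
console.log(todo); // [ 'src/api.ts', 'src/upload.ts' ]
```

Because the skip decision depends only on recorded state, re-running the same command is idempotent: finished files are never paid for twice.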
Get started
Navigate to the root of the repository that you want to scan, then:
npx deepsec init # creates .deepsec/ with this repo as the first project
cd .deepsec
pnpm install # installs deepsec from npm
# Proceed as instructed by `init` output
Now have your coding agent bootstrap your installation. Open the agent of choice and prompt:
Read .deepsec/node_modules/deepsec/SKILL.md to understand the tool. Then read .deepsec/data/<id>/SETUP.md and follow it: skim this repo’s README, any AGENTS.md/CLAUDE.md, and a handful of representative code files, then replace each section of .deepsec/data/<id>/INFO.md. Keep it SHORT: target 50–100 lines total. Pick 3–5 examples per section, not exhaustive enumeration. Name primitives (auth helpers, middleware) but no line numbers. Skip generic CWE categories, since built-in matchers cover those. Cover only what’s project-specific. INFO.md is injected into every scan batch; verbose context dilutes signal.
Then scan from inside .deepsec/:
pnpm deepsec scan
pnpm deepsec process
pnpm deepsec revalidate # optional, cuts FP rate
pnpm deepsec export --format md-dir --out ./findings
If you feel deepsec should look at more parts of the code, give your coding agent the writing matchers doc (docs/writing-matchers.md) so it can find more valuable starting points in your codebase.
Docs
- docs/getting-started.md: first-scan walkthrough
- docs/reviewing-changes.md: `process --diff` for PR review and CI gating
- docs/supported-tech.md: frameworks and ecosystems deepsec recognizes out of the box
- docs/writing-matchers.md: prompt your coding agent to grow your matcher set
- docs/configuration.md: `deepsec.config.ts` reference
- docs/plugins.md: plugin authoring
- docs/models.md: model selection, defaults, refusals, future models
- docs/vercel-setup.md: AI Gateway + Vercel Sandbox keys / tokens
- docs/architecture.md: pipeline internals
- docs/data-layout.md: `data/` schemas (FileRecord, RunMeta, …)
- docs/faq.md: cost, model choice, sandbox mode, FP rate
- samples/: copy-paste starting points (currently: `webapp/`)
- CONTRIBUTING.md: repo layout, dev workflow
AI provider
When running locally, deepsec falls back to your existing claude /
codex subscription if you’ve logged in on this machine. Subscriptions
(Claude Pro/Max, ChatGPT Plus) are useful for evaluating deepsec but
generally don’t have enough headroom for full repo scans.
For real scans, use Vercel AI Gateway. One key covers both Claude and Codex, and the gateway’s default quotas are sized for highly concurrent research.
AI_GATEWAY_API_KEY=vck_...
See docs/vercel-setup.md for getting a key and
for the Vercel Sandbox setup. To bypass the gateway, set
ANTHROPIC_AUTH_TOKEN + ANTHROPIC_BASE_URL (or the OpenAI pair)
explicitly. Explicit values always win over the AI_GATEWAY_API_KEY
expansion.
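The precedence rule above ("explicit values always win") could look roughly like the following. This is a sketch, not deepsec's real resolution code: `resolveAnthropicCreds`, the `Env` type, and the placeholder gateway URL are all assumptions, and the field-by-field override is one plausible reading of the rule.

```typescript
// Hypothetical sketch of "explicit values win"; not deepsec's real code.
type Env = Record<string, string | undefined>;
type AnthropicCreds = { authToken: string; baseUrl: string };

// Placeholder only: the real gateway endpoint is not stated in this README.
const ASSUMED_GATEWAY_BASE_URL = "https://gateway.invalid/anthropic";

function resolveAnthropicCreds(env: Env): AnthropicCreds | undefined {
  // Defaults derived from the gateway key, if one is set.
  const gatewayKey = env["AI_GATEWAY_API_KEY"];
  const fromGateway = gatewayKey
    ? { authToken: gatewayKey, baseUrl: ASSUMED_GATEWAY_BASE_URL }
    : undefined;

  // Explicit variables override the expansion, field by field (an assumption).
  const authToken = env["ANTHROPIC_AUTH_TOKEN"] ?? fromGateway?.authToken;
  const baseUrl = env["ANTHROPIC_BASE_URL"] ?? fromGateway?.baseUrl;
  return authToken && baseUrl ? { authToken, baseUrl } : undefined;
}

const gatewayOnly = resolveAnthropicCreds({ AI_GATEWAY_API_KEY: "vck_example" });
const explicitCreds = resolveAnthropicCreds({
  AI_GATEWAY_API_KEY: "vck_example",
  ANTHROPIC_AUTH_TOKEN: "explicit-token",
  ANTHROPIC_BASE_URL: "https://proxy.invalid",
});
console.log(gatewayOnly?.authToken); // vck_example
console.log(explicitCreds?.authToken); // explicit-token
```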
If a process or revalidate run halts because the upstream credential
ran out of quota or credits, deepsec stops gracefully and tells you
where to top up. Re-run the same command afterward and it picks up
where it left off.
Distributed execution (optional)
Large monorepos can fan work across Vercel Sandbox microVMs:
pnpm deepsec sandbox process --project-id my-app --sandboxes 10 --concurrency 4
Needs a Vercel account. The local working tree is tarballed and
uploaded; .git is excluded. Both OIDC tokens (local) and access
tokens (CI) are supported — see
docs/vercel-setup.md.
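To get a feel for what fanning out means, here is a minimal sketch of splitting a file list across sandboxes. Round-robin sharding is only an illustrative strategy, and `shardRoundRobin` is a hypothetical helper; deepsec's real scheduler may distribute work differently. Treating `--concurrency` as a per-sandbox limit is likewise an assumption.

```typescript
// Hypothetical sketch: one way to shard files; deepsec's scheduler may differ.
function shardRoundRobin<T>(items: T[], shardCount: number): T[][] {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new RangeError("shardCount must be a positive integer");
  }
  const shards: T[][] = Array.from({ length: shardCount }, () => []);
  items.forEach((item, index) => {
    shards[index % shardCount].push(item);
  });
  return shards;
}

// 25 files spread over 10 sandboxes: five shards of 3, then five shards of 2.
const files = Array.from({ length: 25 }, (_, i) => `src/file-${i}.ts`);
const shards = shardRoundRobin(files, 10);
console.log(shards.map((s) => s.length));

// Assumption: --concurrency applies per sandbox, so peak parallelism is the product.
const peakParallel = 10 * 4;
console.log(peakParallel); // 40
```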
Security model of deepsec itself
Treat deepsec like a coding agent with full shell access to the environment that it is
running in. It is designed to run on trusted inputs (your source code), but you may still
be concerned about prompt injection through external dependencies or vendored code.
Running in a sandbox (see above) limits the potential exposure substantially:
- The API keys for the coding agents are injected outside of the sandbox and hence cannot be exfiltrated
- For the worker sandboxes, network egress is limited to coding agent hosts (egress is allowed during the bootstrap process, but that phase does not run the coding agent)
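The egress restriction in the second bullet amounts to a host allowlist. The sketch below only illustrates that idea; it is not how Vercel Sandbox enforces the policy. `isEgressAllowed` and the host names are hypothetical placeholders.

```typescript
// Hypothetical sketch of an egress allowlist; host names are placeholders.
function isEgressAllowed(url: string, allowedHosts: ReadonlySet<string>): boolean {
  let host: string;
  try {
    host = new URL(url).hostname;
  } catch {
    return false; // unparseable targets are denied
  }
  return allowedHosts.has(host);
}

const allowedHosts = new Set(["agent-api.invalid"]);
console.log(isEgressAllowed("https://agent-api.invalid/v1/messages", allowedHosts)); // true
console.log(isEgressAllowed("https://attacker.invalid/exfil", allowedHosts)); // false
```

With a policy of this shape, a prompt-injected agent cannot send data to a host outside the allowlist, which is why the exposure is limited even if vendored code contains hostile instructions.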
Workflow reference
| Command | What it does |
|---|---|
| `scan` | Find candidate sites with regex matchers (fast, no AI) |
| `process` | AI investigation; emits findings + recommendation |
| `process --diff` | PR-mode: scan + investigate only files changed in a diff |
| `triage` | Lightweight P0/P1/P2 classification (cheaper model) |
| `revalidate` | Re-check existing findings; checks git history for fixes |
| `enrich` | Add git committer info + (with a plugin) ownership data |
| `report` | Markdown + JSON summary for one project |
| `export` | Per-finding JSON or directory of markdown files |
| `metrics` | Cross-project counts: severities, vulns by type, TPs |
| `status` | Snapshot of the project mirror |
| `sandbox <cmd>` | Run any of the above on Vercel Sandbox microVMs |
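The `scan` step finds candidate sites with regex matchers before any AI is involved. A minimal sketch of that idea follows; the `Matcher` shape, the `findCandidates` function, and both example patterns are hypothetical, and the real matcher format is described in docs/writing-matchers.md.

```typescript
// Hypothetical matcher shape; see docs/writing-matchers.md for the real format.
type Matcher = { id: string; pattern: RegExp };
type Candidate = { matcherId: string; file: string; line: number; text: string };

function findCandidates(file: string, source: string, matchers: Matcher[]): Candidate[] {
  const candidates: Candidate[] = [];
  source.split("\n").forEach((text, index) => {
    for (const matcher of matchers) {
      // Patterns are used without the g flag, so test() keeps no state between calls.
      if (matcher.pattern.test(text)) {
        candidates.push({ matcherId: matcher.id, file, line: index + 1, text: text.trim() });
      }
    }
  });
  return candidates;
}

const matchers: Matcher[] = [
  { id: "raw-sql-concat", pattern: /query\(\s*["'`].*\$\{/ },
  { id: "child-process-exec", pattern: /\bexec\(/ },
];
const source = [
  'import { exec } from "node:child_process";',
  "db.query(`SELECT * FROM users WHERE id = ${id}`);",
  "exec(userInput);",
].join("\n");

const hits = findCandidates("src/handler.ts", source, matchers);
console.log(hits.map((h) => `${h.matcherId}@${h.line}`));
// [ 'raw-sql-concat@2', 'child-process-exec@3' ]
```

A cheap pass like this only narrows down where to look; deciding whether a candidate is a real vulnerability is left to the `process` step.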
License
Apache 2.0. See LICENSE and NOTICE.