@AdamShao: Officially open-sourcing my vulnerability discovery tool: http://flounders.xyz This is an AI Agent-based fully automated vulnerability discovery workflow. You just tell the AI which project's vulnerabilities you want to find, and it will automatically download code and documentation, deeply audit the code, discover suspicious vulnerabilities, automatically verify them locally and online…
Summary
Flounder is an open-source AI agent-based tool that automates vulnerability discovery in codebases. Users describe the target and the tool autonomously downloads code, conducts deep code audits, tests vulnerabilities locally and online, and generates reports.
Cached at: 06/24/26, 10:22 AM
I’ve officially open-sourced my vulnerability discovery tool: http://flounders.xyz
This is a fully automated vulnerability discovery workflow based on AI Agent. Just tell the AI which project’s vulnerabilities you want to find, and it will automatically download the code and documentation, conduct in-depth code audits, discover suspicious vulnerabilities, automatically verify them locally and online, and finally generate a report.
Alpha leak: If you have plenty of AI tokens to spare, you can give the agent a goal to search for bounty tasks on various white-hat platforms, find vulnerabilities, and earn bounties.
Flounder — An autonomous white-hat security auditor
Source: https://flounders.xyz/
Agent-driven security audits
Install the skill once. Ask Codex, Claude Code, or another skills-aware agent to prepare the target, audit the code, run proof tests, and collect the report.
Install skill (https://flounders.xyz/#quickstart) · View on GitHub (https://github.com/adshao/flounder)
Works with Codex, Claude Code, Gemini CLI, Cursor, OpenCode, OpenHands
$ npx skills add adshao/flounder -g
Flounder skill — agent-driven audit
$ npx skills add adshao/flounder -g
◇ skill installed for Codex / Claude Code
› Audit this repository with Flounder.
◇ target authorized boundary captured
◇ agent prepared workspace · mapped scope · dug promising regions
↳ test runner returned PASS with command evidence
confirmed-executable report package ready
✓ sealed audit complete · network stayed off
Use it with an agent
Ask naturally. Flounder handles the audit contract.
The installed skill triggers from Flounder audit requests, daemon/provider setup, suspected-finding verification, real-finding confirmation, and report collection.
1. Install skill: one-time setup
2. Ask agent: Codex or Claude Code
3. Prove: sandboxed local tests
4. Report: private disclosure draft
Agent owns strategy · Flounder owns safety and evidence
01 natural language
Codex / Claude Code driver
No custom scenario pipeline
Ask for an authorized audit, verification, confirmation, or report package. The Flounder skill gives the agent the operating manual and keeps it on the workflow.
The source of truth is skills/flounder/SKILL.md, not a marketing-only prompt.
02 execution-backed
End-to-end audit system
Prep → audit → proof → report
Flounder can prepare the workspace, read source and corpus, map attack surface, dig promising regions, construct exploit paths, run proof tests, and collect reports.
The framework supplies sandboxing, command policy, durable state, gates, daemon execution, and reporting.
Local dashboard
Track audits while the agent works.
flounder ui gives operators a localhost control plane for projects, daemons, provider profiles, runs, scopes, findings, live activity, and reports.
Flounder dashboard showing an Aztec Rollup demo audit with workflow phases, scope coverage, live activity, candidates, and report-ready reproduced findings
Project view: prepare → map → dig → synthesize → verify → confirm → report, with live model activity and finding-grained report actions.
Daemon-owned execution · Live tool and model activity · Finding-grained Verify / Confirm / Report
Why Flounder
Thin framework. Strong guarantees.
Flounder is not a scanner, checklist runner, or set of hand-written bug rules. The model decides how to reason; Flounder makes the result usable.
Agent-native
Install the skill once. Codex, Claude Code, or another skills-aware agent can drive the workflow from a plain request.
Framework-agnostic
Source, corpus, and optional profiles are inputs. The audit strategy comes from the model, not a stack-specific scanner.
Execution-grounded
A finding is not real because the model says so. It must cite command evidence from a passing local proof test.
Blind then real
Discovery runs network-sealed. Reproduction can use real-world ground truth under white-hat no-broadcast rules.
Sandbox boundary
Model-written tests, PoCs, dependencies, and commands run in a copied workspace away from the host checkout.
Local control
The UI is a control plane. Audits run on a daemon, so target code and provider credentials stay on the executor host.
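The "copied workspace" idea under Sandbox boundary can be illustrated with a minimal Python sketch. It is not Flounder's implementation: the function name `run_in_copied_workspace` is hypothetical, and copying a directory alone provides no isolation from the host. The page describes an OCI backend as the default sandbox; this sketch only shows why writes made by a model-written command never reach the original checkout.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_copied_workspace(checkout: Path, command: list[str]) -> subprocess.CompletedProcess:
    """Copy `checkout` into a throwaway directory and run `command` there.

    Hypothetical helper for illustration only. It protects the checkout from
    writes, but it is NOT a security sandbox: the command still runs on the host.
    """
    with tempfile.TemporaryDirectory(prefix="audit-ws-") as tmp:
        workspace = Path(tmp) / "workspace"
        shutil.copytree(checkout, workspace)
        # Anything the command writes lands in the copy, which is removed on exit.
        return subprocess.run(command, cwd=workspace, capture_output=True, text=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        repo = Path(d) / "repo"
        repo.mkdir()
        (repo / "README.txt").write_text("target source")
        result = run_in_copied_workspace(
            repo, [sys.executable, "-c", "open('poc.txt', 'w').write('written in the copy')"]
        )
        print("exit code:", result.returncode)
        print("checkout modified:", (repo / "poc.txt").exists())
```

Running the command against a copy means a failed or destructive proof attempt can be discarded without touching the source tree the operator cares about.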
Use cases
Use Flounder when a security question needs proof.
Choose the path by what you already have: a clean target, a factual clue, a public bounty scope, local source, a suspected finding, or confirmed evidence.
Blind capability audit: Measure unaided audit ability.
Start with an authorized project, repo, package, source tree, or project link and no bug hint.
Input: target only, no incident writeup
Incident investigation: Explain a suspicious transaction or exploit clue.
Use Prepare to collect chain facts, deployed source, official material, and reproduction requirements.
Input: transaction, address, exploit link
Open-world bounty: Audit with official public context.
Let Flounder gather bounty scope, docs, deployments, provenance, and package metadata before sealed audit.
Input: public program, repo, deployment
Source-provided audit: Audit code that is already staged locally.
Provide source paths, build root, and optional corpus to enter sealed map/dig directly.
Input: source, build root, docs
Targeted follow-up: Settle one claim or region.
Verify suspected findings, dig selected scopes, confirm a run, or continue from prior project state.
Output: confirmed, refuted, or narrowed
Disclosure prep: Package only evidence-backed bugs.
Consolidate duplicates, run real-target confirmation when needed, and regenerate selected reports.
Output: reports, decisions, command evidence
Prepare target → Map scope → Dig deeply → Run proof → Collect report
Proof boundary
Execution is the promotion rule.
A candidate stays suspected until it cites a passing confirmation-eligible command. The status is a framework verdict from command evidence, not the model’s assertion.
refuted
The claim failed reproduction or skeptic review.
suspected
Credible, but no passing cited test yet.
confirmed-executable
A real local test/build runner passed.
confirmed-differential
The same exploit is blocked by its own minimal fix.
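The four statuses above can be read as a verdict computed from evidence rather than from the model's claim. The Python sketch below is one way to express that reading. The `Evidence` field names and the precedence order (refutation first, then differential, then executable) are assumptions made for illustration, not Flounder's actual data model.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    REFUTED = "refuted"
    SUSPECTED = "suspected"
    CONFIRMED_EXECUTABLE = "confirmed-executable"
    CONFIRMED_DIFFERENTIAL = "confirmed-differential"


@dataclass(frozen=True)
class Evidence:
    # Hypothetical fields chosen for this sketch.
    reproduction_failed: bool = False     # reproduction or skeptic review rejected the claim
    exploit_test_passed: bool = False     # a real local test/build runner passed
    blocked_by_minimal_fix: bool = False  # the same exploit fails once its minimal fix is applied


def verdict(evidence: Evidence) -> Status:
    """Derive a status from evidence alone; the model's assertion is not an input."""
    if evidence.reproduction_failed:
        return Status.REFUTED
    if evidence.exploit_test_passed and evidence.blocked_by_minimal_fix:
        return Status.CONFIRMED_DIFFERENTIAL
    if evidence.exploit_test_passed:
        return Status.CONFIRMED_EXECUTABLE
    # Credible, but no passing cited test yet.
    return Status.SUSPECTED


if __name__ == "__main__":
    print(verdict(Evidence()).value)
    print(verdict(Evidence(exploit_test_passed=True)).value)
```

Because `verdict` takes no free-text input, a candidate with no recorded evidence can only ever come out as suspected.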
1. Model-owned strategy: Flounder is not a stack scanner or checklist runner. Source, corpus, and optional profiles are inputs, not conclusions.
2. Sandboxed execution: Commands run in a copied workspace. The default OCI backend fails closed if the sandbox image is missing.
3. Real test runners only: Inspection commands cannot mint proof. Confirmation needs a command like cargo test, forge test, or pytest.
4. Local control: The control plane queues work; the daemon executes it. Target code and provider credentials stay on the executor host.
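The "real test runners only" rule can be sketched as a check that separates proof-capable commands from inspection commands. The allowlist below contains only the three runners named in the text. The function names and the prefix-matching approach are assumptions for illustration; Flounder's actual command policy may work differently.

```python
import shlex

# Assumed allowlist, seeded with the three runners the text names.
TEST_RUNNERS = {("cargo", "test"), ("forge", "test"), ("pytest",)}


def is_confirmation_eligible(command: str) -> bool:
    """Return True if `command` starts with a known test-runner invocation."""
    try:
        tokens = tuple(shlex.split(command))
    except ValueError:
        # Unbalanced quotes and similar parse errors are treated as ineligible.
        return False
    return any(tokens[: len(runner)] == runner for runner in TEST_RUNNERS)


def mints_proof(command: str, exit_code: int) -> bool:
    """A command counts as proof only if it is an eligible runner AND it passed."""
    return is_confirmation_eligible(command) and exit_code == 0


if __name__ == "__main__":
    for cmd in ["forge test -vvv", "cat src/Vault.sol", "cargo build"]:
        print(cmd, "->", is_confirmation_eligible(cmd))
```

Under this sketch, reading or grepping a file can inform the agent's reasoning but can never promote a finding, which matches the rule that inspection commands cannot mint proof.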
Quickstart
Install once. Ask your agent.
The Flounder skill is the product interface for Codex, Claude Code, and other skills-aware agents.
1. Install Skill
```
# add Flounder to your agent once
$ npx skills add adshao/flounder -g
```
Installs the operating manual, safety boundary, and workflow contract.
2. Ask Agent
```
# use plain language from Codex or Claude Code
› Audit this repository with Flounder.
› Verify this suspected finding with Flounder.
› Collect the execution-backed bug report package.
```
The agent handles setup, audit planning, proof runs, and report collection.
Dashboard, CLI, and REST API remain available when you want direct control.
White-hat by construction.
Flounder is for authorized auditing only: your own code or public bug-bounty scope. Discovery is network-sealed; reproduction may fork and read live networks but never broadcasts, moves funds, or writes to any live system. Exploits replay against a local fork only. Build the smallest proof needed, report privately, coordinate disclosure.
Read the security policy → (https://github.com/adshao/flounder/blob/main/SECURITY.md)
FAQ
Practical questions before you run it.
Answers for operators setting up their first agent-driven audit.
Is Flounder a local service or a cloud service?
Flounder is local-first. The dashboard and control plane run on localhost by default, and audits execute on a daemon you control. That daemon can be on your machine or another executor host you connect; Flounder does not require uploading targets to a hosted Flounder cloud.
Is Flounder open source? What license?
Yes. Flounder is open source under the GNU AGPL v3 (https://www.gnu.org/licenses/agpl-3.0.html). The repository includes the full license text.
How do I use Flounder with Codex or Claude Code?
Install the Flounder skill once, then ask a skills-aware coding agent to audit an authorized target, verify a suspected finding, confirm a real finding, or collect the final report package. The dashboard, CLI, and REST API are control surfaces; the skill is the recommended way to drive the workflow.
Is Flounder a scanner?
No. The agent owns the audit strategy and target-specific reasoning. Flounder supplies the sandbox, command policy, durable state, execution gates, daemon control plane, and report package so the agent’s work can be resumed, checked, and proven.
Will Flounder use a lot of tokens?
High-quality audits can be token-heavy. You can cap map, dig, and confirm budgets, but hard caps can stop a productive investigation. The default is unbounded: the agent stops when the work is done, and interrupted runs can resume. For serious use, plan around high-cap subscriptions such as ChatGPT Pro or Claude Max 20x, or set explicit budgets for API/pay-as-you-go usage.
Does my source code leave my machine?
Flounder keeps its database, artifacts, workspaces, and provider auth under local control, with default state under ~/.flounder. Provider credentials stay on the executor host. Your chosen model provider still receives the prompts and context your agent sends, so keep sensitive material out of scope unless that provider and account are approved for it.
What do I need to run a real audit?
Node.js 24.13 or newer on the current 24 LTS line, a skills-aware agent, the Flounder skill, a configured model provider on the daemon, and a sandbox backend. For execution-backed audits, use Docker or a Docker-compatible runtime with the Flounder sandbox image or a target-specific image. Host mode is for trusted local smoke tests.
What targets are a good fit?
Flounder fits source audits where claims can be proven locally: repositories, packages, smart contracts, Solidity/EVM projects, ZK/proof systems, suspected findings, transactions, addresses, and prior reports. It is strongest when the target has tests, forks, fixtures, or harnesses that can turn a vulnerability claim into command evidence.
Is it safe to run model-written exploit code?
Model-written files and commands run in a copied workspace. The default OCI sandbox fails closed if the sandbox image is missing, instead of silently falling back to the host. Use host execution only when you explicitly trust the target and the command environment.
Can Flounder be used on live targets?
Only with authorization. Discovery stays sealed and local. Confirmation may fetch, search, fork, or read real-world ground truth, but it must never broadcast, move funds, submit writes, persist access, or go outside the approved scope.
Give your agent an authorized target.
Flounder turns the request into a sandboxed, evidence-gated audit workflow.
$ npx skills add adshao/flounder -g