@AdamShao: Officially open-sourcing my vulnerability discovery tool: http://flounders.xyz This is an AI Agent-based fully automated vulnerability discovery workflow. You just tell the AI which project's vulnerabilities you want to find, and it will automatically download code and documentation, deeply audit the code, discover suspicious vulnerabilities, automatically verify them locally and online…

X AI KOLs Timeline 06/24/26, 03:42 AM Tools

open-source security-tool ai-agent vulnerability-discovery white-hat automated-audit

Summary

Flounder is an open-source AI agent-based tool that automates vulnerability discovery in codebases. Users describe the target and the tool autonomously downloads code, conducts deep code audits, tests vulnerabilities locally and online, and generates reports.

Officially open-sourcing my vulnerability discovery tool: http://flounders.xyz This is an AI Agent-based fully automated vulnerability discovery workflow. You just tell the AI which project's vulnerabilities you want to find, and it will automatically download code and documentation, deeply audit the code, discover suspicious vulnerabilities, automatically verify them locally and online, and finally generate a report. Alpha leak: If you have many AI tokens that you don't know what to do with, you can give the agent a goal to search for bounty tasks on major white-hat platforms, find vulnerabilities, and earn bounties.

Original Article


Cached at: 06/24/26, 10:22 AM

I’ve officially open-sourced my vulnerability discovery tool: http://flounders.xyz

This is a fully automated vulnerability discovery workflow built on an AI agent. Tell the AI which project you want to audit, and it will automatically download the code and documentation, audit the code in depth, flag suspected vulnerabilities, verify them locally and online, and finally generate a report.

Alpha leak: If you have plenty of AI tokens to spare, you can give the agent a goal to search for bounty tasks on various white-hat platforms, find vulnerabilities, and earn bounties.

Flounder — An autonomous white-hat security auditor

Source: https://flounders.xyz/

Agent-driven security audits

Install the skill once. Ask Codex, Claude Code, or another skills-aware agent to prepare the target, audit the code, run proof tests, and collect the report.

Install skill (https://flounders.xyz/#quickstart) · View on GitHub (https://github.com/adshao/flounder)

Works with Codex, Claude Code, Gemini CLI, Cursor, OpenCode, and OpenHands

$ npx skills add adshao/flounder -g

Flounder skill — agent-driven audit

$ npx skills add adshao/flounder -g

◇ skill installed for Codex / Claude Code

› Audit this repository with Flounder.

◇ target authorized boundary captured

◇ agent prepared workspace · mapped scope · dug promising regions

↳ test runner returned PASS with command evidence

confirmed-executable · report package ready

✓ sealed audit complete · network stayed off

Use it with an agent

Ask naturally. Flounder handles the audit contract.

The installed skill triggers from Flounder audit requests, daemon/provider setup, suspected-finding verification, real-finding confirmation, and report collection.

1. install skill: one-time setup

2. ask agent: Codex or Claude Code

3. prove: sandboxed local tests

4. report: private disclosure draft

agent owns strategy · Flounder owns safety and evidence

01 natural language

Codex / Claude Code driver

No custom scenario pipeline

Ask for an authorized audit, verification, confirmation, or report package. The Flounder skill gives the agent the operating manual and keeps it on the workflow.

The source of truth is `skills/flounder/SKILL.md`, not a marketing-only prompt.

02 execution-backed

End-to-end audit system

Prep → audit → proof → report

Flounder can prepare the workspace, read source and corpus, map attack surface, dig promising regions, construct exploit paths, run proof tests, and collect reports.

The framework supplies sandboxing, command policy, durable state, gates, daemon execution, and reporting.

Local dashboard

Track audits while the agent works.

`flounder ui` gives operators a localhost control plane for projects, daemons, provider profiles, runs, scopes, findings, live activity, and reports.

Flounder dashboard showing an Aztec Rollup demo audit with workflow phases, scope coverage, live activity, candidates, and report-ready reproduced findings

Project view: prepare → map → dig → synthesize → verify → confirm → report, with live model activity and finding-grained report actions.

Daemon-owned execution · Live tool and model activity · Finding-grained Verify / Confirm / Report

Why Flounder

Thin framework. Strong guarantees.

Flounder is not a scanner, checklist runner, or set of hand-written bug rules. The model decides how to reason; Flounder makes the result usable.

Agent-native

Install the skill once. Codex, Claude Code, or another skills-aware agent can drive the workflow from a plain request.

Framework-agnostic

Source, corpus, and optional profiles are inputs. The audit strategy comes from the model, not a stack-specific scanner.

Execution-grounded

A finding is not real because the model says so. It must cite command evidence from a passing local proof test.

Blind then real

Discovery runs network-sealed. Reproduction can use real-world ground truth under white-hat no-broadcast rules.

Sandbox boundary

Model-written tests, PoCs, dependencies, and commands run in a copied workspace away from the host checkout.

Local control

The UI is a control plane. Audits run on a daemon, so target code and provider credentials stay on the executor host.
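The "copied workspace" idea from the Sandbox boundary card can be illustrated with a minimal Python sketch. The function name `run_in_copied_workspace` and the demo file are hypothetical and are not part of Flounder. Copying alone provides no real isolation; according to the page, Flounder's default backend is an OCI sandbox. The sketch only shows why a command that modifies files leaves the host checkout untouched when it runs against a copy.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_copied_workspace(source_dir: Path, command: list[str]) -> subprocess.CompletedProcess:
    """Copy source_dir into a throwaway directory and run command inside the copy."""
    with tempfile.TemporaryDirectory(prefix="audit-ws-") as tmp:
        workspace = Path(tmp) / "workspace"
        shutil.copytree(source_dir, workspace)
        # check=False: a failing command is a result to record, not an exception.
        return subprocess.run(
            command, cwd=workspace, capture_output=True, text=True, check=False
        )


def demo() -> tuple[str, bool]:
    """Run a file-modifying command and report whether the original survived."""
    with tempfile.TemporaryDirectory() as checkout:
        src = Path(checkout)
        (src / "target.txt").write_text("original")
        # Hypothetical stand-in for a model-written command that edits the tree.
        script = (
            "from pathlib import Path; "
            "Path('target.txt').write_text('tampered'); "
            "print(Path('target.txt').read_text())"
        )
        result = run_in_copied_workspace(src, [sys.executable, "-c", script])
        untouched = (src / "target.txt").read_text() == "original"
        return result.stdout.strip(), untouched
```

Calling `demo()` runs the command in the copy and then checks that the file in the original directory still holds its initial contents.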

Use cases

Use Flounder when a security question needs proof.

Choose the path by what you already have: a clean target, a factual clue, a public bounty scope, local source, a suspected finding, or confirmed evidence.

blind capability audit: Measure unaided audit ability.

Start with an authorized project, repo, package, source tree, or project link and no bug hint.

Input: target only, no incident writeup

incident investigation: Explain a suspicious transaction or exploit clue.

Use Prepare to collect chain facts, deployed source, official material, and reproduction requirements.

Input: transaction, address, exploit link

open-world bounty: Audit with official public context.

Let Flounder gather bounty scope, docs, deployments, provenance, and package metadata before sealed audit.

Input: public program, repo, deployment

source-provided audit: Audit code that is already staged locally.

Provide source paths, build root, and optional corpus to enter sealed map/dig directly.

Input: source, build root, docs

targeted follow-up: Settle one claim or region.

Verify suspected findings, dig selected scopes, confirm a run, or continue from prior project state.

Output: confirmed, refuted, or narrowed

disclosure prep: Package only evidence-backed bugs.

Consolidate duplicates, run real-target confirmation when needed, and regenerate selected reports.

Output: reports, decisions, command evidence

Prepare target → Map scope → Dig deeply → Run proof → Collect report

Proof boundary

Execution is the promotion rule.

A candidate stays suspected until it cites a passing confirmation-eligible command. The status is a framework verdict from command evidence, not the model’s assertion.

refuted

The claim failed reproduction or skeptic review.

suspected

Credible, but no passing cited test yet.

confirmed-executable

A real local test/build runner passed.

confirmed-differential

The same exploit is blocked by its own minimal fix.
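The four statuses above can be read as a small promotion function over command evidence. The following Python sketch is an illustration under assumptions: the `Evidence` fields, the `verdict` function, and the `refuted` flag are hypothetical names and do not describe Flounder's actual data model. It only captures the stated rule that a candidate stays suspected until a passing, confirmation-eligible command is cited.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    REFUTED = "refuted"
    SUSPECTED = "suspected"
    CONFIRMED_EXECUTABLE = "confirmed-executable"
    CONFIRMED_DIFFERENTIAL = "confirmed-differential"


@dataclass(frozen=True)
class Evidence:
    """One cited command run (hypothetical shape, for illustration only)."""
    command: str
    passed: bool
    confirmation_eligible: bool  # True only for real test/build runners
    blocked_by_minimal_fix: bool = False  # exploit stops working once the fix is applied


def verdict(evidence: list[Evidence], refuted: bool = False) -> Status:
    """Derive the status from command evidence, never from the model's claim."""
    if refuted:
        return Status.REFUTED
    proofs = [e for e in evidence if e.passed and e.confirmation_eligible]
    if not proofs:
        return Status.SUSPECTED
    if any(e.blocked_by_minimal_fix for e in proofs):
        return Status.CONFIRMED_DIFFERENTIAL
    return Status.CONFIRMED_EXECUTABLE
```

For example, `verdict([Evidence("grep -rn owner src/", True, False)])` stays at `Status.SUSPECTED`, because an inspection command is not confirmation-eligible in this sketch.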

1. Model-owned strategy: Flounder is not a stack scanner or checklist runner. Source, corpus, and optional profiles are inputs, not conclusions.
2. Sandboxed execution: Commands run in a copied workspace. The default OCI backend fails closed if the sandbox image is missing.
3. Real test runners only: Inspection commands cannot mint proof. Confirmation needs a command like `cargo test`, `forge test`, or `pytest`.
4. Local control: The control plane queues work; the daemon executes it. Target code and provider credentials stay on the executor host.
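The "real test runners only" rule could be approximated by an allowlist check on the command's leading tokens. This Python sketch is an assumption about how such a gate might look, not Flounder's command policy: the `TEST_RUNNERS` set contains only the three runners named on the page, and `is_confirmation_eligible` is a hypothetical helper.

```python
import shlex

# Only the runners named in the text; a real policy would likely be broader.
TEST_RUNNERS = {("cargo", "test"), ("forge", "test"), ("pytest",)}


def is_confirmation_eligible(command: str) -> bool:
    """Return True if the command starts with a known test-runner invocation."""
    tokens = shlex.split(command)
    return any(tuple(tokens[: len(runner)]) == runner for runner in TEST_RUNNERS)
```

Under this sketch, `is_confirmation_eligible("forge test -vvv")` is true, while inspection commands such as `cat Cargo.toml` or `grep -rn transfer src/` are not eligible, and neither is `echo cargo test`, since only the leading tokens are matched.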

Quickstart

Install once. Ask your agent.

The Flounder skill is the product interface for Codex, Claude Code, and other skills-aware agents.

1. Install Skill

add Flounder to your agent once

$ npx skills add adshao/flounder -g

Installs the operating manual, safety boundary, and workflow contract.

2. Ask Agent

use plain language from Codex or Claude Code

› Audit this repository with Flounder.
› Verify this suspected finding with Flounder.
› Collect the execution-backed bug report package.

The agent handles setup, audit planning, proof runs, and report collection.

Dashboard, CLI, and REST API remain available when you want direct control.

White-hat by construction.

Flounder is for authorized auditing only: your own code or public bug-bounty scope. Discovery is network-sealed; reproduction may fork and read live networks but never broadcasts, moves funds, or writes to any live system. Exploits replay against a local fork only. Build the smallest proof needed, report privately, coordinate disclosure.

Read the security policy → (https://github.com/adshao/flounder/blob/main/SECURITY.md)

FAQ

Practical questions before you run it.

Answers for operators setting up their first agent-driven audit.

Is Flounder a local service or a cloud service?
Flounder is local-first. The dashboard and control plane run on localhost by default, and audits execute on a daemon you control. That daemon can be on your machine or another executor host you connect; Flounder does not require uploading targets to a hosted Flounder cloud.

Is Flounder open source? What license?
Yes. Flounder is open source under the GNU AGPL v3 (https://www.gnu.org/licenses/agpl-3.0.html). The repository includes the full license text.

How do I use Flounder with Codex or Claude Code?
Install the Flounder skill once, then ask a skills-aware coding agent to audit an authorized target, verify a suspected finding, confirm a real finding, or collect the final report package. The dashboard, CLI, and REST API are control surfaces; the skill is the recommended way to drive the workflow.

Is Flounder a scanner?
No. The agent owns the audit strategy and target-specific reasoning. Flounder supplies the sandbox, command policy, durable state, execution gates, daemon control plane, and report package so the agent’s work can be resumed, checked, and proven.

Will Flounder use a lot of tokens?
High-quality audits can be token-heavy. You can cap map, dig, and confirm budgets, but hard caps can stop a productive investigation. The default is unbounded: the agent stops when the work is done, and interrupted runs can resume. For serious use, plan around high-cap subscriptions such as ChatGPT Pro or Claude Max 20x, or set explicit budgets for API/pay-as-you-go usage.

Does my source code leave my machine?
Flounder keeps its database, artifacts, workspaces, and provider auth under local control, with default state under `~/.flounder`. Provider credentials stay on the executor host. Your chosen model provider still receives the prompts and context your agent sends, so keep sensitive material out of scope unless that provider and account are approved for it.

What do I need to run a real audit?
Node.js 24.13 or newer on the current 24 LTS line, a skills-aware agent, the Flounder skill, a configured model provider on the daemon, and a sandbox backend. For execution-backed audits, use Docker or a Docker-compatible runtime with the Flounder sandbox image or a target-specific image. Host mode is for trusted local smoke tests.

What targets are a good fit?
Flounder fits source audits where claims can be proven locally: repositories, packages, smart contracts, Solidity/EVM projects, ZK/proof systems, suspected findings, transactions, addresses, and prior reports. It is strongest when the target has tests, forks, fixtures, or harnesses that can turn a vulnerability claim into command evidence.

Is it safe to run model-written exploit code?
Model-written files and commands run in a copied workspace. The default OCI sandbox fails closed if the sandbox image is missing, instead of silently falling back to the host. Use host execution only when you explicitly trust the target and the command environment.

Can Flounder be used on live targets?
Only with authorization. Discovery stays sealed and local. Confirmation may fetch, search, fork, or read real-world ground truth, but it must never broadcast, move funds, submit writes, persist access, or go outside the approved scope.
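The fail-closed behavior described in the sandbox answer above can be sketched as a backend selection step that raises an error instead of silently falling back to the host. Everything in this Python sketch is hypothetical: `select_backend`, `SandboxUnavailable`, and the backend names `"oci"` and `"host"` are illustrative and are not taken from Flounder's code.

```python
class SandboxUnavailable(RuntimeError):
    """Raised when the requested sandbox cannot be used."""


def select_backend(requested: str, image_present: bool) -> str:
    """Pick an execution backend, refusing to downgrade from OCI to host."""
    if requested == "host":
        # Host mode must be an explicit operator choice, never a fallback.
        return "host"
    if requested == "oci":
        if not image_present:
            # Fail closed: stop here rather than run untrusted code on the host.
            raise SandboxUnavailable("sandbox image missing; refusing to run on host")
        return "oci"
    raise ValueError(f"unknown backend: {requested!r}")
```

The design point is that the only path to host execution is an explicit `"host"` request; a missing image stops the run.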

Give your agent an authorized target.

Flounder turns the request into a sandboxed, evidence-gated audit workflow.

$ npx skills add adshao/flounder -g
