An open-source LLM called OpenMythos was trained for cybersecurity tasks using SFT and RLVR, with datasets available on HuggingFace. The model aims to reduce hallucinations and improve precision in security-related queries.
We built OpenMythos for the Build Small Hackathon: an open-source LLM trained specifically for cybersecurity tasks. We wanted to share our training approach, since the RLVR setup was non-trivial and might be interesting to people doing similar domain-specific fine-tuning.

**The problem**

General-purpose LLMs are surprisingly bad at security. They hallucinate CVE details, miss real vulnerability patterns in code, and sound confident while being wrong in ways that matter. We wanted something that actually had security domain depth baked in.

**Data**

* Scraped 10K ArXiv cs.CR papers → filtered to \~1.84K high-quality records focused on coding vulnerabilities
* Structured CVE dataset with real affected code and remediation context
* Both open on Hugging Face (all links at end of this post)

**Training pipeline**

*Stage 1 - SFT*

Standard supervised fine-tuning on cybersecurity tasks: vulnerability identification, CVE explanation, code review for security issues, mitigation strategies.

*Stage 2 - RLVR*

This is where it got interesting. SFT teaches the model to imitate good responses, but it doesn't make the model verify its own outputs. For security, that gap is dangerous.

We built a reward setup using GitHub repos with paired vulnerable/fixed branches. A verifier model checks each generated response against ground truth: did it identify the right vulnerability? Is the fix actually correct? The reward signal flows from there.

Post-RLVR, the model got noticeably more precise: less conflation of similar vuln classes, and better calibration on uncertainty.
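To make the Stage 2 idea concrete, here is a rough sketch of what a reward of this shape could look like. It is not the actual OpenMythos implementation: the post describes a verifier model, while this sketch swaps in a simple rule-based check so it stays self-contained. `GroundTruth`, `ModelResponse`, `reward`, the 0.0/0.5/1.0 reward values, and the `fix_threshold` similarity cutoff are all hypothetical names and choices made for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class GroundTruth:
    """Hypothetical record derived from a vulnerable/fixed branch pair."""
    vuln_class: str   # label for the vulnerability, e.g. a CWE identifier
    fixed_code: str   # the code as it appears on the fixed branch


@dataclass
class ModelResponse:
    """Hypothetical parsed model output."""
    vuln_class: str
    proposed_fix: str


def _normalize(code: str) -> str:
    # Strip each line and drop blank lines so formatting differences
    # do not affect the similarity score.
    return "\n".join(line.strip() for line in code.splitlines() if line.strip())


def reward(response: ModelResponse, truth: GroundTruth,
           fix_threshold: float = 0.9) -> float:
    """Rule-based stand-in for a verifier (an assumption, not the real one).

    Returns 0.0 if the vulnerability class is wrong, 0.5 if the class is
    right but the fix does not closely match the fixed branch, 1.0 if both
    checks pass.
    """
    identified = (response.vuln_class.strip().upper()
                  == truth.vuln_class.strip().upper())
    if not identified:
        return 0.0
    similarity = SequenceMatcher(
        None, _normalize(response.proposed_fix), _normalize(truth.fixed_code)
    ).ratio()
    return 1.0 if similarity >= fix_threshold else 0.5


if __name__ == "__main__":
    truth = GroundTruth(
        vuln_class="CWE-89",
        fixed_code='cur.execute("SELECT * FROM users WHERE id = %s", (uid,))',
    )
    answer = ModelResponse(vuln_class="cwe-89", proposed_fix=truth.fixed_code)
    print(reward(answer, truth))
```

String similarity against the fixed branch is a crude proxy for "is the fix actually correct": it penalizes valid fixes that are written differently. That is one reason a learned verifier, or running the repo's tests against the patched code, would be a better fit in practice.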
**Links**

* 🤖 Demo: [https://huggingface.co/spaces/build-small-hackathon/OpenMythos](https://huggingface.co/spaces/build-small-hackathon/OpenMythos)
* 🧠 Model: [https://huggingface.co/build-small-hackathon/OpenMythos](https://huggingface.co/build-small-hackathon/OpenMythos)
* 📦 CVE Dataset: [https://huggingface.co/datasets/build-small-hackathon/CVE\_Vulnerailities\_Detailed](https://huggingface.co/datasets/build-small-hackathon/CVE_Vulnerailities_Detailed)
* 📄 ArXiv Filtered: [https://huggingface.co/datasets/himanshu17HF/ArvixImport-Filtered-Final](https://huggingface.co/datasets/himanshu17HF/ArvixImport-Filtered-Final)

Happy to go into detail on the RLVR setup or the filtering pipeline if anyone's curious. We're also looking for feedback on where the model falls short.
Mythos performs strongly on offensive-security tasks, succeeding on 18 of 41 n-day exploits compared to 1 for version 5.5, while open-source models succeed on none.
The article clarifies that the AI model Mythos was not trained on hacking, and predicts that other AI labs will eventually achieve similar capabilities.
The author tests whether other AI models can match Mythos's exceptional ability to find security vulnerabilities, creating a benchmark of bugs found by Mythos and testing models like Opus. Initial results suggest Mythos may be uniquely powerful.
Anthropic is preparing to launch Mythos 1, a powerful AI model for cybersecurity, with preview labels appearing in Claude Code and Claude Security. The model, initially restricted, may see broader release once safeguards are in place, and a new security dashboard is being developed.
OpenAI researchers study worst-case frontier risks of releasing open-weight LLMs through malicious fine-tuning (MFT) in biology and cybersecurity domains, finding that open-weight models underperform frontier closed-weight models and don't substantially advance harmful capabilities.