OpenAI explains why Codex Security deliberately avoids starting with SAST reports, instead analyzing repository architecture and validating findings directly. The approach addresses a core challenge: the hardest vulnerabilities hinge on whether security checks actually hold across transformation chains, not just on tracking data flow.
OpenAI introduces Codex Security, an agentic application security tool now in research preview. It identifies complex vulnerabilities with high confidence and proposes actionable fixes, while significantly reducing false positives and noise compared to traditional security tools.
OpenAI and Paradigm introduce EVMbench, a benchmark that evaluates how well AI agents detect, patch, and exploit smart contract vulnerabilities, built on 117 curated vulnerabilities from 40 audits. On exploit tasks, GPT-5.3-Codex scores 71%, significantly outperforming GPT-5's 33.3%, while detection and patching remain more challenging.
OpenAI announces Aardvark, an AI-powered agentic security researcher built on GPT-5 that automatically identifies, validates, and patches software vulnerabilities in codebases. The tool integrates with GitHub and development workflows to help security teams discover and fix vulnerabilities at scale.
Google DeepMind introduces CodeMender, an AI agent that automatically detects and fixes code security vulnerabilities using advanced reasoning and validation techniques. The system has already upstreamed 72 security fixes to open source projects over six months.
Anthropic has launched Project Glasswing, leveraging its advanced Claude Mythos model to help critical software organizations identify and fix vulnerabilities, with the goal of enhancing global software security through collective defense.