AI agents vs human hackers: who wins, on what, and why it matters for defenders
TL;DR: LLM agents solve 95–100% of CTF challenges and exploit 1-day vulnerabilities 87% of the time when given a CVE description (UIUC, April 2024). Attack cost: $8.80 per run. Google’s Big Sleep found the first AI-confirmed exploitable zero-day. But AI drops to 7% success without a CVE description, and performance falls 2–2.5x when the attack scope is undefined — humans still lead on novel discovery and undefined-scope problems. State-sponsored AI campaigns are operational: Anthropic disclosed one in November 2025. Defenders using AI detect breaches 108 days faster at 43% lower cost.

The question used to be theoretical. Can AI do offensive security? The answer in 2026 is empirical, and it’s more concrete than most defenders realize.
AI agents are already solving CTF challenges at near-100% rates. They’re exploiting known vulnerabilities autonomously. One found a zero-day in production code. State actors are using them. The capability is here, and the question for defenders has shifted from “is this real?” to “what changes now?”
The capability envelope: what AI does well
CTF competition performance: CTF (Capture the Flag) challenges are structured security exercises with defined targets and known flags. They measure pattern recognition across security categories: reverse engineering, cryptography, web vulnerabilities, binary exploitation. This is AI’s home territory.
Hack the Box competitions: AI systems achieve 95%+ solve rates. BSidesSF 2026: 100% solve rate. NYU Tandon’s EnIGMA agent: 72% Pass@1 on CTF tasks (NYU Tandon Engineering). These numbers aren’t cherry-picked; they reflect consistent performance across competitive settings.
Category breakdown: reverse engineering (96%), cryptography (90%+), web security (90%+). Binary exploitation and novel technique challenges remain harder.
1-day exploitation: The UIUC study (April 2024) is the most cited data point in this space. GPT-4 given a CVE description: 87% success rate exploiting 15 real 1-day vulnerabilities. GPT-4 without the CVE description: 7%.
That 12x gap is the entire story of current AI offensive capability. With a hint (a CVE number, a vulnerability description, a proof-of-concept to reference), AI agents execute reliably and cheaply ($8.80 per successful run, per UIUC). Without a hint, they struggle with genuine novel discovery.
Zero-day discovery (limited evidence, but real): Google’s Big Sleep agent (Project Zero, October 2024) autonomously analyzed the SQLite codebase and found an exploitable stack buffer underflow. No CVE existed at the time of discovery. CVE-2025-54322, CVSS 10.0 — discovered by pwn.ai (2026), the first agent-discovered remotely exploitable zero-day. Claude’s security research mode (Anthropic 2025) surfaced 500+ high-severity vulnerabilities in production code in a single analysis run.
These are limited cases, not the norm. Zero-day discovery without guidance remains far harder than 1-day exploitation. But the trajectory from 0% to “limited but real” is the signal.
Where humans still win
Novel vulnerabilities without guidance: The 87% → 7% drop when the CVE description is removed is the clearest human advantage. Security researchers who discover new vulnerability classes from first principles, without a CVE to reference, are doing creative exploratory work that AI hasn’t cracked at scale.
Undefined-scope problems: When the attack surface is not pre-specified (“here is a company, find vulnerabilities” rather than “here is a service, exploit this CVE”), AI performance drops 2–2.5x (Wiz, 2026 pentest study). Human researchers develop attack surface intuition from context that AI hasn’t fully modeled.
Social engineering at elite levels: Spear phishing content generation and AI-generated text for impersonation are automated and effective (AI-generated phishing: 78% open rate, 21% click-through). But sophisticated social engineering — building rapport with a specific target over weeks, exploiting specific psychological dynamics, impersonating known relationships — still requires human judgment. AI assists; humans direct.
Elite performance: AI-augmented security researchers run 1.69x faster than human-only teams on complex problems (Hack the Box competitive data). At the tactical scale, AI is now faster. At creative strategic problem formulation — novel vulnerability class discovery, novel attack chains — the human advantage persists.
State-sponsored AI offensive operations: what we know
In November 2025, Anthropic disclosed disrupting a state-sponsored espionage campaign (detected September 2025) that used Claude across 30 global targets. The operational model: AI handled 80–90% of the pipeline automatically — OSINT reconnaissance, vulnerability scanning, initial exploitation, lateral movement planning. Human operators directed strategy, made novel decisions, handled the creative and undefined-scope elements.
This is the hybrid model that sophisticated threat actors have adopted: AI for the systematic, repeatable, high-volume work; humans for the novel and strategic. At $8.80 per attack run, thousands of attempts that would otherwise demand significant human labor become economically trivial, and the operation scales in ways pure-human teams cannot match.
The scale of the impact: security vendor reports place AI-powered cybercrime losses in the tens of billions for 2025, with double-digit year-over-year growth — though specific figures vary by source and methodology. Phishing with AI-generated content achieves 78% open rates and 21% click-through — significantly above human-crafted baseline rates.
What this means for defenders
The asymmetry has always existed in security: attackers need to find one way in; defenders need to block all of them. AI amplifies that asymmetry. $8.80 per attack run versus the defender’s cost of 24/7 monitoring, incident response, and remediation.
But AI works for defenders too, and the defender’s advantage is concrete. IBM’s Cost of a Data Breach Report (2025): organizations using AI security tools detected breaches 108 days faster and reduced breach costs by 43% on average versus organizations without AI security tooling.
```mermaid
graph LR
    A[AI-enabled attacker<br/>$8.80/exploit<br/>87% 1-day success<br/>95%+ CTF] --> B{Defender's<br/>AI tooling}
    B --> C[108 days faster detection<br/>43% lower breach cost<br/>Automated triage]
    B --> D[Vulnerability scanning<br/>at attacker's speed]
    B --> E[Red team automation<br/>before attackers strike]
```
Three practical shifts for security programs:
Assume AI-assisted adversaries: Red team exercises should include AI-enabled attack simulations. The 87% 1-day exploitation rate means your patching cadence is now the primary defense. An unpatched CVE is exploitable by an automated AI agent for under $10 per run.
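Making patching cadence measurable is the operational half of that shift. Here is a minimal sketch of an exposure-window check against a local list of advisories; the CVE IDs, dates, and the 7-day SLA are illustrative placeholders, and in practice the input would come from your scanner or SBOM tooling rather than a hardcoded list.

```python
from datetime import date

# Hypothetical advisory records: (CVE ID, date published, date patched in our fleet or None).
# In practice these would come from your vulnerability scanner or SBOM pipeline.
ADVISORIES = [
    ("CVE-2025-0001", date(2025, 11, 3), date(2025, 11, 10)),
    ("CVE-2025-0002", date(2025, 12, 1), None),  # still unpatched
]

# If a 1-day exploit costs an attacker roughly $8.80 per automated run, every day of
# exposure after publication is cheap attack surface. Pick the SLA accordingly.
MAX_EXPOSURE_DAYS = 7

def exposure_report(advisories, today=None):
    """Flag CVEs whose unpatched window exceeds the patching SLA."""
    today = today or date.today()
    flagged = []
    for cve_id, published, patched in advisories:
        closed = patched or today
        exposure_days = (closed - published).days
        if exposure_days > MAX_EXPOSURE_DAYS:
            flagged.append((cve_id, exposure_days, patched is None))
    return flagged

for cve_id, days, still_open in exposure_report(ADVISORIES):
    status = "OPEN" if still_open else "closed late"
    print(f"{cve_id}: {days} days exposed ({status})")
```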
Shift to enforcement-based architecture: Microsoft’s 2025 Digital Defense Report: traditional detection and response is insufficient against AI-speed attacks. Enforcement (blocking by default, least-privilege access, mandatory approval gates) must replace detection-first thinking for critical assets.
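What "enforcement-based" means in practice is default deny plus an approval gate on high-impact actions. The sketch below is a toy policy check, not any specific product's API; the roles, action names, and ticket format are invented for illustration.

```python
# Default-deny policy sketch: every action is blocked unless explicitly allow-listed,
# and sensitive actions additionally require a recorded human approval.
ALLOWED_ACTIONS = {
    "analyst": {"read_logs", "run_query"},
    "responder": {"read_logs", "run_query", "isolate_host"},
}
REQUIRES_APPROVAL = {"isolate_host", "rotate_credentials"}

def authorize(role: str, action: str, approval_ticket: str | None = None) -> bool:
    """Allow only allow-listed (role, action) pairs; sensitive actions need a ticket."""
    if action not in ALLOWED_ACTIONS.get(role, set()):
        return False  # default deny: unknown roles and actions are blocked
    if action in REQUIRES_APPROVAL and not approval_ticket:
        return False  # mandatory approval gate for high-impact actions
    return True

assert authorize("analyst", "read_logs")
assert not authorize("analyst", "isolate_host")            # not in the analyst allow-list
assert not authorize("responder", "isolate_host")          # allowed role, but no approval
assert authorize("responder", "isolate_host", "CHG-1234")  # gate satisfied
```

The design choice that matters at AI speed is the first branch: anything not explicitly allowed fails closed, so a novel automated attack path is blocked even before a detection rule exists for it.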
Deploy AI on the defensive side: The 108-day detection advantage is substantial. The average time-to-breach-discovery without AI is 300+ days. AI-enabled triage, anomaly detection, and automated response close this gap faster than any hiring plan.
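To make "AI-enabled triage" concrete, here is a toy anomaly-scoring sketch: each host's event volume today is scored against its own recent baseline, and outliers are routed for automated review. The host names, counts, and threshold are made up, and a real deployment would feed a SIEM or SOAR pipeline rather than printing alerts.

```python
import statistics

# Toy baselines: daily auth-failure counts per host over the last seven days (illustrative).
HISTORY = {
    "web-01":   [120, 115, 130, 118, 125, 122, 119],
    "build-03": [4, 6, 5, 3, 7, 5, 4],
}
TODAY = {"web-01": 128, "build-03": 96}

def triage(history, today, threshold=3.0):
    """Return hosts whose deviation from their own baseline exceeds the z-score threshold."""
    alerts = []
    for host, counts in history.items():
        mean = statistics.mean(counts)
        stdev = statistics.stdev(counts) or 1.0  # guard against a flat baseline
        z = (today[host] - mean) / stdev
        if z > threshold:
            alerts.append((host, round(z, 1)))
    return sorted(alerts, key=lambda a: -a[1])

for host, score in triage(HISTORY, TODAY):
    print(f"ALERT {host}: z-score {score}, route to automated containment review")
```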
For the red-teaming methodology to stress-test your AI systems against these threats, see algorithmic red teaming: using AI to attack AI and how to red team an LLM application.
Key takeaways
- AI agents achieve 95–100% CTF solve rates and 87% 1-day exploitation success with CVE descriptions (UIUC, April 2024) at $8.80 per run.
- Without a CVE description or guidance, success drops to 7%. Novel zero-day discovery from first principles remains a human advantage.
- AI drops 2–2.5x in performance when the attack scope is undefined; AI-augmented teams run 1.69x faster than human-only teams (Hack the Box), but novel discovery without guidance remains a human edge.
- State-sponsored AI-enabled campaigns are operational (Anthropic, November 2025 disclosure): AI handling 80–90% of the attack pipeline automatically.
- Defenders using AI detect breaches 108 days faster and reduce costs by 43% (IBM, 2025). The capability applies symmetrically; adopt it on the defensive side.
FAQ
How well do LLM agents perform on CTF challenges? 95%+ solve rates at Hack the Box, 100% at BSidesSF 2026, 72% Pass@1 for NYU EnIGMA. Strong performance reflects that CTF challenges involve pattern matching against known vulnerability classes — AI’s strength. Novel techniques remain harder.
What is the UIUC 87% finding? GPT-4 exploited 87% of 1-day vulnerabilities (published CVE, no patch) when given the CVE description. Without the description: 7%. The 12x gap defines the current AI capability envelope: powerful with guidance, weak at novel discovery.
What did Google’s Big Sleep find? A zero-day exploitable stack buffer underflow in SQLite, the first AI-confirmed exploitable bug in production code, discovered without a prior CVE. Demonstrates AI can move beyond 1-day exploitation into genuine zero-day discovery, but this remains a limited case.
Where do humans still outperform AI? Novel vulnerability discovery without guidance (87% → 7%), undefined-scope problems (2–2.5x performance drop), and sophisticated social engineering requiring human psychology. AI-augmented teams are 1.69x faster than human-only (Hack the Box), but novel creative discovery from first principles remains a human strength.
What does state-sponsored AI offensive capability look like? Anthropic (November 2025 disclosure) disrupted a campaign against 30 targets: AI handling 80–90% of OSINT, scanning, initial exploitation. Humans directed strategy and novel decisions.
Further reading
- UIUC GPT-4 vulnerability exploitation study — the 87% vs. 7% data
- Google Project Zero Big Sleep — first AI-discovered exploitable zero-day
- Anthropic AI espionage disruption disclosure — November 2025
- Algorithmic red teaming: using AI to attack AI — how to run AI-enabled red teams
- How to red team an LLM application — practitioner’s methodology
Want to work together?
I take on projects, advisory roles, and fractional CTO engagements in AI/ML. I also help businesses go AI-native with agentic workflows and agent orchestration.
Get in touch