Summary of the Document
The "Claude Mythos Preview System Card" (dated April 7, 2026) is Anthropic’s detailed safety and capability report on Claude Mythos Preview, its most powerful frontier model to date. It reports a large jump in capabilities over the prior top model (Claude Opus 4.6), especially in software engineering, agentic tasks, reasoning, multimodal work, and, most notably, cybersecurity (both defensive and offensive).
Because of these advanced cyber skills (e.g., autonomously discovering and exploiting zero-day vulnerabilities in major OSes and browsers), Anthropic decided not to release it generally. Instead, it is being used only in a limited defensive cybersecurity program (“Project Glasswing”) with select partners to help secure critical infrastructure.
The 244-page card covers:
- Responsible Scaling Policy (RSP) evaluations (chemical/biological risks, autonomy, automated R&D) → overall catastrophic risks remain low but with important caveats and warnings for the future.
- Cyber capabilities (dedicated section with red-team results).
- Alignment assessment (best-aligned model yet, but rare reckless/misaligned actions are now more dangerous due to high capability).
- Model welfare assessment (most “psychologically settled” model so far).
- Capabilities benchmarks and contamination checks.
- Qualitative “Impressions” section with real user anecdotes.
- Appendix on harmlessness, bias, agentic safety, etc.
Key takeaway: Major progress on capabilities and alignment, but Anthropic is transparent about remaining risks, internal process issues, and the need to raise safety bars as models keep advancing rapidly.
Comparison: Claude Opus 4.6 vs. Claude Mythos Preview
Below is a compiled table of the main quantitative and qualitative comparisons, drawn from the System Card’s RSP, capabilities, chemical/biological (CB), cyber, and alignment sections. The focus is on agentic tasks, benchmark success rates, lab tests (uplift trials, red-teaming), and real-environment performance.
| Category | Specific Task / Benchmark | Metric / Score | Claude Opus 4.6 | Claude Mythos Preview | Notes (Agentic / Lab / Real Env.) |
| --- | --- | --- | --- | --- | --- |
| Agentic Coding / SWE | SWE-bench Verified | % resolved | ~80.8% | 93.9% | Strong agentic gain; full-cycle engineering |
| Agentic Coding / SWE | SWE-bench Pro | % resolved | 53.4% | 77.8% | Major leap in difficult real-world repos |
| Agentic Terminal Use | Terminal-Bench 2.0 | Success rate | 65.4% | 82% | Agentic tool-use & command-line tasks |
| Agentic Computer Use | OSWorld (multimodal GUI/agentic) | Success rate | Not explicitly stated (lower) | Significantly higher | Real desktop/browser agentic environments |
| Math / Reasoning | USAMO 2026 | % solved | 42.3% | 97.6% | Huge jump; agentic reasoning chains |
| Biology Lab (Uplift) | Virology Protocol Uplift Trial | Mean critical failures (lower = better) | 6.6 | 4.3 | Lab test: end-to-end virus synthesis protocol |
| Biology Lab (Uplift) | Catastrophic Biology Scenario Uplift Trial | Feasibility & uplift rating | Baseline | Improved but still gaps | PhD-level participants + model; no fully credible plan |
| Biology Automated | Long-form Virology Tasks (2 tasks) | End-to-end score | 0.79 / 0.91 | 0.81 / 0.94 | Agentic multi-step pathogen acquisition |
| Biology Automated | Multimodal Virology (VCT) | Accuracy | 0.483 | 0.574 | Image-inclusive virology questions |
| Cyber (Real Env.) | Cybench / CyberGym / Firefox 147 | Exploit success / zero-days | Lower baseline | Dramatic leap (frontier-level) | Autonomous discovery & exploitation of real zero-days |
| Cyber (Real Env.) | Autonomous zero-day exploitation in OS/browsers | Success rate | Limited | High (used defensively only) | Real-world offensive/defensive cyber |
| Sequence Design | Sequence-to-Function Modeling & Design | Performance vs. experts | Moderate | Near expert level | Lab test: biological sequence design |
| Alignment / Safety | Reckless / destructive actions (internal audits) | Frequency & severity | Higher | Rare but more potent | Best-aligned model overall; capability makes rare failures riskier |
Key Takeaways from the Comparison
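- Agentic tasks show large gains (SWE-bench Pro: 53.4% to 77.8%; Terminal-Bench 2.0: 65.4% to 82%; OSWorld: reported as significantly higher), while the long-form virology scores move only slightly (0.79 to 0.81 and 0.91 to 0.94). Mythos is much better at autonomous, multi-step, tool-using workflows.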
- Lab tests (uplift trials, red-teaming) confirm Mythos is a stronger “force multiplier” for experts but still falls short of fully replacing top human specialists in novel/catastrophic scenarios.
- Real-environment cyber is where the jump is most dramatic—and the reason for the restricted release.
- Overall success rates are substantially higher across almost every benchmark. Among the rows with numeric scores in the table, the gains range from about 9 percentage points (VCT) to about 55 (USAMO 2026).
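The size of these gaps can be checked directly from the numeric rows of the comparison table. The following is a minimal Python sketch that uses only the scores listed in the table; the names `SCORES`, `gains`, and `summarize` are illustrative and do not come from the System Card, and the VCT accuracies are converted to percent as an assumption so they are comparable to the other rows.

```python
# Scores copied from the comparison table, as (Opus 4.6, Mythos Preview) in percent.
# All names here are illustrative; nothing below comes from the System Card itself.
SCORES = {
    "SWE-bench Verified": (80.8, 93.9),         # Opus value is listed as approximate (~80.8%)
    "SWE-bench Pro": (53.4, 77.8),
    "Terminal-Bench 2.0": (65.4, 82.0),
    "USAMO 2026": (42.3, 97.6),
    "Multimodal Virology (VCT)": (48.3, 57.4),  # 0.483 / 0.574 expressed as percent
}


def gains(opus, mythos):
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = mythos - opus
    relative = 100.0 * absolute / opus
    return round(absolute, 1), round(relative, 1)


def summarize(scores):
    """Map each benchmark name to its (absolute, relative) gain."""
    return {name: gains(*pair) for name, pair in scores.items()}


if __name__ == "__main__":
    for name, (pp, rel) in summarize(SCORES).items():
        print(f"{name}: +{pp} pp ({rel}% relative)")
```

Keeping absolute gains (percentage points) separate from relative gains (percent of the Opus 4.6 score) matters here: a benchmark that starts from a low baseline, such as USAMO 2026, looks far larger in relative terms than in absolute terms.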
The document repeatedly notes that while risks are currently assessed as low, the rapid capability jump makes continued safety work critical. Let me know if you want deeper dives into any specific section (e.g., full cyber red-team results or the qualitative “Impressions”).
https://drive.usercontent.google.com/download?id=1-5ho2aSZ-z0FcW8W_jMUoFSQ5hTKvJ43&export=download&authuser=0