Claude Mythos AI Finds 10,000 High-Severity Flaws in Widely Used Software
https://thehackernews.com/2026/05/claude-mythos-ai-finds-10000-high.html
Let's not bury that image content.
Why only review 1900? How were these chosen? Were the 1259 that were not reported to maintainers just duplicates or were they even valid?
They just spammed the maintainers with these without reviewing them?
Does "acknowledged" mean they said they received the report, or does it mean they validated the report? Because it looks a lot like "received" when you account for that prior 1259 gap and the fact that the bulk of them weren't reviewed prior to sending.
But that 1726 was reduced to 467 come reporting time. Which makes that 17% hit rate possibly... 4.7%?
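For anyone checking the math, here is a rough sketch of where those two percentages seem to come from. It assumes the denominator is the roughly 10,000 findings from the headline, which is an assumption and not something confirmed in this thread; the function and variable names are made up for illustration.

```python
def hit_rate(hits: int, total: int) -> float:
    """Return hits as a percentage of total."""
    return 100 * hits / total


# Assumption: the denominator is the ~10,000 findings from the headline.
TOTAL_FLAGGED = 10_000

counted_hits = 1726   # the figure behind the quoted 17% hit rate
not_reported = 1259   # findings that never went to maintainers

# What is left once the unreported findings are taken out.
reported = counted_hits - not_reported

print(reported)                                          # 467
print(f"{hit_rate(counted_hits, TOTAL_FLAGGED):.1f}%")   # 17.3%
print(f"{hit_rate(reported, TOTAL_FLAGGED):.1f}%")       # 4.7%
```

If the real denominator is something other than 10,000, both percentages shift, but the ratio between them stays the same.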
MYTHOS IS TOO POWERFUL TO RELEASE /s
Yeah they re-release this "news" with fudged numbers every so often it seems, and the media just eats it up and shits it back out with about as much time and care as they took auditing the software.
I assume this article is slop. It contradicts the 10k high-severity claim by paragraph 3. Not even Anthropic claims that in their media release, which contains even sadder numbers.
Because of how time works...
It just throws out thousands of red flags, then humans need to review those bits.
But that's a one-time thing. What matters more than the first run is how many new red flags it generates over a given period.
I'd also be interested to see how this performs versus the same number of person-hours of human code review.
Like, they're gonna find shit, even if looking at random code.
So ideally we'd need to take identical code and audit it three ways:
AI telling humans where to look
Humans looking randomly
Humans telling humans where to look
And I feel like once again we're gonna see that the only real benefit of "AI" as we know it is doing the initial basic steps. It's heavily biased toward false red flags, but we want that, because a false positive just wastes some hours of looking into it, while a false negative means a bug goes uncaught.
If it had a 100% hit rate, with every flag turning out to be a real bug, it would just mean it's definitely not catching everything.
But again, that confusion is because people keep acting like it can replace humans instead of assisting them.
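The false positive versus false negative trade-off above can be sketched with a toy example. Everything here is hypothetical: the scores, the labels, and the names `Finding` and `precision_recall` are invented for illustration and are not data from the article or from any real scanner.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    score: float    # hypothetical scanner confidence, 0 to 1
    is_real: bool   # what human review eventually concluded


def precision_recall(findings, threshold):
    """Flag everything at or above threshold, then score the result."""
    flagged = [f for f in findings if f.score >= threshold]
    true_pos = sum(f.is_real for f in flagged)
    total_real = sum(f.is_real for f in findings)
    # Precision: share of flags that were real bugs.
    precision = true_pos / len(flagged) if flagged else 1.0
    # Recall: share of real bugs that got flagged.
    recall = true_pos / total_real if total_real else 1.0
    return precision, recall


# Made-up findings: 4 real bugs, 4 false alarms.
FINDINGS = [
    Finding(0.95, True), Finding(0.90, True), Finding(0.70, False),
    Finding(0.60, True), Finding(0.40, False), Finding(0.30, True),
    Finding(0.20, False), Finding(0.10, False),
]

# Strict threshold: every flag is real, but half the bugs are missed.
print(precision_recall(FINDINGS, 0.9))    # (1.0, 0.5)

# Loose threshold: two wasted reviews, but no real bug is missed.
strict_free_precision, full_recall = precision_recall(FINDINGS, 0.25)
print(round(strict_free_precision, 2), full_recall)   # 0.67 1.0
```

In this toy setup the "perfect" hit rate only shows up when the tool is too conservative to catch everything, which is the point being made above.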