Anthropic has unveiled a new AI model, Claude Mythos Preview, capable of identifying hundreds of previously unknown high-severity vulnerabilities, including more than 500 zero-day flaws in open-source software during testing. The model demonstrated the ability to autonomously analyze codebases and surface security weaknesses at scale, significantly accelerating vulnerability discovery.
Testing also showed the model could identify vulnerabilities across major operating systems, web browsers, and widely used software, with some findings involving long-standing flaws that had gone undetected for years.
Due to these capabilities, Anthropic has restricted access to 40 technology companies, including Apple, Amazon and Microsoft, under its “Project Glasswing” initiative rather than releasing the model publicly. The limited group of organizations will use the model to find and patch security vulnerabilities in critical software programs.
Anthropic said the controlled rollout is intended to evaluate both defensive and offensive implications of AI-driven vulnerability discovery, while working with the select partners to manage risks associated with misuse of the technology.
“The goal is both to raise awareness and to give good actors a head start on the process of securing open-source and private infrastructure and code,” Jared Kaplan, Anthropic’s chief science officer, said.
Nick Mo, CEO & Co-founder, Ridge Security Technology Inc.:
“You can also look at this from another angle: try using Claude to write some code and see how many bugs, or even new zero-days, it produces. Claude Code is already making developers many times more productive than before, which means the number of potential vulnerabilities being introduced is also many times greater. It’s writing code and writing vulnerabilities at the same time. No wonder they’re rushing to get security companies involved first. Digging holes and filling them simultaneously, the question is just which side is faster.”
Noelle Murata, Sr. Security Engineer, Xcape, Inc.:
“Anthropic’s Claude Mythos Preview has effectively industrialized zero-day discovery, identifying over 500 high-severity vulnerabilities in core open-source software that escaped decades of human and automated scrutiny. These findings include a 27-year-old remote crash bug in OpenBSD and a 16-year-old flaw in FFmpeg, surfaced by a “hypothesize-and-verify” loop that autonomously confirms exploits before reporting them.
“To manage this massive “vulnerability debt,” Anthropic launched Project Glasswing, a restricted partnership with 40 tech giants like Microsoft and Apple to coordinate global patching. By pledging $100 million in compute credits to open-source maintainers, the initiative aims to bridge the gap between AI-driven discovery and the human speed of remediation, ensuring that the “Glasswing 40” don’t become the only secure entities on an otherwise broken Internet.
“If Project Glasswing is a “cyber-nuke,” Anthropic is attempting to ensure the “mutually assured destruction” of bugs happens in a controlled vacuum before it hits the production Internet.”
Steven Swift, Managing Director, Suzu Labs:
“Anthropic has a reputation for exaggerating the capabilities of their models, especially around their ability to find novel vulnerabilities. For example, their models have struggled with line(s) of code that could be vulnerable, but only if you ignored the preceding lines of code, that properly handled the risk and left no residual vulnerability.
“Looking at what they’ve published so far in their Mythos Preview, they’re again making big claims. Particularly of note, is that the community is not being given access to the model at this time. That means it isn’t possible to audit big claims, and we’re left with Anthropic asking us to trust them, despite having established a pattern of misrepresentation and exaggeration on many of their other publications.
“Let’s take a closer look at what they’re claiming, and what they’re willing to provide details on. The claim is that Mythos can find and fix novel vulnerabilities in secure code bases, that have been competently hardened via legacy tooling and review processes. To provide evidence of this capability they describe finding vulnerabilities in the following software packages: OpenBSD, FFMPEG codec H.264, an undisclosed VMM, and “several thousand more.”
“They estimate they spent $20,000 to find the OpenBSD bug, though they said that was the total run, which found other bugs as well.
“Great, we have two specific vulnerabilities that they’ve specifically chosen to highlight.
“They accurately highlight the difference between a vulnerability, a POTENTIAL weakness, and an exploit, a functioning piece of code that takes advantage of one or more vulnerabilities.
“We then move on to exploit development, which is COMPLETELY different than discovering vulnerabilities. Exploits are just code. If you provide any major LLM sufficient detail of how an exploit works, it should be able to generate a functioning exploit. This is not new. It does, however, rely on two things: 1) sufficient detail for the exploit, and 2) sufficient detail for the system that is being exploited.
“They describe writing an exploit for FreeBSD which did not require human-in-the-loop interactions. However, they point out that Opus was also able to exploit the same vulnerability, though it did require such human input.
“Additionally, when looking at the Linux kernel, they admit that they were not able to create functioning exploits with the “vulnerabilities” that were discovered.
“They also go into great detail about a kernel exploit that Claude wrote. But for this exploit to be possible, they had to provide it PREVIOUSLY DISCOVERED context from a fuzzer. That is again, very much NOT Mythos discovering and exploiting a vulnerability. But merely demonstrating that if you provide sufficient context, these models can write code. This is the capability that they chose to highlight with the longest and most detailed technical breakdown. And while the exploit that was eventually developed is claimed to elevate privileges to root, it needs to be emphasized again here. Mythos did not “discover” this vulnerability. It merely wrote some code, after being provided sufficient technical information into its context as to what code it should write.
“Anthropic knows what they’re doing. They’re making big claims, because attention is good for their business model. They’re providing just enough detail so that their claims look convincing at first glance. But when you look closer, claims lack substance and rely on implications that all of the examples related prove their claims. This lets the reader naturally jump to conclusions that aren’t explicitly stated, but are easy to make. And they bury this under a lengthy, fairly technical document. Making it yet more challenging for readers to decipher.”
Sunil Gottumukkala, CEO, Averlon:
“Mythos Preview signals that zero-day discovery is becoming cheaper, faster, and more scalable. Researchers have already shown earlier models can help find serious vulnerabilities, but this represents a real capability jump. Even with restricted access, the broader implication is clear: we should expect more dangerous vulnerabilities to be found across major software platforms, and many organizations still don’t patch fast enough to keep up.
“Once a patch is released, adversaries often move quickly to reverse engineer it and build exploits. At that point, the impact extends well beyond the small group with direct access to the model, potentially increasing overall breach volume.”
Joshua Marpet, Senior product security consultant, Finite State:
“Anthropic limiting Mythos access to top defenders via Project Glasswing is a fantastic first step, but it needs to be codified and expanded. Expect a new model to completely break the security landscape every six to twelve months.
“The speed of this evolution is staggering. Three years ago, LLMs barely wrote functional code. Today, they’re autonomously surfacing zero-days at scale. Tomorrow, they’ll be pointed directly at compiled binaries and firmware, exploiting the products we actually ship, not just source repositories. What does this look like five years from now?
“Future breakthroughs won’t always come with responsible disclosure. The next leap in offensive AI will easily emerge from adversaries with zero intention of giving us a “head start.”
“Security teams are already drowning. When adversaries start using autonomous agents to uncover zero-days, manual triage will completely break. We must shift immediately to defensive systems that cut through the noise and automatically prioritize real, reachable exposure.
“We have to think beyond corporate consortia. We need a completely new wing of the intelligence community, agencies where humans and autonomous AI agents work side-by-side to acquire, analyze, and counter advanced adversary models.
“The offensive landscape just went autonomous. We can no longer fight machine-speed threats with manual, point-in-time reviews. Defense must become as continuous and autonomous as the attacks coming our way.”
Bad guys are going to use this technique to pwn you. Thus you really need to put the time and effort into making sure that everything that you use is as secure as possible. And then you need to keep going back and reconfirming that you are still secure because the bad guys are going to do the same thing.
Flashpoint Discusses Tax Refund Fraud in 2026
Posted in Commentary with tags Flashpoint on April 10, 2026 by itnerd
There’s a new blog post from Flashpoint that covers tax refund fraud in 2026 and how threat actors are weaponizing identity data, verification systems, and cash-out channels at scale. The piece breaks down how fraudsters move from sourcing “fullz” and clients to bypassing government identity verification, inflating refunds, and rapidly converting payouts into cash or cryptocurrency while using highly structured, repeatable workflows.
In the piece, the Flashpoint Intel Team explains how tax refund fraud has evolved into a mature, community-driven fraud ecosystem, where identity theft, social engineering, and verification bypass techniques are continuously refined and shared across Telegram channels, dark web forums, and illicit marketplaces. They walk through the end-to-end fraud lifecycle from identity acquisition and return verification bypass to cash-out via banking apps, prepaid cards, and crypto exchanges, and highlight what these patterns mean for security and fraud teams trying to move from reactive detection to proactive disruption.
Additional key insights from the 2026 tax refund fraud landscape include:
The full post can be found at: https://flashpoint.io/blog/tax-refund-fraud-in-2026-how-threat-actors-exploit-identity-verification-and-cash-out-channels/.