Meta pauses work with Mercor after supply chain breach raises risk to AI training data

As first reported by Wired on Friday, Meta has paused all work with AI data startup Mercor following a confirmed security breach linked to a supply chain attack involving the LiteLLM open-source project, which impacted thousands of organizations globally.

Mercor, which provides proprietary training data to major AI companies including Meta, OpenAI, and Anthropic, said it was among those affected and has launched an investigation with third-party forensic experts.

The breach raised concerns about potential exposure of sensitive AI training data and internal datasets, which are used to develop and fine-tune large language models. Reports indicate that Mercor’s systems were impacted as part of a broader compromise involving malicious updates to widely used AI tooling, though it remains unclear what specific data was accessed.

Michael Bell, Founder & CEO of Suzu Labs, had this comment:

   “The Mercor breach is what happens when the companies building the most valuable AI models in the world outsource the creation of their training data to vendors running on Airtable and shared passwords. A single poisoned open-source package gave attackers VPN credentials, and from there they walked through Mercor’s systems and took 4TB of proprietary datasets, source code, and contractor PII.

   “We’ve been investigating these AI data vendors for months and found the same structural failures at Sama, Teleperformance, Scale AI, and Cognizant: unrotated credentials, info-stealer infections on contractor endpoints, and access controls that don’t exist. The training data behind every major frontier model is sitting inside vendors that wouldn’t pass a basic security audit, and now that data is on an extortion site. This is a national security problem dressed up as a vendor management failure.”

Lydia Zhang, President & Co-Founder of Ridge Security Technology Inc., adds this comment:

   “This incident alerts us that AI training data should be treated as critical infrastructure, subject to stricter security scrutiny and regulation.

   “The breach also underscores the risks of relying directly on open-source projects in enterprise environments. Supply chain attacks, like the compromised LiteLLM library in this case, can introduce vulnerabilities at scale and expose highly sensitive data. 

   “At a minimum, enterprises should adopt thoroughly tested and commercially supported versions of such components, with stronger security guarantees and accountability.”

Noelle Murata, Sr. Security Engineer, Xcape, Inc. provided this comment:

   “Meta’s indefinite suspension of its partnership with Mercor underscores how the AI industry’s rush to outsource training data has effectively liquidated billions in proprietary methodology. By allowing a poisoned version of the LiteLLM gateway (versions 1.82.7 and 1.82.8) to persist in their environment, Mercor gifted attackers 4 TB of data, including the precise “secret sauce” protocols Meta and OpenAI use to tune their models.

   “This was not a sophisticated zero-day; it was a basic supply chain failure where a compromised security scanner (Trivy) was used to poison a niche dependency that nobody bothered to pin. For anyone surprised that an autonomous, interconnected AI stack would eventually expose sensitive data to the internet, the lesson is clear. 

   “If you are not auditing your data vendors for basic dependency hygiene, your IP is already public property. Defenders must immediately scan for litellm_init.pth files, which provide stealthy persistence on every Python startup, and rotate all LLM provider API keys and cloud tokens. Protecting training integrity now requires treating every AI data broker as a high-risk production endpoint and enforcing strict, pinned Software Bill of Materials (SBOM) standards.

   “If your AI supply chain is this leaky, you are not training a model; you are just broadcasting a technical manual to Lapsus$.”
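For defenders who want to act on Murata's advice about hunting for the litellm_init.pth persistence artifact, a rough sketch of such a scan might look like the snippet below. The function name `find_suspect_pth` and the choice of scan roots are illustrative assumptions, not from any official incident-response guidance; the underlying mechanism is real, though, since Python executes `.pth` files in site-packages directories at interpreter startup, which is what makes them attractive for stealthy persistence.

```python
import site
from pathlib import Path

# Name of the persistence artifact reported in connection with the breach.
SUSPECT_NAME = "litellm_init.pth"

def find_suspect_pth(roots):
    """Return paths to any litellm_init.pth files found under the given roots.

    .pth files placed in site-packages are processed automatically at
    every Python startup, so a malicious one runs without any obvious
    trace in application code.
    """
    hits = []
    for root in roots:
        root = Path(root)
        if not root.is_dir():
            continue
        # Walk the tree looking for .pth files with the suspect name.
        for path in root.rglob("*.pth"):
            if path.name == SUSPECT_NAME:
                hits.append(path)
    return hits

if __name__ == "__main__":
    # Scan the site-packages locations the running interpreter knows about.
    roots = set(site.getsitepackages()) | {site.getusersitepackages()}
    for hit in find_suspect_pth(roots):
        print(f"SUSPECT persistence file: {hit}")
```

A scan like this only covers the Python environments it is pointed at; in practice you would run it across contractor endpoints and CI runners, and pair it with the key rotation and pinned-dependency (SBOM) controls Murata describes.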

Supply chain vulnerabilities are real. If your organization doesn’t take them seriously, your organization will get pwned. It’s as simple as that. And you can double that if AI is involved.
