- → Three-stage cascading supply chain attack: Trivy (vulnerability scanner) → LiteLLM (PyPI package) → Mercor (production environment)
- → Malicious LiteLLM versions 1.82.7 and 1.82.8 were live on PyPI for ~40 minutes. Automated CI/CD pipelines pulled them instantly.
- → Lapsus$ claims ~4 TB of data: 939 GB source code, 211 GB user database, ~3 TB video interviews with face/voice biometrics, passport scans, KYC documents
- → Lapsus$ gained full Tailscale VPN access to Mercor's internal network
- → Unconfirmed report: a developer may have exposed production credentials through an AI coding assistant
- → Mandiant estimates the broader TeamPCP campaign has hit 500,000 machines and 1,000+ SaaS environments
- → Class-action lawsuit investigation is underway. This is the fourth major AI platform breach in four months.
This analysis draws from TechCrunch, The Register, SecurityWeek, Wiz, ReversingLabs, Datadog Security Labs, Cybernews, The Record, and Chinese-language coverage on LINUX DO (linux.do). Lapsus$ claims are unverified by independent forensics — the 4 TB figure and Tailscale VPN access come from Lapsus$'s own Telegram channel and leak site. Mercor has confirmed a "security incident" tied to the LiteLLM compromise but has declined to confirm or deny whether customer/contractor data was accessed or exfiltrated. We distinguish between verified technical details and unverified threat actor claims throughout. This analysis cross-references fourteen sources across English and Chinese.
What Is Mercor, and Why Is This Breach Catastrophic?
Mercor is an AI-powered hiring platform founded in 2023 by Brendan Foody, Adarsh Hiremath, and Surya Midha. It uses AI to screen resumes, conduct video interviews, and match candidates with employers. Its client list reads like a who's-who of AI: OpenAI, Anthropic, and hundreds of enterprises that rely on Mercor to contract specialized domain experts — scientists, doctors, lawyers — for AI model training. The company facilitates over $2 million in daily payouts, has screened 300,000+ resumes, conducted 100,000+ video interviews, and was valued at $10 billion after a $350 million Series C led by Felicis Ventures in October 2025.
That is the business story. Here is the security story: Mercor possesses what might be the most dangerous combination of personal data in the AI industry. Not credit card numbers. Not email addresses. Your face. Your voice. Your passport. Your government-issued ID. Video recordings of you answering interview questions. KYC verification documents. Employment history. Skills assessments. Salary expectations. References. And all of it organized, indexed, and searchable by an AI system designed to make retrieval as fast and complete as possible.
You cannot reset your face. You cannot rotate your voice. You cannot issue a new biometric identity like you issue a new credit card. If your biometric data leaks, it leaks forever. There is no remediation. There is no "password reset" for your face. And Mercor just had approximately 3 terabytes of video interviews — containing face and voice biometrics of hundreds of thousands of people — claimed as stolen by one of the most prolific extortion groups on the planet.
The Attack Chain: Three Dominoes, Four Days, Four Terabytes
The Mercor breach was not a direct attack. Nobody phished a Mercor employee. Nobody found a zero-day in Mercor's application. The attackers never had to touch Mercor at all — until Mercor's own build pipeline invited them in. This is a three-stage cascading supply chain attack, and understanding how each domino fell is critical to understanding why your organization is almost certainly vulnerable to the same pattern.
Stage 1 — Poisoning the Vulnerability Scanner (Late February – March 19)
TeamPCP — the threat group behind the campaign — first compromised Aqua Security's Trivy, one of the most trusted open-source vulnerability scanners in the industry. They exploited a pull_request_target workflow vulnerability in Trivy's GitHub Actions configuration to exfiltrate the aqua-bot service account credentials and rewrite Git tags.
They simultaneously compromised Checkmarx's KICS static analysis tool using the same pattern.
Let that sink in: two of the tools you trust to find vulnerabilities in your code were themselves compromised. The fox was not in the henhouse. The fox was the henhouse.
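The vulnerable pattern behind Stage 1 is worth seeing concretely. The following is a hypothetical, simplified sketch of the risky pull_request_target shape — not Trivy's actual workflow, which has not been published in full. The danger is that this trigger runs with the base repository's secrets while executing code an outside contributor controls:

```yaml
# HYPOTHETICAL illustration of the risky pattern, not any real project's config.
# pull_request_target runs in the context of the BASE repo (secrets available),
# but this job then checks out and executes the PR author's code.
name: pr-check
on: pull_request_target            # privileged trigger: repo secrets are in scope
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.pull_request.head.sha }}  # attacker-controlled code
      - run: make test             # attacker's build script now runs with secrets
        env:
          SERVICE_TOKEN: ${{ secrets.SERVICE_TOKEN }}      # exfiltration target
```

The mitigation is equally simple to state: use the unprivileged pull_request trigger for anything that executes contributor code, and never check out the head ref inside a privileged job.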
Stage 2 — Weaponizing LiteLLM via PyPI (March 24)
Using an API token exposed via the Trivy compromise, TeamPCP obtained the PyPI publishing credentials for LiteLLM — linked to the GitHub account of LiteLLM's co-founder/CEO. They published two malicious versions to PyPI:
| Version | Payload mechanism | Trigger |
|---|---|---|
| 1.82.7 | Double-base64-encoded payload dropped to disk as p.py | Executes when litellm --proxy is run |
| 1.82.8 | Abuses Python's .pth file mechanism (litellm_init.pth) | Runs on every Python interpreter startup — no import required |
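The "double-base64-encoded payload" in the table is a trivial obfuscation layer, easy to reproduce. A minimal illustration with a harmless stand-in payload — two nested encodings, enough to defeat naive string scanners but providing no real secrecy:

```python
import base64

# Stand-in for the dropper's payload -- two nested base64 layers.
payload = b"print('stand-in for malicious code')"
wrapped = base64.b64encode(base64.b64encode(payload))

# Unwrapping is just two decodes: this is obfuscation, not encryption.
recovered = base64.b64decode(base64.b64decode(wrapped))
print(recovered == payload)  # True
```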
Version 1.82.8 is the nightmare scenario. The .pth file executes arbitrary code every time Python starts, regardless of whether LiteLLM is imported. Install it once, and every Python process on the machine becomes a credential harvester. The payload targeted:
- SSH keys
- AWS, GCP, Azure cloud credentials
- Kubernetes configs
- CI/CD secrets
- Docker configs
- Database credentials
- Cryptocurrency wallets
- .env files
Everything was encrypted with AES-256, the key encrypted with an embedded RSA public key, and exfiltrated to attacker-controlled domains: checkmarx[.]zone and models[.]litellm[.]cloud.
The malicious packages were live for approximately 40 minutes — published ~8:30 UTC, quarantined ~11:25 UTC. Forty minutes. That is all it takes when your CI/CD pipeline auto-resolves the latest version.
Stage 3 — Mercor Falls, Lapsus$ Moves In (March 24–30)
Mercor, as one of thousands of LiteLLM users, pulled the poisoned package into their environment through automated dependency resolution. The credential harvester exfiltrated their secrets. Lapsus$ — an extortion group collaborating with TeamPCP alongside ransomware gangs CipherForce and Vect — used the harvested credentials to gain full access to Mercor's Tailscale VPN environment.
From inside the VPN, everything was reachable. The source code. The databases. The storage buckets containing years of video interviews. The Slack workspace. The ticketing system. Everything.
What Was Stolen: The Data That Cannot Be Unbreached
| Category | Size | Contents |
|---|---|---|
| Source code | 939 GB | Full platform source code |
| User database | 211 GB | Resumes, candidate profiles, employer data, user credentials |
| Storage buckets | ~3 TB | Video interviews (face + voice biometrics), passport scans, KYC identity verification |
| Internal systems | — | Slack communications, ticketing data, Tailscale VPN configs, keys and secrets |
Stop and think about what 3 terabytes of video interviews means. These are not text records. These are recordings of real people — their faces, their voices, their mannerisms — answering questions about their skills, their career goals, their salary expectations. Many of these people are scientists, doctors, and lawyers who contracted through Mercor to train AI models for OpenAI and Anthropic. Their biometric data is now in the hands of an extortion group that openly auctions stolen data to the highest bidder.
Credit cards expire. Passwords can be changed. Biometric data is permanent. Every person whose video interview was in those storage buckets now has their face and voice permanently available for deepfake generation, identity fraud, and social engineering attacks. Forever. There is no expiry date on a face.
The Ghost in the Machine: Did an AI Coding Assistant Leak the Credentials?
One detail in the reporting deserves special attention. An unconfirmed report suggests that "a developer may have exposed production credentials through an AI coding assistant linked to Anthropic."
This is unverified. But the attack pattern is real and documented: developers using AI coding assistants (Copilot, Cursor, Claude Code, ChatGPT) routinely paste terminal output, error messages, and configuration files into prompts. Those prompts traverse third-party APIs. If a developer pasted a stack trace containing a database connection string, or an error log containing a Tailscale auth key, that credential just left the building.
Even without the supply chain attack, this single vector — developers leaking secrets through AI assistant context windows — is a ticking time bomb across the entire industry. How many of your developers have pasted .env files into an AI prompt? How would you even know?
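There is no complete defense short of blocking the paste, but even a crude client-side scrubber shrinks the blast radius. A minimal sketch — the patterns below are illustrative placeholders only; production secret scanners such as gitleaks or truffleHog ship hundreds of maintained rules:

```python
import re

# Illustrative patterns only -- a real deployment needs a maintained rule set.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # shape of an AWS access key ID
    re.compile(r"(?i)(?:password|secret|token|api_key)\s*[=:]\s*\S+"),
]

def scrub(text: str, placeholder: str = "[REDACTED]") -> str:
    """Redact anything matching a known secret shape before it leaves the machine."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text

# A stack trace pasted into an AI prompt would pass through this first:
print(scrub("db error: password=hunter2 at AKIAABCDEFGHIJKLMNOP"))
```

Run as a pre-submit hook in whatever wrapper your developers use to reach the model API; anything the regexes miss still leaves the building, which is why the honest answer to "how would you even know?" is telemetry on the prompt path, not trust.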
The Broader Blast Radius: You Are Probably Already Compromised
Mercor is not special. Mercor is "one of thousands" of companies affected by the LiteLLM supply chain attack. Their own spokesperson said so. Mandiant's threat hunters estimate the TeamPCP campaign has exfiltrated data from 500,000 machines. Mandiant Consulting reports 1,000+ impacted SaaS environments — and their CTO predicted expansion to potentially 10,000 victims.
The question is not whether your organization installed the compromised LiteLLM versions. The question is whether you can prove you didn't. Can you right now, at this moment, tell me the exact version of every transitive Python dependency in your production environment? Can you confirm that no CI/CD pipeline, no developer laptop, no staging environment pulled LiteLLM 1.82.7 or 1.82.8 during that 40-minute window? If you cannot answer those questions with certainty, you do not know whether you are compromised.
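Answering that question starts with enumerating what is actually installed, not what the requirements file claims. A minimal sketch using only the standard library — the known-bad version pins come from the reporting above; extend the set with your own watchlist, and remember this audits only the current interpreter's environment, so it must run per venv, per container, per CI runner:

```python
from importlib import metadata

# Known-bad (package, version) pairs from the reporting above.
COMPROMISED = {("litellm", "1.82.7"), ("litellm", "1.82.8")}

def audit_installed() -> list[tuple[str, str]]:
    """Return every installed distribution that matches the compromised list."""
    hits = []
    for dist in metadata.distributions():
        name = (dist.metadata["Name"] or "").lower()
        if (name, dist.version) in COMPROMISED:
            hits.append((name, dist.version))
    return hits

# Run this everywhere Python runs: CI runners, laptops, staging, production.
print(audit_installed())  # an empty list means this environment is clean
```

Note what this cannot tell you: whether the package was installed during the 40-minute window and later upgraded away. For that you need pip logs, registry proxy logs, or egress records.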
| Date | Target | Vector | Downstream impact |
|---|---|---|---|
| Late Feb | Aqua Trivy + Checkmarx KICS | pull_request_target | Every CI/CD pipeline trusting these scanners |
| Mar 24 | LiteLLM (PyPI) | Stolen publishing credentials | 95M monthly downloads, 36% of cloud environments |
| Mar 25 | Telnyx | Same campaign pattern | Telecom infrastructure |
| Mar 30 | Mercor (via LiteLLM) | Auto-resolved poisoned dependency | 4 TB data, 300K+ candidate biometrics |
| Mar 31 | npm axios | Cross-ecosystem spread | JavaScript ecosystem now under attack |
The gap between attacks is shrinking. The ecosystem spread is widening. Python one week. JavaScript the next. What language tomorrow? What package? What scanner? What hiring platform? What customer database?
The Chinese Perspective: 给OpenAI和Anthropic训练模型的公司被黑了
The Chinese-language discussion on LINUX DO (linux.do) cut straight to the geopolitical nerve that English coverage danced around. The thread title translates to: "The company training models for OpenAI and Anthropic got hacked: Mercor confirms attack, Lapsus$ claims 4 TB data theft."
The Chinese security community immediately flagged the training data supply chain implications. Mercor contracts domain experts — scientists, doctors, lawyers — to provide training data for frontier AI models. If Lapsus$ exfiltrated the training interaction data alongside the biometrics, the compromised data potentially includes proprietary model training methodologies, expert annotations, and RLHF preference data that companies like OpenAI and Anthropic paid for. The biometric data is the headline. The training data contamination vector is arguably the more dangerous long-term consequence.
The ti.dbappsecurity.com.cn threat intelligence platform (安全星图平台) referenced the incident in a broader 2026 cybersecurity threat predictions analysis, categorizing it as evidence of an accelerating pattern: AI companies are building the most sensitive data honeypots in history while running security practices designed for the pre-AI era.
The Root Causes: Five Failures That Destroyed a $10 Billion Company's Security
1. Unpinned dependencies in CI/CD — the original sin
LiteLLM's CI/CD pipeline pulled Trivy from apt without pinning to a specific version or verifying a checksum. Mercor's pipeline auto-resolved the latest LiteLLM version without pinning. This is the single root cause that enabled the entire cascade. Pin to a SHA hash instead of a version string, and the compromised packages never enter your environment. It is that simple. And almost nobody does it.
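Version strings can be re-resolved or re-published; content hashes cannot. pip supports this natively: `pip install --require-hashes` against a requirements file carrying `--hash=sha256:...` pins refuses any artifact whose bytes do not match. The underlying check is simple enough to sketch (a hypothetical helper for illustration, not pip's internals):

```python
import hashlib

def sha256_of(path: str) -> str:
    """Stream a downloaded artifact (wheel, tarball) and return its hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_artifact(path: str, pinned_sha256: str) -> bool:
    """Refuse the artifact unless its content hash matches the pin exactly.
    A 'newer version' auto-resolved by the pipeline simply has no pin and fails."""
    return sha256_of(path) == pinned_sha256
```

Under hash pinning, the poisoned 1.82.7 and 1.82.8 releases never install: their versions are absent from the pin file, and no version string a registry serves can substitute for matching bytes.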
2. No credential isolation — VPN keys in the blast radius
When the credential harvester executed, it found Tailscale VPN credentials in the environment. This means VPN authentication secrets were accessible to processes running in the CI/CD or development environment. VPN credentials should live in a hardware security module or isolated secrets manager — never in environment variables or config files that a compromised dependency can read.
3. Flat network behind the VPN — no segmentation
Once inside Mercor's Tailscale VPN, Lapsus$ accessed everything: source code, databases, storage buckets, Slack. This indicates minimal or no network segmentation behind the VPN boundary. The VPN was the castle wall, and once breached, the entire kingdom fell. Zero-trust architecture — where every service verifies every request independently, regardless of network position — would have limited the blast radius to whatever the compromised credential had explicit access to.
4. Biometric data stored without additional encryption layers
Three terabytes of video interviews containing face and voice biometrics were apparently accessible from storage buckets reachable via the VPN. Biometric data — the most sensitive, most irrevocable category of personal information — should be encrypted at rest with keys managed separately from the application environment. If the storage bucket encryption keys were in the same credential scope as the VPN access, the entire data protection model collapses on a single compromised secret.
5. Bearer token / API key authentication — no request-level binding
The stolen credentials granted access because they were bearer-style secrets — whoever holds the token gets the access, regardless of where, when, or how the request originates. There was no mechanism to verify that a request using a legitimate credential was actually being made by the legitimate service, for a legitimate purpose, with an unmodified payload. A stolen bearer token is a skeleton key. It works from anywhere, for anything, until someone notices it is missing.
This is the fundamental weakness that every major authentication scheme shares — and the one that modern per-request, context-bound authentication protocols are designed to eliminate. If every API request required cryptographic proof that the caller is who they claim to be, calling the endpoint they intend, with the body they constructed, within a narrow time window — a stolen credential from a CI/CD exfiltration would be worthless. The credential would be bound to a request that already completed. It could not be replayed. It could not be redirected. It could not be used for a different endpoint or a different payload.
What We Got Wrong (Red-Teaming Our Own Narrative)
| Claim | Reality check |
|---|---|
| "4 TB of data stolen" | This figure comes from Lapsus$'s own claims. No independent forensics have verified the volume or contents. Mercor has not confirmed or denied specific data access. Threat actors routinely inflate impact claims. |
| "Full Tailscale VPN access" | Also a Lapsus$ claim. The evidence is Lapsus$-provided samples (Slack data, two video recordings). These could demonstrate limited access rather than full network compromise. Without Mercor's forensic report, the actual scope is unknown. |
| "AI coding assistant leaked credentials" | A single unconfirmed report. The LiteLLM supply chain attack is sufficient to explain the credential theft without invoking an additional vector. This may be misinformation or conflation. |
| "Mercor is one of thousands" | Mercor's own statement. The Mandiant estimates (500K machines, 1,000+ SaaS environments) are from private threat intelligence briefings, not published reports. The actual downstream count is uncertain. |
A Pattern That Should End the "Move Fast and Break Things" Era
Four months. Four major AI platform breaches. Each one exploiting a different entry point, each one arriving at the same conclusion: the AI industry is building the most valuable data repositories in human history while protecting them with security practices from 2015.
| Incident | Entry point | Root cause | Data at risk |
|---|---|---|---|
| McKinsey Lilli | Unauthenticated API endpoints | SQL injection + IDOR + exposed API docs | 46.5M chat messages, 728K files, 95 system prompts |
| LiteLLM | Compromised vulnerability scanner | Unpinned CI/CD deps + stolen PyPI creds | SSH keys, cloud creds, K8s tokens across 500K machines |
| Google Gemini | Calendar invite with prompt injection | No input sanitization + over-privileged agent | Email forwarding, smart home control |
| Mercor | Poisoned transitive dependency | Supply chain + flat network + bearer auth | Biometric data of 300K+ people |
Notice the escalation. Chat messages. Cloud credentials. Smart home controls. Biometric identity. Each breach exposes data that is harder to remediate than the last. We are moving up the hierarchy of irreversible damage. The next breach will expose something worse. It always does.
What To Do Right Now. Not Next Sprint. Now.
The Uncomfortable Truth
The Mercor breach is not a story about a startup that got unlucky. It is a preview of what happens when the AI industry's velocity meets the real world's threat landscape. Mercor did not have to be specifically targeted. They did not have to make any extraordinary mistake. They used a popular Python library. Their CI/CD pipeline auto-resolved the latest version. That is it. That is all it took to lose the biometric identity data of hundreds of thousands of people.
The cascading supply chain attack model — compromise a scanner, steal publishing credentials, poison a popular package, harvest downstream secrets, sell them to extortion groups — is now a proven, repeatable, scalable playbook. TeamPCP executed it across Python and JavaScript in the span of one week. The next group will be faster. The next target will have more sensitive data. The next breach will make Mercor look like a warm-up.
And here is the detail that should terrify every CISO in the AI industry: the malicious packages were live for forty minutes. Not forty days. Not forty hours. Forty minutes. Your incident response playbook assumes hours of detection time. Your vulnerability management SLA is measured in days. The attackers needed minutes. Your security model is optimized for a threat velocity that no longer exists.
The era of "we'll fix it in the next sprint" is over. The sprint is too slow. The attackers are not waiting for your standup.
- TechCrunch — Mercor Cyberattack Tied to LiteLLM Compromise
- The Register — Mercor Supply Chain Attack
- SecurityWeek — Mercor Hit by LiteLLM Supply Chain Attack
- Wiz — TeamPCP Trojanizes LiteLLM
- ReversingLabs — TeamPCP Supply Chain Attack Spreads
- Datadog Security Labs — LiteLLM and Telnyx Compromised on PyPI
- Cybernews — Mercor Data Breach
- The Record — Mercor Confirms Security Incident
- TechStartups — LAPSUS$ Claims 4TB via Tailscale VPN
- Neowin — Mercor One of Thousands Hit
- CybersecurityNews — Mercor AI Data Breach
- LINUX DO — "The Company Training Models for OpenAI and Anthropic Got Hacked" (给OpenAI和Anthropic训练模型的公司被黑了, Chinese)
- ClaimDepot — Mercor Data Breach Lawsuit Investigation
- HelpNetSecurity — LiteLLM PyPI Packages Compromised