Case Study: LiteLLM Supply Chain Attack
In March 2026, a malicious package published to PyPI as a dependency of LiteLLM — a widely used LLM proxy layer — harvested API keys, OAuth tokens, and cloud credentials from thousands of deployments. The attack was a classic supply chain compromise: a trusted dependency was poisoned, and everything downstream inherited the poison.
This is not a novel attack pattern. SolarWinds, Log4Shell, and event-stream followed the same structure. What makes the LiteLLM case instructive for AI governance is that it happened inside the AI infrastructure layer — the proxy that routes requests between applications and models. The component trusted to handle your credentials was the component stealing them.
What Centralized Trust Gets Wrong
LiteLLM operated on centralized trust. You installed the package. You gave it your API keys. You trusted it to route requests faithfully. That trust was a single decision at install time, and it granted full access to everything the proxy could touch.
No behavioral verification. No continuous monitoring. No question asked at runtime: “Should this component be reading .env files? Should it be opening network connections to unfamiliar endpoints? Should it be serializing credentials into outbound payloads?”
The guardrails were in place. The package was on PyPI. The checksums matched. The install succeeded. The tests passed. Every output-level safety measure said “this is fine.” The attack operated entirely within the trust boundary that the installation process granted — because that boundary had no behavioral dimension. Trust was a credential, not a continuous assessment.
What Behavioral Accountability Catches
Now consider the same scenario in a system with computable accountability.
Trust that evolves from behavior. The proxy component starts with zero trust. Its T3 (Talent / Training / Temperament) profile reflects what it has actually done: routed N requests successfully, handled M credential types, consumed K resources. If it suddenly starts reading environment variables it never read before, or opening connections to endpoints outside its historical pattern, the trust profile flags the anomaly. Not after a quarterly review. At the moment the behavior deviates.
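As a concrete illustration, here is a minimal sketch of a behavior-derived trust profile. The class name `T3Profile`, its fields, and the flagging rule are illustrative assumptions, not the actual T3 specification; the point is only that when trust is a record of observed actions, a first-ever action is detectable the instant it occurs:

```python
# Minimal sketch of a behavior-derived trust profile. Names, fields, and
# the flagging rule are illustrative assumptions, not the Web4 spec.
from dataclasses import dataclass, field

@dataclass
class T3Profile:
    """Trust as a record of what the entity has actually done."""
    observed: dict[str, int] = field(default_factory=dict)  # action -> count

    def record(self, action: str) -> bool:
        """Record an action; return False if it has no behavioral history."""
        familiar = action in self.observed
        self.observed[action] = self.observed.get(action, 0) + 1
        return familiar

proxy = T3Profile()
for _ in range(10_000):
    proxy.record("route_request")        # builds legitimate history

# First-ever .env read: flagged at the moment it happens, not at review time.
if not proxy.record("read_env_file"):
    print("anomaly: action outside behavioral history -> degrade trust")
```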
Scoped capability via the MRH (Markov Relevancy Horizon). The proxy's relevancy horizon defines what it should be touching. Routing requests to model endpoints? Within scope. Reading .env files and serializing their contents into outbound HTTP requests? Outside scope. The MRH boundary isn't a permission — it's a structural expectation. Violating it is an observable event, not a policy check that was skipped.
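A hedged sketch of what an MRH-style boundary could look like, assuming an out-of-scope action is emitted as a witnessed event rather than silently denied. `MRH_SCOPE` and `emit_event` are hypothetical names for illustration:

```python
# Sketch of an MRH-style scope boundary. The action is not blocked; it
# becomes an observable deviation that the trust mechanics react to.
MRH_SCOPE = {"connect:api.openai.com", "connect:api.anthropic.com"}

def emit_event(kind: str, detail: str) -> None:
    print(f"[witnessed event] {kind}: {detail}")   # stand-in for attestation

def observe(action: str) -> None:
    if action not in MRH_SCOPE:
        emit_event("mrh_violation", action)

observe("connect:api.openai.com")       # within horizon: silent
observe("connect:attacker.example")     # outside horizon: witnessed event
```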
Witnessed provenance. Every action the proxy takes — every request routed, every credential accessed, every connection opened — is witnessed and recorded. Not logged for post-hoc review. Witnessed: other entities in the system observe and attest to the behavior in real time. The malicious exfiltration would appear in the witness record as an anomalous action pattern that no legitimate proxy instance had ever exhibited.
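A minimal sketch of a witnessed action record, assuming each observer signs what it saw. HMAC stands in for real signatures, and the record layout is an assumption:

```python
# Sketch of a witnessed action record: each observer attests to what it saw.
# hmac is a stand-in for real signatures; keys and layout are assumptions.
import hashlib
import hmac
import json
import time

def attest(witness_key: bytes, action: dict) -> dict:
    payload = json.dumps(action, sort_keys=True).encode()
    return {
        "action": action,
        "sig": hmac.new(witness_key, payload, hashlib.sha256).hexdigest(),
    }

action = {"entity": "litellm-proxy", "op": "open_socket",
          "dest": "attacker.example", "ts": time.time()}

# Two independent attestations of the same anomalous action: the
# exfiltration is part of the record the moment it happens.
record = [attest(k, action) for k in (b"witness-a", b"witness-b")]
print(len(record), "witnesses attested")
```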
ATP (Allocation Transfer Packet) cost. Exfiltrating credentials requires actions — reading files, encoding data, making network calls. Each action consumes ATP. A proxy that suddenly starts consuming resources on non-routing activities is burning budget on work that doesn't create value. The metabolic signal is visible: this entity is spending energy on something other than its job.
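A toy sketch of ATP-style metabolic accounting, with invented costs and category names; the mechanism it illustrates is that spend outside an entity's job category is itself a visible signal:

```python
# Sketch of metabolic accounting: every action debits a budget, and the
# spend profile is inspectable. Costs and categories are assumptions.
from collections import Counter

class ATPBudget:
    def __init__(self, allocation: int):
        self.remaining = allocation
        self.spend = Counter()           # category -> ATP consumed

    def charge(self, category: str, cost: int) -> None:
        self.remaining -= cost
        self.spend[category] += cost

proxy = ATPBudget(allocation=1_000)
proxy.charge("routing", 1)               # the job
proxy.charge("file_read", 40)            # reading SSH keys, .env, wallets
proxy.charge("network_exfil", 60)        # encoding and posting credentials

# Budget burned on non-routing work is the metabolic anomaly.
non_routing = sum(v for k, v in proxy.spend.items() if k != "routing")
if non_routing > 0.2 * sum(proxy.spend.values()):
    print("metabolic anomaly: energy spent on something other than the job")
```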
How It Was Actually Discovered
Callum McMahon, a developer at FutureSearch, was testing an unrelated MCP plugin. He installed litellm and noticed his machine's RAM spiking. He dug in and found a 34 KB .pth file — a Python path configuration hook that executes automatically on every interpreter startup, no import needed — that was double base64-encoded to hide a comprehensive credential stealer targeting SSH keys, cloud credentials, Kubernetes configs, cryptocurrency wallets, and Slack/Discord tokens. Exfiltration was encrypted with a hardcoded 4096-bit RSA key and posted to an attacker-controlled domain mimicking litellm infrastructure.
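The .pth mechanism is worth seeing concretely. Below is a hedged heuristic sketch, not a detector for this specific payload: it flags .pth files that both execute code at startup (lines beginning with `import `) and contain long base64-like runs. The 200-character threshold is an arbitrary assumption:

```python
# Heuristic scanner for suspicious .pth hooks. A .pth line beginning with
# "import " executes on every interpreter startup, which is the mechanism
# the payload abused. The encoded-blob heuristic is an assumption.
import re
import site
from pathlib import Path

B64_BLOB = re.compile(r"[A-Za-z0-9+/=]{200,}")   # long encoded runs

def suspicious_pth(path: Path) -> bool:
    text = path.read_text(errors="ignore")
    executes = any(line.startswith("import ") for line in text.splitlines())
    return executes and bool(B64_BLOB.search(text))

for sp in site.getsitepackages():
    for pth in Path(sp).glob("*.pth"):
        if suspicious_pth(pth):
            print("inspect:", pth)
```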
A human noticed anomalous behavior. At human speed. The malicious versions existed for 2-3 hours on PyPI before quarantine, but litellm gets 3.4 million downloads per day. Even that short window represents significant exposure.
The detection that saved the ecosystem was one person noticing unexpected RAM usage and choosing to investigate rather than dismiss it. That's layer-one governance — a conscious agent noticing something wrong. But it operated at human speed, hours after the fact, and only because someone happened to be paying attention. In a system with computable accountability, the behavioral anomaly (unexpected file reads, unexpected network connections, unexpected resource consumption) would have been detected at decision speed — at the moment the malicious code first executed, not hours later when a human noticed a symptom.
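CPython itself ships a primitive that demonstrates what decision-speed observation looks like: audit hooks, built in since Python 3.8. The sketch below observes every file open and socket connect in the process at the moment it happens; the allowlist policy is a hypothetical assumption, not part of the standard library:

```python
# Decision-speed detection via CPython audit hooks (PEP 578, Python 3.8+).
# The expected-host policy below is an illustrative assumption.
import sys

EXPECTED_HOSTS = {"api.openai.com", "api.anthropic.com"}

def audit(event: str, args: tuple) -> None:
    if event == "open" and str(args[0]).endswith(".env"):
        print("anomaly at execution time: .env read:", args[0])
    elif event == "socket.connect":
        addr = args[1]
        host = addr[0] if isinstance(addr, tuple) else addr
        if host not in EXPECTED_HOSTS:
            print("anomaly at execution time: connect to", host)

sys.addaudithook(audit)
# From here on, every file open and socket connect in this process is
# observed the moment it happens -- no human in the loop.
```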
The Deeper Reframe: Nothing to Steal
But detection speed is only half the story. The more fundamental question is: why were there credentials to steal in the first place?
The LiteLLM attack harvested API keys, OAuth tokens, and cloud credentials — static secrets stored in environment variables and config files. Steal one secret, gain full access to everything that secret unlocks. This is the centralized trust model: a single credential represents the entire trust relationship.
In Web4, there are no credentials to steal — not in that sense. Identity is cryptographic, multi-factor, and structural:
- LCT (Linked Context Token)-based identity is permanently bound, non-transferable, and optionally hardware-anchored. There is no “key” that, if copied, grants access. The identity is the entity's witnessed history — you can't steal someone's reputation by copying a file.
- Session keys are scoped and ephemeral (see the sketch below). Even if an attacker captures a single session key, it exposes one link in a multidimensional ontology — one relationship, in one context, for one time window. The rest of the trust graph is unaffected.
- The witness network identifies anomalies structurally. A compromised link behaves differently from a legitimate one. The entities that witness its behavior detect the deviation as a matter of routine — not because an alert was configured, but because behavioral consistency is what trust is measured on. Inconsistent behavior automatically degrades T3, which automatically restricts capability, which automatically isolates the compromised link.
The compromised link is identified, isolated, and its trust degraded — all as routine operation of the trust mechanics, not as an incident response. The system doesn't need to know it's under attack. It just needs to measure behavior.
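The session-key bullet above can be made concrete. A minimal sketch, assuming a key is bound to one peer, one context, and one expiry window (the field names are invented for illustration):

```python
# Sketch of a scoped, ephemeral session key: capturing it exposes one
# relationship, in one context, for one window. Fields are assumptions.
import secrets
import time

def mint_session_key(entity: str, peer: str, context: str, ttl_s: int) -> dict:
    return {
        "key": secrets.token_hex(32),
        "entity": entity, "peer": peer,          # one relationship
        "context": context,                      # one context
        "expires": time.time() + ttl_s,          # one time window
    }

def valid(sk: dict, peer: str, context: str) -> bool:
    return (sk["peer"] == peer and sk["context"] == context
            and time.time() < sk["expires"])

sk = mint_session_key("litellm-proxy", "model-endpoint", "routing", ttl_s=300)
print(valid(sk, "model-endpoint", "routing"))   # True: intended use
print(valid(sk, "cloud-infra", "admin"))        # False: a stolen key is inert
```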
This makes supply chain attacks structurally expensive with very little return. The cost of compromising a component is high (it must pass initial behavioral scrutiny). The return is low (one scoped session, not a master key). And detection is automatic (behavioral deviation degrades trust). That is governance by tilting the cost/benefit calculation in favor of coherence.
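The shape of that calculation can be shown with deliberately invented numbers (these are illustrative assumptions, not measurements):

```python
# Illustrative arithmetic only: the numbers are assumptions chosen to show
# the shape of the cost/benefit shift, not measured values.
def attacker_ev(compromise_cost: float, loot_value: float,
                undetected_frac: float) -> float:
    return loot_value * undetected_frac - compromise_cost

# Centralized trust: one stolen credential unlocks everything, and the
# breach persists until a human happens to notice.
print(attacker_ev(compromise_cost=10, loot_value=1_000,
                  undetected_frac=0.9))    # strongly positive

# Behavioral accountability: the return is one scoped session, and the
# first deviation triggers the trust response.
print(attacker_ev(compromise_cost=100, loot_value=10,
                  undetected_frac=0.05))   # negative: not worth attempting
```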
The Fundamental Difference
The LiteLLM attack succeeded because trust was binary (installed = trusted), static (granted at install, never reassessed), and unwitnessed (no behavioral observation between install and discovery).
Computable accountability makes trust continuous (earned from every action), multidimensional (capability, consistency, and value are separate measurements), and witnessed (other entities observe and attest).
This doesn't make supply chain attacks impossible. But it makes them expensive, observable, and fast to detect — because the first anomalous behavior triggers a trust response, not the eventual discovery months later that credentials were stolen.
The immune system doesn't prevent every pathogen from entering the body. It makes pathogenic behavior structurally costly to sustain. That's the model.
Update: Vercel (April 20, 2026)
As this presentation was being prepared, the pattern repeated. Vercel — the platform hosting thousands of production web applications, including several of ours — disclosed a breach with the same structural shape.
The attack chain: a third-party AI tool called Context AI was compromised via credential-stealing malware. A Vercel employee had connected Context AI to their corporate Google account via OAuth. The attacker used that OAuth trust relationship to pivot from Context AI into the employee's Google Workspace, then into Vercel's internal systems, then into customer environment variables that weren't marked “sensitive.”
Three governance failures, each compounding:
- OAuth as binary trust. The employee authorized Context AI once. That authorization granted ongoing access with no behavioral reassessment. The trust was a credential, not a continuous measurement — exactly the LiteLLM pattern, but at the identity layer instead of the package layer.
- Lateral movement via trust relationships. OAuth trust chains create implicit paths between systems. Context AI had no business accessing Vercel's infrastructure, but the OAuth link made it structurally possible. No perimeter defense caught it because the access was “authorized” — the authorization was compromised, but the system couldn't distinguish between legitimate and illegitimate use of a valid token.
- “Sensitive” as a checkbox. Environment variables marked sensitive were encrypted and survived the breach. Variables not marked sensitive were readable in plaintext. The governance boundary was a developer's classification decision at write time — not a structural property of the system. One checkbox, unchecked, exposed credentials (a structural alternative is sketched below).
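A minimal sketch of that structural alternative: seal every value at write time, so there is no plaintext branch for an unchecked box to leave open. It uses Fernet from the third-party `cryptography` package and is an assumption about design shape, not Vercel's implementation:

```python
# Sketch of "sensitive by structure": every value is encrypted at write
# time, so classification is not a gate. Assumes `pip install cryptography`.
from cryptography.fernet import Fernet

vault = Fernet(Fernet.generate_key())

env: dict[str, bytes] = {}   # name -> ciphertext; no plaintext branch exists

def set_var(name: str, value: str) -> None:
    env[name] = vault.encrypt(value.encode())

set_var("DATABASE_URL", "postgres://user:pass@host/db")
# A breach that dumps `env` gets ciphertext regardless of any checkbox.
print(env["DATABASE_URL"][:16], b"...")
```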
In a trust-native architecture, the OAuth connection itself would carry a T3 profile. Context AI connecting to a Workspace account is a relationship — it starts at zero trust, earns trust from observed behavior, and has a ceiling determined by its anchor type. A software-only OAuth token caps at 0.4. It can read calendar events; it cannot pivot into infrastructure. The ceiling is architectural, not a classification checkbox.
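A sketch of such an architectural ceiling. The 0.4 cap for a software-only anchor comes from the text above; the other values and names are assumptions:

```python
# Sketch of trust ceilings by anchor type. The 0.4 software-only cap is
# from the text; the hardware value and capability mapping are assumptions.
TRUST_CEILING = {
    "software_oauth": 0.4,      # can read calendar events
    "hardware_anchored": 0.9,   # assumed higher ceiling for hardware roots
}

def effective_trust(earned: float, anchor: str) -> float:
    return min(earned, TRUST_CEILING[anchor])

# However much trust the Context AI link earns, its anchor type caps it.
print(effective_trust(earned=0.8, anchor="software_oauth"))  # 0.4
# An infrastructure pivot would require trust above the ceiling,
# which is structurally unreachable for this anchor type.
```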
The behavioral anomaly — Context AI suddenly accessing systems it never accessed before — is precisely what trust-continuous monitoring detects. Not as a SIEM rule. As a property of how trust works: unexpected behavior degrades T3, degraded trust restricts capability, restricted capability limits blast radius. The immune system response, again.
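A toy sketch of that degrade-restrict-isolate chain; the decay factor and capability thresholds are illustrative assumptions:

```python
# Sketch of the chain: anomaly degrades trust, trust gates capability,
# so each deviation shrinks the blast radius. Thresholds are assumptions.
def on_anomaly(t3: float) -> float:
    return t3 * 0.5                      # unexpected behavior degrades trust

def capabilities(t3: float) -> set[str]:
    caps = {"read_calendar"}
    if t3 >= 0.3:
        caps.add("route_requests")
    if t3 >= 0.7:
        caps.add("touch_infrastructure")
    return caps

t3 = 0.4
for _ in range(3):
    t3 = on_anomaly(t3)
    print(f"t3={t3:.2f} caps={sorted(capabilities(t3))}")
```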
Beyond AI
This case study is about a Python package, but the pattern is universal. Any system that grants trust at installation and never reassesses it — whether it's a software dependency, an employee badge, an API key, or a vendor contract — has the same vulnerability. The entity was trusted because of what it claimed to be, not because of what it was observed doing.
Computable accountability doesn't care what you claim. It cares what you do. And it remembers — without convenient edits.